Multilevel Models Applications Using SAS® - (1 Introduction)

Multilevel models have become increasingly popular in various research fields over the past two decades, allowing for the analysis of hierarchically structured data that incorporates both micro and macro levels. These models address the limitations of traditional statistical methods by accounting for observation dependence within groups, enabling researchers to explore complex relationships between individual and group-level variables. The document discusses the conceptual framework, data structures, and variable considerations necessary for effective multilevel modeling in social sciences.

Chapter 1

Introduction

Over the past two decades multilevel models (Mason et al., 1983; Bryk & Raudenbush, 1992;
Raudenbush & Bryk, 2002; Goldstein, 1987, 1995) have gained popularity in various research
fields including education, psychology, sociology, economics, and public health. Multilevel
models extend ordinary least squares (OLS) regression to analyze multilevel or hierarchically
structured data that involve both micro and macro observation information. Multilevel
models also appear under different names in the literature, including hierarchical linear models
(Bryk & Raudenbush, 1992; Raudenbush & Bryk, 2002), random-effect models (Laird & Ware,
1982), random coefficient models (De Leeuw & Kreft, 1986), variance component models
(Dempster et al., 1981), mixed models (Longford, 1987), and empirical Bayes models (Strenio et
al., 1983).
Prior to the development of formal statistical methodology for multilevel models, sociolo-
gists engaged in contextual or multilevel analysis of hierarchically structured data. In the late
1950s and early 1960s Lazarsfeld (1961) and Merton (1957) at Columbia University began to
assess contextual effects on individual behavior. The 1970s witnessed a significant jump in
analysis of multilevel data in education (Barr & Dreeben, 1977; Block & Burns, 1976; Bronfen-
brenner, 1976; Burstein, 1980; Cronbach, 1976; Herriot & Muse, 1973; Pedhazur, 1975; Snow,
1976; Spady, 1973; Walberg, 1976). In a systematic study of contextual analysis, Boyd and
Iversen (1979) discussed how to model multilevel data with micro-macro models, i.e., to
formulate a within-group regression model at the individual level and then relate the
within-group regression coefficients to contextual variables that describe the groups. Although
their models addressed multilevel observations, they were estimated with ordinary least squares
(OLS) techniques, which are inappropriate for multilevel analysis.
Statistical theories of multilevel models and corresponding computer programs were developed
in the early 1980s by sociologists and demographers. Models were applied to analyze the
large-scale multilevel data of the United Nations’ World Fertility Survey (WFS) (Hermalin &
Mason, 1980; Mason et al., 1983). Further methodological and substantive work in educational
studies and the user-friendly Windows-based computer programs by Bryk & Raudenbush (1992) and
Goldstein (1987, 1995) have popularized multilevel models. Multilevel models are now applied
in a wide range of studies in the social sciences.

Copyright © 2011. Walter de Gruyter GmbH. All rights reserved.

1.1 Conceptual framework of multilevel modeling

A key concept in social sciences is that a society can be described in hierarchical structures. By
hierarchy, we mean that units at a lower level are nested within or grouped into units at a higher
level. People should be treated not as isolated individuals but as social beings. Individuals are
members of many different types of groups and are embedded in different social contexts. For
example, individuals belong to families, neighborhoods, organizations and communities. Awareness
has been mounting that individual behaviors and outcomes are affected not only by individual

Wang, Jichuan, et al. Multilevel Models : Applications Using SAS®, Walter de Gruyter GmbH, 2011. ProQuest Ebook Central,
http://ebookcentral.proquest.com/lib/pretoria-ebooks/detail.action?docID=835473.

characteristics, but also shaped by the social contexts in which they are imbedded (Lazarsfeld,
1961; Merton, 1957; Bronfenbrenner, 1976; Blalock, 1984; Iversen, 1991). Davis’ so-called
“frog-pond” theory proposes that individual students evaluate personal ability relative to
in-groups and pay little attention to out-groups (Davis, 1966). A moderately intelligent student
(a medium size “frog”) in a highly intelligent school (a large “pond”) may become discouraged
and thus become an under-achiever, while the same student in a considerably less intelligent
school (a small “pond”) may gain confidence and become an over-achiever. Conversely, a
moderately intelligent student might be motivated to study harder in a highly intelligent school
and become more successful. In other words, the effect of an individual student’s intelligence on
his/her achievement may be influenced by specific features in the school he/she attends. In addi-
tion to composite measures (e.g., average intelligence level), student academic achievement may
also be influenced by a variety of school level variables such as student/teacher ratio, teachers’
work experience, school facilities, budget, etc.
The relationships between academic achievement and individual level variables vary across
schools. For example, differences in academic achievement among ethnic groups may be larger
in some schools and smaller in others. In such cases the extent of the effect of ethnicity on aca-
demic achievement may relate to identifiable school-level characteristics. On the other hand, the
school’s effect on student academic achievement may also vary among individuals. For exam-
ple, while students usually benefit from smaller student/teacher ratios, these ratios and other
school features are unlikely to influence all types of students equally. Cross-level interactions in
multilevel modeling enable us to assess the degree to which relationships between individual
explanatory and outcome variables are moderated by group level variables.
Good examples of this class of multilevel studies can be drawn from population studies. It is
well-known that fertility levels vary among countries. In general, fertility is low in developed
countries and high in developing countries. Fertility has multilevel determinants. Individual
fertility behavior is determined not only by individual characteristics at the micro level, such
as a couple’s preference for children, ethnicity, education, and income, but also by features of
the social contexts or environments in which individuals live at the macro level, such as culture
or subculture, GDP, average education level, and in particular the intensity/efficacy of family
planning programs (FPP). Assessing cross-level interactions
is very important in fertility studies. FPP analysts and officers are interested in knowing: What
individual characteristics influence individual fertility behaviors? Do family planning programs
work? How do differences in program implementation among various locations or macro-level
units affect individual fertility behaviors? And, for what classes of people are family planning
programs most effective? Multilevel modeling helps us to gauge how family planning programs
interact with individual characteristics to affect fertility behavior.
Public health studies indicate that individual health behaviors and outcomes are jointly
determined by individual and environmental factors (Von Korff et al., 1992; Duncan et al., 1996;
Diez-Roux, 1998; Wang et al., 1998). For example, initiation of smoking among adolescents
may be associated with gender, ethnicity, school achievement and family background, as well as
the social setting in which the individual is imbedded, such as geographic location, prevalence
of smoking, and restrictions on smoking in public areas.
From these examples we can see that research interest in social science studies often centers
on questions like: 1) which explanatory variables measured at the individual level affect the
individual-level dependent variable, and how; 2) which variables measured at the context or
group level affect the individual-level dependent variable, and how; 3) how the relationships
between the individual-level explanatory and dependent variables vary across contexts or groups;
and 4) which group-level variables moderate the effects of individual-level variables on the
individual-level dependent variable, and how.

To answer these questions, both micro and macro data are needed. A common challenge in
multilevel data is within-group observation dependence. That is, individuals in the same group
tend to be alike and share similar attitudes and behaviors relative to individuals from other
groups. For example, people living in the same neighborhood may share similarities with each
other because they are influenced by the same neighborhood socio-economic characteristics.
This may be true even for groupings that are only recently established. For example, students
who are in the same school may not be associated with each other before they get into the same
school. Once students enter a school, they become members of the same group. Once groupings
are established, individuals in the group will tend to share traits that differentiate them from
members of other groups. In statistical terms we say there exist within-group homogeneity and
between-group heterogeneity in the hierarchically structured data.
Traditional analytical methods such as Ordinary Least Squares (OLS) regression assume
that observations are independent and identically distributed (IID). The same assumption is
required for generalized linear models. Violation of this assumption will result in incorrect
inference in statistical analysis. Chapter 2 demonstrates how observation dependence can be
measured using an Intraclass Correlation Coefficient (ICC). Studies show that even a small ICC
can substantially inflate the Type I error rate in statistical testing, that is, the chance of
falsely rejecting a true null hypothesis. Dealing with a nonzero ICC has been a challenge in
statistical analysis of multilevel data for many years.
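A rough sense of why even a small ICC matters can be had from the widely used design-effect
approximation for equal-sized groups, 1 + (n - 1) x ICC. This formula is not taken from the
chapter; it is brought in here only as an illustration, and the group size, sample size, and ICC
below are hypothetical values.

```python
import math

def design_effect(group_size: float, icc: float) -> float:
    """Variance inflation for a sample mean with equal-sized groups: 1 + (n - 1) * ICC."""
    return 1.0 + (group_size - 1.0) * icc

def effective_sample_size(total_n: float, group_size: float, icc: float) -> float:
    """Number of independent observations carrying roughly the same information."""
    return total_n / design_effect(group_size, icc)

def se_understatement(group_size: float, icc: float) -> float:
    """Factor by which a naive, independence-based standard error is too small."""
    return math.sqrt(design_effect(group_size, icc))

if __name__ == "__main__":
    # Hypothetical study: 50 groups of 30 individuals (N = 1500) with ICC = 0.05.
    print("design effect:        ", design_effect(30, 0.05))
    print("effective sample size:", round(effective_sample_size(1500, 30, 0.05), 1))
    print("SE understated by:    ", round(se_understatement(30, 0.05), 3))
```

Under these assumed values the design effect is 2.45, so a naive standard error would be too
small by a factor of about 1.57, which is how a seemingly negligible ICC can inflate test
statistics.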
Multilevel models provide an appropriate analytical framework to deal with observation
dependence in multilevel data. More importantly, multilevel models permit us to explore the
nature and extent of the relationships at both micro and macro levels, as well as across levels.

1.2 Hierarchically structured data

Hierarchical social structures naturally give rise to hierarchical or multilevel data in which the
lower level units are nested or grouped in the next higher level units. Such hierarchically struc-
tured data exist in many real life situations. The simplest and the most often used multilevel data
are collected at two levels (i.e., one micro level and one macro level). For example, a study on
student academic achievement may collect information at the student level and at the school
level for multilevel modeling.
Multilevel designs can be readily extended to more than two levels. For example, students
are nested in classes, and classes nested in schools; thus observation units lie at three levels of a
hierarchy: the level-1 units are students; the level-2 units are classes; and the level-3 units are
schools. The lowest level units (e.g., students) are the micro-level units or individual units,
while the higher level units are the macro level units or context/group units.
Hierarchically structured data may arise in a variety of forms and from a variety of situa-
tions, either observed or by design. Survey data obtained from a complex sampling design are
hierarchically structured. Multi-stage or cluster sampling is conducted to take full advantage of
information from a hierarchy of study units. The first stage or “Primary Sampling Unit” (PSU)
is often a well-defined geographic unit (e.g., county in a state). Once the PSUs are randomly
selected, further stages of random selection are carried out within the PSUs (e.g., districts in a
county) until the final units (e.g., households or individuals) are selected (Kalton, 1983). As a
result, the survey data collected from cluster sampling design have a hierarchical structure in
which individuals are nested within higher level sampling units.
Hierarchical data also frequently arise from experimental designs. For example, clinical
trials may be carried out in randomly selected clinics or medical centers, thus creating data that
have a hierarchical structure. However, in practice clinics and medical centers are often not
randomly selected. This is also true in many multi-site research projects. For example, a national
multi-site research project on public health is often conducted with many project sites located in
different regions, states, or cities. Very often, rather than being randomly selected, project sites
are selected based on the quality of the grant proposals, the level of seriousness of the health
problems under study, or the feasibility of conducting a successful study in a specific site. Al-
though the distribution of the project sites may be carefully taken into consideration, they are
not randomly selected, thus they are not representative of the corresponding higher level units in
the targeted population. As a result, inferences based on the multilevel analysis for non-
randomly selected study sites should be interpreted with caution.
Hierarchical data structures are not confined to cross-sectional settings with multiple units.
Individuals may also be higher level units. For example, in longitudinal or panel studies indi-
viduals are followed up over time. Data are collected repeatedly from the same individuals.
Such longitudinal data can be considered hierarchically structured. The repeated measures for
each individual at different times are level-1 observation units, and individuals become the
level-2 units. A third level can be introduced into the data structure, if the higher level units
(e.g., clinics) in which individuals are nested are available.
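The longitudinal case can be made concrete by restructuring one-row-per-person data into
one-row-per-occasion data, so that occasions become the level-1 units and individuals the level-2
units. The following is a minimal sketch; the function name, the person identifiers, and the
measurements are all hypothetical.

```python
from typing import Dict, List, Tuple

def wide_to_long(wide: Dict[str, List[float]]) -> List[Tuple[str, int, float]]:
    """Turn one record per individual into one record per (individual, occasion)."""
    long_rows = []
    for person_id, measures in wide.items():
        # Occasions are numbered from 1 in the order the measurements were taken.
        for occasion, value in enumerate(measures, start=1):
            long_rows.append((person_id, occasion, value))
    return long_rows

# Hypothetical data: two individuals (level-2 units), three repeated measures each (level-1 units).
wide_data = {"A": [10.0, 12.0, 15.0], "B": [8.0, 9.0, 9.5]}
long_data = wide_to_long(wide_data)

for row in long_data:
    print(row)
```

Each output row carries the identifier of the individual it is nested in, which is the same role
the group identifier plays in a cross-sectional two-level data set.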
Depending on the situation, some individuals may be considered as level-1 units while other
individuals are higher level units. For example, patients and doctors can form a multilevel data
structure. As a doctor treats multiple patients, the doctor may be considered as a level-2 unit,
while patients are level-1 units. Similar situations include teachers and students, coaches and
athletes, as well as interviewers and interviewees.
Finally, a special type of hierarchical data arises from meta-analysis, in which results or
findings from a series of related studies are summarized quantitatively to assess consistency or
inconsistency of study results (Glass, 1976). In meta-analysis data, individuals are nested within
specific studies. However, it is usually difficult or impossible for researchers to obtain the raw
data from all the studies of interest. As such, a special approach is required for multilevel mod-
eling of meta-analysis data. Detailed examples of formulating multilevel models for meta-
analysis are available in many studies (Goldstein et al., 2000; Raudenbush & Bryk, 2002).

1.3 Variables in multilevel data

In addition to the format of multilevel data, choosing the variables that describe the features of
the distinct levels of the hierarchical structure is another important consideration. For multilevel
analysis, the dependent variable is measured at the individual level and explanatory variables are
measured at both individual and group level or at both micro and macro levels. As in regular
statistical analysis, individual explanatory variables usually include socio-demographic
characteristics (e.g., gender, ethnicity, education, age) and other measures such as psychological status and
behaviors, depending upon the analyst’s conceptual model.
Contextual variables are macro level measures. They can be aggregate measures, such as
mean values of some individual measures (e.g., average family income) or proportion of indi-
viduals for a particular characteristic within a particular context (e.g., percentage of minority
population). These contextual variables represent the collective social characteristics of con-
texts/groups. They can be derived from either the sample or obtained separately from other
sources such as census or government statistics.
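Aggregate contextual variables of the kind just described can be derived directly from the
sample. The sketch below computes a group mean and a group proportion; the records, field order,
and function name are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual records: (group_id, family_income, is_minority).
individuals = [
    (1, 40.0, 1), (1, 55.0, 0), (1, 46.0, 0),
    (2, 72.0, 0), (2, 65.0, 1), (2, 70.0, 1), (2, 81.0, 0),
]

def aggregate_by_group(records):
    """Return {group_id: (mean income, proportion minority)} computed from the sample."""
    incomes, flags = defaultdict(list), defaultdict(list)
    for group_id, income, is_minority in records:
        incomes[group_id].append(income)
        flags[group_id].append(is_minority)
    # The mean of a 0/1 indicator is the proportion of cases coded 1.
    return {g: (mean(incomes[g]), mean(flags[g])) for g in incomes}

contextual = aggregate_by_group(individuals)
for group_id, (avg_income, prop_minority) in contextual.items():
    print(group_id, avg_income, round(prop_minority, 3))
```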
Many contextual variables are not aggregations of individual information. Some characteristics
are unique to contexts/groups and cannot be captured at the individual level. For example, in
studies of student school performance, contextual variables could include aggregate measures
such as student gender ratio or average enrollment test scores, and school feature measures such
as school ranking, student-teacher ratio or teacher’s level of experience. The former is an aggre-
gation of student data and can be generated from the sample; the latter represents contextual
aspects of the schools that must be collected from other sources at the school level.
Contextual variables can also be categorical measures. For example, in a multilevel study on
childhood obesity in which children are the level-1 units and neighborhoods are the level-2
units, the researcher may include a dummy variable (1-yes; 0-no) to indicate whether there are
fast-food restaurants in the neighborhood, because easy access to fast food may have a
significant impact on children’s diet and thus on their obesity level.
Conceptually, one might use J-1 dummy variables to represent all the contextual features of
the J groups. This approach, however, is not feasible even with a moderately large number of
groups because too many dummy variables would be needed to represent the groups.
The following tables illustrate a fictional two-level data structure. Table 1.3.1 shows the
individual level outcome variable yij and independent variable xij for the ith individual in the jth
group. There are a total of nj individuals in the jth group, and individuals in all the groups sum up
to the total sample size ∑nj = N. zj is a contextual variable describing the group. The values of
the variable zj for specific groups (j = 1, 2, …, J groups) are shown in Table 1.3.2.

Table 1.3.1 Individual level data

Unit                    Variable
Group    Individual     yij    xij
1        1              5      11
1        2              3      8
⋮        ⋮              ⋮      ⋮
1        n1             2      7
2        1              6      12
2        2              9      10
⋮        ⋮              ⋮      ⋮
2        n2             10     15
3        1              11     15
3        2              15     18
⋮        ⋮              ⋮      ⋮
3        n3             16     20
⋮        ⋮              ⋮      ⋮
J        1              4      7
J        2              5      9
⋮        ⋮              ⋮      ⋮
J        nJ             6      8

Note:
n1, n2, and nj: Number of individuals in the first, second, and the jth groups, respectively. ∑nj = N.
yij and xij: Individual level outcome and independent variable, respectively.

Individual level data and group level data shown in Tables 1.3.1 and 1.3.2 are integrated into
a mixed data set and shown in Table 1.3.3. When merging the data sets, both individual ID and
group ID must be matched for every individual. As such, the same value of the contextual vari-
able zj of group j is assigned to each individual in this group. Consequently, the value of zj does
not vary across individuals within the same group (see Table 1.3.3).

Table 1.3.2 Group level data

Group    zj
1        8.7
2        12.3
3        17.6
⋮        ⋮
J        8.0

Note:
zj: Contextual variable at the group level.

Table 1.3.3 Individual and group level mixed data

Unit                    Variable
Group    Individual     yij    xij    zj*
1        1              5      11     8.7
1        2              3      8      8.7
⋮        ⋮              ⋮      ⋮      ⋮
1        n1             2      7      8.7
2        1              6      12     12.3
2        2              9      10     12.3
⋮        ⋮              ⋮      ⋮      ⋮
2        n2             10     15     12.3
3        1              11     15     17.6
3        2              15     18     17.6
⋮        ⋮              ⋮      ⋮      ⋮
3        n3             16     20     17.6
⋮        ⋮              ⋮      ⋮      ⋮
J        1              4      7      8.0
J        2              5      9      8.0
⋮        ⋮              ⋮      ⋮      ⋮
J        nJ             6      8      8.0

Note:
* The same value of the contextual variable zj of the jth group is assigned to each individual in the group.

The data format for multilevel modeling varies slightly by computer program. Some programs
require separate individual and group data sets, while others work with mixed data formats
like the one shown in Table 1.3.3.
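The merge that produces a mixed data set such as Table 1.3.3 is a many-to-one match on the group
identifier. The sketch below uses only the rows actually printed in Tables 1.3.1 and 1.3.2 for
groups 1 to 3 (elided rows are left out); the function name and the tuple layout are
assumptions made for illustration.

```python
# Rows printed in Table 1.3.1 for groups 1-3: (group, individual, y_ij, x_ij).
individual_rows = [
    (1, 1, 5, 11), (1, 2, 3, 8),
    (2, 1, 6, 12), (2, 2, 9, 10),
    (3, 1, 11, 15), (3, 2, 15, 18),
]

# Rows printed in Table 1.3.2 for groups 1-3: group -> z_j.
group_z = {1: 8.7, 2: 12.3, 3: 17.6}

def merge_levels(rows, z_by_group):
    """Attach each group's z_j to every individual in that group."""
    merged = []
    for group, individual, y, x in rows:
        if group not in z_by_group:
            # An individual without a matching group record cannot be placed in the hierarchy.
            raise KeyError(f"no group-level record for group {group}")
        merged.append((group, individual, y, x, z_by_group[group]))
    return merged

mixed = merge_levels(individual_rows, group_z)
for row in mixed:
    print(row)
```

Because z_j is copied to every member of group j, it is constant within groups in the merged
data, exactly as in Table 1.3.3.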

1.4 Analytical problems with multilevel data

Prior to the availability of multilevel analytical techniques and computer programs, multilevel
data were analyzed separately at a single level, either the individual level or the group level.
For simplicity, only one independent variable is included in each model.

Individual level model:

$$y_{ij} = \beta_0 + \beta_1 x_{ij} + \varepsilon_{ij} \tag{1.4.1}$$

Group level model:

$$\bar{y}_j = \gamma_0 + \gamma_1 \bar{x}_j + \varepsilon_j \tag{1.4.2}$$

Equation 1.4.1 is a model at the individual level in which both dependent and explanatory
variables are measured at the individual level. Equation 1.4.2 is a model at the aggregate or
group level in which both dependent and explanatory variables are measured as the mean values
of the corresponding individual level variables. The underlying problem encountered in such an
approach is that it ignores the multilevel structure of the data. Model 1.4.1 ignores group member-
ship and focuses exclusively on individual-level characteristics and inter-individual variation and
thus ignores the potential importance of group-level features in influencing individual-level out-
comes. Another serious problem with this model is that it assumes the independence of observa-
tions. As discussed in Section 1.1, generally individuals within each group are more alike com-
pared with those in other groups. Thus the within-group observations are unlikely to be
independent.
Model 1.4.1 cannot control for the intraclass correlation coefficient (ICC): it ignores the
within-group observation dependence and thus violates the basic assumption underlying
traditional regression models. As a result, standard errors of parameter estimates would be
biased downwards, inflating the Type I error rate, that is, the chance of falsely rejecting a
true null hypothesis in statistical significance testing (De Leeuw & Kreft, 1986; Snijders &
Bosker, 1999; Hox, 1998, 2002). Even a small ICC can lead to Type I errors (Hox, 1998;
Barcikowski, 1981). Consequently, analyzing multilevel data with traditional regression models
can produce misleading conclusions.
Model 1.4.2 focuses exclusively on inter-group variation and on data aggregated to the group
level. The group-level model eliminates the observation dependence problem, but it ignores the
role of individual-level variables in shaping the outcome and substantially reduces statistical
power because the group-level sample is much smaller than the individual-level sample.
Traditionally, researchers tended to use model results at one level to draw statistical infer-
ences at another level. This has proven incorrect. The results from the two single level models
frequently differ either in magnitude or in sign. The relationships found at the group level are
not reliable predictors for relationships at the individual level, and vice versa. This phenomenon
is known as the ecological fallacy, aggregation bias, or the Robinson effect (Robinson, 1950).
What causes the Robinson effect? Model 1.4.2 analyzes the variation in the outcome variable at
the group level, and aggregating individual measures changes their meaning. If $x_{ij}$ is a
continuous measure (e.g., age), then $\bar{x}_j$ is the mean value of $x_{ij}$ (e.g., mean age)
in the jth group of individuals. If $x_{ij}$ is a dichotomous variable denoting gender (e.g.,
1-male; 0-female), then $\bar{x}_j$ is the proportion of males in the jth group. Clearly,
$x_{ij}$ and $\bar{x}_j$ are different measures, and we should not expect them to have the same
effect in separate models based on either individual or group data.
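The difference between the two single-level analyses can be checked numerically. The sketch
below fits model 1.4.1 to pooled cases and model 1.4.2 to group means using the rows printed in
Table 1.3.1. It rests on loud assumptions: group J is treated as a fourth group, elided rows are
ignored so that each group has three cases, and the helper name ols_slope is introduced here.

```python
from statistics import mean

def ols_slope(xs, ys):
    """Slope of the simple least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# (x_ij, y_ij) pairs printed in Table 1.3.1; key 4 stands in for group J (an assumption).
groups = {
    1: [(11, 5), (8, 3), (7, 2)],
    2: [(12, 6), (10, 9), (15, 10)],
    3: [(15, 11), (18, 15), (20, 16)],
    4: [(7, 4), (9, 5), (8, 6)],
}

# Individual-level model (1.4.1): pool all cases and ignore group membership.
all_x = [x for cases in groups.values() for x, _ in cases]
all_y = [y for cases in groups.values() for _, y in cases]
individual_slope = ols_slope(all_x, all_y)

# Group-level model (1.4.2): regress group means of y on group means of x.
mean_x = [mean(x for x, _ in cases) for cases in groups.values()]
mean_y = [mean(y for _, y in cases) for cases in groups.values()]
group_slope = ols_slope(mean_x, mean_y)

print(f"individual-level slope: {individual_slope:.3f}")
print(f"group-level slope:      {group_slope:.3f}")
```

With these few fictional cases the two slopes come out at roughly 0.98 and 1.04. The gap is
small here, but it shows that the two models estimate different quantities, and with real data
the estimates can differ far more, even in sign.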

A critical analytical problem with multilevel data is the heterogeneity of relationships of in-
dependent variables with the dependent variable. The relationship between individual level
dependent and independent variables may vary across groups. For example, suppose we were
studying academic performance of minority students in high schools. The average academic per-
formance score for the minority students may vary across schools. The effect of minority status
on the dependent variable may vary across schools for a variety of reasons. The proportion of
minority students in a school, a “sample composition contextual variable,” might partially
account for the variation in performance, in addition to other contextual variables at the
school level.

In the past, heterogeneity of micro level relationships was often studied using the following
fixed-effect regression model:

$$y_{ij} = \beta_0 + \beta_1 x_{ij} + \beta_2 \bar{x}_j + \beta_3 x_{ij} \cdot \bar{x}_j + \varepsilon_{ij} \tag{1.4.3}$$

where $y_{ij}$ denotes the performance score for the ith student in the jth school; $x_{ij}$ is a
dummy variable (1-yes; 0-no) indicating the minority status at the student level; and
$\bar{x}_j$ denotes the proportion of minority students in the jth school. In this model the
macro-level (e.g., school level) variables were disaggregated to the micro-level (e.g., student
level). In this example, students are assigned various school-level variables and all students in
the same school are assigned the same value on the school-level variable (e.g., $\bar{x}_j$). The
model is then run at the student level. Slope coefficients $\beta_1$ and $\beta_2$ are the main
effects of the individual level variable $x_{ij}$ and the school level variable $\bar{x}_j$,
respectively. The slope coefficient $\beta_3$ is the interaction of these two variables. If
the cross-level interaction is statistically significant, we conclude that the relation between
student’s minority status and the performance score is influenced or moderated by the proportion
of minority students at the school level. This kind of model takes into account the effects of
contextual variables on the relationships of individual explanatory variables and the dependent
variable at individual level.
One serious problem with this model is that it treats observations as independent though they
are not, thus leading to biased standard error estimates. In addition, in this fixed-effect model,
the variation in the intercept and slope coefficients is assumed to be perfectly explained by
group level variables without error, which is highly unlikely.
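The "without error" assumption becomes visible when model 1.4.3 is written as a pair of
level-specific equations. The following is a sketch under that reading rather than the authors'
own notation: $\bar{x}_j$ denotes the school-level proportion of minority students, and the
school-specific coefficients $\beta_{0j}$ and $\beta_{1j}$ are labels introduced here for
illustration.

```latex
\begin{align*}
y_{ij} &= \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij}
  && \text{(within school } j\text{)} \\
\beta_{0j} &= \beta_0 + \beta_2 \bar{x}_j
  && \text{(intercept: no school-level residual)} \\
\beta_{1j} &= \beta_1 + \beta_3 \bar{x}_j
  && \text{(slope: no school-level residual)} \\
\Rightarrow\; y_{ij} &= \beta_0 + \beta_1 x_{ij} + \beta_2 \bar{x}_j
  + \beta_3 x_{ij}\bar{x}_j + \varepsilon_{ij}
  && \text{(substitution recovers 1.4.3)}
\end{align*}
```

Because the second and third lines carry no residual term, two schools with the same
$\bar{x}_j$ are forced to have identical intercepts and slopes, which is the assumption the text
calls highly unlikely.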
Van de Eeden (1988) and others have examined the heterogeneity of relationships problem
using a two-step approach. In Step 1, they estimated the individual level regression models for
each group separately. The assumption of invariance in the intercept and slope coefficients is
tested by running multi-group regression models with and without equality restrictions on the
coefficients across groups, using structural equation modeling software such as LISREL. If the
coefficients show significant variance across groups, then the second step is to regress each of
the regression coefficients on the contextual variables at the group level.
Although this approach enables analysts to test the significance of variations in the regres-
sion coefficients estimated in Step 1, it has several limitations. OLS models are used at both
Steps 1 and 2, even though it is technically incorrect to use OLS to estimate the standard errors
in the second step (De Leeuw & Kreft, 1986, p. 61). It is also impractical to run a separate
regression for each group when the number of groups is large, and particularly when the number of
observations per group is small. This approach treats the groups as unrelated and ignores the
likelihood that the groups are drawn from a larger population of groups that share common
attributes.
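The mechanics of the two-step approach can be sketched as follows. This illustrates the
procedure being criticized, not a recommended analysis: OLS is used at both steps, the invariance
test is omitted, and the data are the rows printed in Tables 1.3.1 and 1.3.2 with group J treated
as a fourth group and elided rows ignored (assumptions). The helper name ols_line is hypothetical.

```python
from statistics import mean

def ols_line(xs, ys):
    """Return (intercept, slope) of the simple least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

# group -> (contextual variable z_j, [(x_ij, y_ij), ...]); key 4 stands in for group J.
groups = {
    1: (8.7, [(11, 5), (8, 3), (7, 2)]),
    2: (12.3, [(12, 6), (10, 9), (15, 10)]),
    3: (17.6, [(15, 11), (18, 15), (20, 16)]),
    4: (8.0, [(7, 4), (9, 5), (8, 6)]),
}

# Step 1: a separate individual-level regression within each group.
step1 = {
    g: ols_line([x for x, _ in cases], [y for _, y in cases])
    for g, (_, cases) in groups.items()
}

# Step 2: regress the group-specific slopes on the contextual variable z_j.
z_values = [z for z, _ in groups.values()]
slopes = [step1[g][1] for g in groups]
_, slope_on_z = ols_line(z_values, slopes)

for g, (intercept, slope) in step1.items():
    print(f"group {g}: intercept={intercept:.3f}, slope={slope:.3f}")
print(f"change in within-group slope per unit of z: {slope_on_z:.3f}")
```

With only three cases per group the Step 1 slopes are very unstable, which illustrates the
practical limitation noted above for small groups.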
Given the shortcomings of traditional methods, a new statistical method, called multilevel
modeling, is needed. Multilevel models are explicitly designed to analyze hierarchically
structured data, modeling variables at both micro and macro levels simultaneously without
aggregation or disaggregation. In the following section we will discuss the advantages as well as
the limitations of multilevel models for multilevel data analysis.

1.5 Advantages and limitations of multilevel modeling

The problems encountered in traditional multilevel data analysis can be readily solved by
multilevel modeling. First, the Robinson effect is avoided because multilevel models
simultaneously analyze data obtained from both the individual level and the context/group level.
The effects of individual and contextual variables are simultaneously examined in a multilevel
model. In addition, the assumption of observation independence is not required in multilevel
modeling because multilevel models are designed to measure and thus account for ICC in
hierarchically structured data. The inflated Type I error in statistical testing that results
from observation dependence is corrected in multilevel modeling. We will discuss this issue in
detail in the section on model estimation in Chapter 2.
Multilevel modeling advances the theoretical and methodological aspects of social science
studies. Multilevel models provide a convenient analytical framework with concordance
between theoretical approaches and statistical analysis for studying data that have a hierarchical
structure. This framework enables researchers to test multilevel theories statistically by system-
atically analyzing the effects of micro and macro factors on outcome measures, as well as test-
ing cross-level interactions among macro and micro level variables. With multilevel modeling,
researchers are able to decompose variation in the outcome measure into within and between
group variations, and to understand where effects on the outcome measure are occurring, and
how the effects of individual level variables on outcome measures are moderated by the group
level variables.
These features of multilevel models facilitate a wide range of applications in social science
studies. For example, women’s child preference is one of the most important micro level vari-
ables in population studies, while the intensity/efficacy of family planning programs is one of
the most important macro level variables that influence individual fertility or the number of
children ever born (CEB) to a woman. While both the micro and the macro variables affect
women’s fertility, the strength of the effect of women’s child preference on fertility is usually
strongly moderated by the intensity/efficacy of family planning programs in the social setting
where the women live. At the same time, the strength of the effect of family planning programs
on individual fertility may vary among individuals. Programs may be more effective for women who
have lower child preference.
Multilevel modeling is a particularly useful analytical approach when data are sparse. For
example, in a study of minority student performance, regression models cannot provide valid
statistical inferences if the number of minority students in the sample from a school is too small.
However, if a number of schools from which the students are sampled are available, then multi-
level models can utilize information from all sampled schools, thus compensating for sparseness
of data in particular schools with a small number of minority students. In Chapter 2 we will dis-
cuss the “empirical Bayes” estimator used in multilevel model estimation in which data from
each group are combined with the data summed over all groups. Model parameters are estimated
using individual group data separately, as well as using the entire sample with all the groups.
This approach, called “shrinkage estimation,” enables the model to “borrow strength”
from data drawn from all the groups to “support” statistical estimation for groups with too few
cases. Sizes of groups vary considerably in most quasi-experimental studies. Therefore, sparse-
ness is likely to be the case for some groups. Multilevel modeling is an effective method for
analysis of such data.
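
The idea of “borrowing strength” can be illustrated with a small numerical sketch in Python. This is not the estimator as implemented in any particular package. It assumes that the between-group variance (`tau2`) and the within-group variance (`sigma2`) are already known, and it uses the pooled sample mean as the overall mean. The function name `shrunken_means` and the data are hypothetical. Each group mean is weighted by its reliability, tau2 / (tau2 + sigma2 / n), so that groups with few cases are pulled more strongly toward the overall mean.

```python
from statistics import mean


def shrunken_means(groups, tau2, sigma2):
    """Shrinkage estimates of group means under known variance components.

    groups : dict mapping a group label to a list of outcome values
    tau2   : between-group variance (assumed known for this illustration)
    sigma2 : within-group variance (assumed known for this illustration)
    """
    # Overall mean from the pooled sample (a simplification).
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_values)

    estimates = {}
    for label, ys in groups.items():
        n = len(ys)
        # Reliability of the observed group mean: close to 1 for large
        # groups, close to 0 for small groups.
        reliability = tau2 / (tau2 + sigma2 / n)
        # Weighted combination of the group's own mean and the overall mean.
        estimates[label] = reliability * mean(ys) + (1 - reliability) * grand_mean
    return estimates


# Hypothetical data: group A has eight cases, group B has only two.
groups = {
    "A": [50, 54, 58, 62, 46, 54, 50, 58],
    "B": [70, 74],
}
estimates = shrunken_means(groups, tau2=25.0, sigma2=100.0)

for label, value in estimates.items():
    print(label, "observed mean:", mean(groups[label]), "shrunken estimate:", round(value, 2))
```

With these illustrative numbers, the estimate for the small group B moves much further from its observed mean toward the overall mean than the estimate for group A does. This is the sense in which sparse groups are supported by data from all the groups.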
Finally, multilevel models can be readily extended to study growth or change trajectories of
outcome measures over time using longitudinal data. In this case the multilevel model becomes
a growth model (GM) which is clearly superior to traditional “repeated measure” analysis meth-
ods (Raudenbush & Bryk, 2002). The GM examines not only intra-individual changes over time,
but also inter-individual variations in these changes. Other benefits of GM include: 1) GM
requires neither balanced data (i.e., equal numbers of cases at each time point) nor equal intervals
between time points. 2) Missing values caused by attrition are allowed under the assumption of
“missing at random” (MAR). 3) Time-varying covariates can be readily included. 4) The
association between the rate of outcome change and the initial level of the outcome measure
can be assessed. 5) GM can be readily expanded to a model with three or more levels by
including higher level units to which individuals belong. We will provide a detailed
discussion of the GM in Chapter 4.
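
As a sketch of how a growth model fits into the multilevel framework, a simple linear GM can be written with repeated measures at level 1 and individuals at level 2. The notation below is assumed for illustration and may differ from that used in Chapter 4: $Y_{ti}$ is the outcome of individual $i$ at occasion $t$, and $T_{ti}$ is the time of that measurement, coded so that $T_{ti} = 0$ at the initial occasion.

```latex
\begin{align*}
\text{Level 1 (occasion):}\quad
  & Y_{ti} = \pi_{0i} + \pi_{1i}\,T_{ti} + e_{ti} \\
\text{Level 2 (individual):}\quad
  & \pi_{0i} = \beta_{00} + r_{0i} \\
  & \pi_{1i} = \beta_{10} + r_{1i} \\
\text{Random effects:}\quad
  & \begin{pmatrix} r_{0i} \\ r_{1i} \end{pmatrix}
    \sim N\!\left(
      \begin{pmatrix} 0 \\ 0 \end{pmatrix},
      \begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix}
    \right)
\end{align*}
```

Because $T_{ti}$ carries both subscripts, individuals may be measured at different times and on different numbers of occasions. Under this coding, $\tau_{11}$ captures inter-individual variation in the rate of change, and $\tau_{01}$ is the covariance between the initial level and the rate of change.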
Of course, no statistical method is a panacea for data analysis in social science studies. Limi-
tations of multilevel models include, but are not limited to:
1) Multilevel models are more complex than ordinary regression models because outcome
variation at both micro and macro levels is modeled simultaneously. As a result, the number of
model parameters can become large, making the model less parsimonious.
2) Very often the higher level units or groups are selected on the basis of convenience rather
than being randomly sampled from a well-defined population. In these cases it may be
incorrect to generalize model parameter estimates to other groups.
3) Very often, contextual variables are measured by aggregating individual data within
groups in the sample rather than the groups in the target population. If the number of observa-
tions in a group is not large enough, then the aggregated composition measures of the group
may be biased and result in misleading group information.
4) Individuals are mobile, and it is reasonable to expect that members of a group may not
have entered the group at the same time. Length of exposure to group influences may have sys-
tematic effects upon individuals. One possible solution to this problem is to control for duration
of group membership in data analysis. Unfortunately, information on membership duration is
frequently unavailable. We are then left with no alternative but to assume that everyone in a
group is affected by the group in the same way and to the same degree.
5) As in traditional regression analysis, all explanatory variables are typically assumed to
have no measurement error. In fact, measurements very often include error and may
vary from one measurement to the next. For example, measurements of a person’s blood
pressure are likely to vary, even under the same conditions. Some of this variation is simply random
measurement error. Structural equation models (SEM) are designed to assess underlying
conceptual relationships while controlling for measurement error (Jöreskog, 1971, 1977; Jöreskog
& Sörbom, 1979; Bentler, 1983; Bollen, 1989). However, the multilevel models we introduce in this
book assume that all explanatory variables are measured without error. Readers who are
interested in multilevel structural equation modeling, which deals with multilevel data and
measurement error simultaneously, are referred to Muthén & Muthén (1998–2010).
6) In multilevel modeling, researchers often encounter data with a relatively small number
of higher level units or groups. As a result of this and/or non-normality of the residuals, model
parameter estimates, particularly the variance components and standard errors of parameter
estimates at the group level, may be biased (Bussing, 1993; Van der Leeden & Bussing, 1994;
Van der Leeden et al., 1997; Hox, 1998). We will discuss how to deal with this issue in Chapter 6.
7) The multilevel data considered in this book are assumed to be completely nested. That is,
each individual belongs to only one group. If individuals are nested within more than one group,
then mixed models with crossed random effects should be applied (Raudenbush, 1993). This topic
is beyond the scope of this book.

1.6 Computer software for multilevel modeling

It was technically impossible to fit multilevel models until the early 1980s, when Dr. William
Mason and colleagues at the University of Michigan Population Studies Center developed the
GENMOD computer program. This software was developed to run in a DOS environment and
was unfortunately never upgraded to MS Windows. However, a growing number of computer
programs have been developed to fit multilevel models in the past two decades. In recent years
the major statistical software companies, such as Statistical Analysis System (SAS) and Statistical
Package for Social Sciences (SPSS), have incorporated procedures or modules for multilevel
modeling. Each of the available programs has strengths and weaknesses. The choice of
computer software is a matter of personal preference. The major computer software currently
available for multilevel modeling includes, but is not limited to:
• HLM: This is the first commercial computer software designed for multilevel modeling.
This user-friendly program was initially developed in the mid-1980s and has been in active
development ever since. HLM was a leading package during the development of multilevel
modeling in the 1990s, and has been widely used since. The program was developed by Drs. S.
W. Raudenbush and A. S. Bryk and is distributed by Scientific Software International (SSI)
in the U.S.A. (www.ssicentral.com).
• MLwiN: This is a popular special-purpose computer program for multilevel modeling
developed by Dr. Harvey Goldstein and his colleagues in the Centre for Multilevel Modelling and
other institutes in the UK. MLwiN was first released in 1997, and Version 2 was released in beta
form in 2003 (www.mlwin.com). The program provides a system for the specification and
analysis of a wide range of multilevel models. A graphical user interface (GUI) is available
for model specification, along with plotting, diagnostic and data manipulation tools.
• Mixed-Up Suite and SuperMix: This is a family of standalone programs including MIXREG,
MIXOR, MIXNO, MIXPREG, and MIXGSUR for multilevel modeling for continuous, bi-
nary, ordinal, nominal, count, or survival outcome measures. The software was developed by
Drs. Donald Hedeker and Robert D. Gibbons of the University of Illinois at Chicago,
U.S.A., and Version 1.0 was released by SSI in 2008 (www.ssicentral.com/supermix).
• aML: Computer software developed by American economists Lee Lillard and Stan Panis that
became commercially available in 2000. The software offers a wide range of models for
multilevel data analysis. It extends multilevel modeling to fitting econometric models such as
simultaneous equation models with multilevel data. aML is a product of EconWare, a California
corporation, U.S.A. Full details on ordering aML can be found at www.applied-ml.com.
• EGRET: Computer software designed for analyzing biomedical and epidemiological data. It
has been widely used by epidemiologists and biostatisticians for fitting generalized linear
models with and without random effects and survival models. The software was originally
developed in 1999 at the School of Public Health of the University of Washington, U.S.A. The
current Windows version is developed by CYTEL Software Corporation of Cambridge,
U.S.A. (www.cytel.com/products/egret).
• LISREL: The earliest, and still very popular, statistical package for structural equation
modeling (SEM)². LISREL began implementing multilevel modeling in 1999, with version 8.30.
MULTILEV fits multilevel linear and nonlinear models to multilevel data from simple
random and complex survey designs. It allows for models with continuous and categorical
response variables. LISREL is distributed by Scientific Software International (SSI) in
the U.S.A. (www.ssicentral.com).

2 Structural equation modeling is a more generalized statistical analysis approach. It provides a powerful analytic
framework that allows for the simultaneous estimation of the relations between a set of observed variables and a smaller set
of underlying latent constructs, as well as the relations among the latent constructs (Bentler, 1980, 1983; Jöreskog, 1971a,
1971b; Jöreskog & Sörbom, 1979; Bollen, 1989).

• Mplus: Initially released in 1998, Mplus provides a generalized modeling framework for
structural equation modeling with continuous and categorical observed variables, as well as
continuous and categorical latent variables. Mplus Version 3 and subsequent versions include a
multilevel extension of the full modeling framework. Multilevel structural equation models
can be readily run in Mplus. Detailed information about Mplus is available at www.statmodel.com.
• STATA: A general-purpose statistical program that has been used increasingly in recent years
(www.stata.com). STATA provides a broad statistical base for data analysis. Its treatment of
generalized linear mixed models is capable of implementing multilevel modeling for
continuous, binary, and count outcomes, as well as crossed random effects models.
• SPSS: Statistical Package for Social Sciences (SPSS), another major statistical package, has
supported multilevel modeling with the Linear Mixed Models procedure in the Advanced Models
module since version 11.5. With SPSS, most commands are available either through the
graphical user interface or through the use of command syntax. SPSS has been distributed by IBM
since its acquisition of SPSS Inc. in 2009.
• SAS: The internationally recognized Statistical Analysis System (SAS or SAS/STAT) has
provided several procedures adaptable to multilevel modeling since Release 6.0. PROC MIXED
has been improved and is now widely used for multilevel modeling of continuous
outcome measures. Two other procedures, PROC GLIMMIX and PROC NLMIXED, are
designed to fit multilevel models for categorical outcome measures and count data (e.g.,
multilevel logit, probit, ordered logit, multinomial logit, Poisson, and ZIP models). Informa-
tion about obtaining SAS is available at www.sas.com.
• A variety of other statistical packages, such as LIMDEP (www.limdep.com), GenStat,
SYSTAT (www.systat.com), S-Plus (www.insightful.com),
WINBUGS (www.mrc-bsu.cam.ac.uk/bugs), and R (cran.r-project.org) also have functions
for conducting multilevel modeling.
Throughout this book, we use SAS, specifically SAS PROC MIXED, PROC NLMIXED, and
PROC GLIMMIX, for analysis of continuous and categorical data in multilevel models.
Copyright © 2011. Walter de Gruyter GmbH. All rights reserved.
