Multilevel Models Applications Using SAS® - (1 Introduction)
Introduction
Over the past two decades multilevel models (Mason et al., 1983; Bryk & Raudenbush, 1992;
Raudenbush & Bryk, 2002; Goldstein, 1987, 1995) have gained popularity in various research
fields including education, psychology, sociology, economics, and public health. Multilevel
models extend ordinary least squares (OLS) regression to analyze multilevel or hierarchically
structured data that involve both micro and macro observation information. Multilevel
models also appear under different names in the literature, including hierarchical linear models
(Bryk & Raudenbush, 1992; Raudenbush & Bryk, 2002), random-effect models (Laird & Ware,
1982), random coefficient models (De Leeuw & Kreft, 1986), variance component models
(Dempster et al., 1981), mixed models (Longford, 1987), and empirical Bayes models (Strenio et
al., 1983).
Prior to the development of formal statistical methodology for multilevel models, sociolo-
gists engaged in contextual or multilevel analysis of hierarchically structured data. In the late
1950s and early 1960s Lazarsfeld (1961) and Merton (1957) at Columbia University began to
assess contextual effects on individual behavior. The 1970s witnessed a significant jump in
analysis of multilevel data in education (Barr & Dreeben, 1977; Block & Burns, 1976; Bronfen-
brenner, 1976; Burstein, 1980; Cronbach, 1976; Herriot & Muse, 1973; Pedhazur, 1975; Snow,
1976; Spady, 1973; Walberg, 1976). In a systematic study of contextual analysis, Boyd and
Iversen (1979) discussed how to model multilevel data with micro-macro models, i.e., to
formulate a within-group regression model at the individual level and then relate the within-group
regression coefficients to contextual variables that describe the groups. Although their models
addressed multilevel observations, estimation was conducted using ordinary least squares (OLS)
techniques, which are inappropriate for multilevel analysis.
Statistical theories of multilevel models and corresponding computer programs were developed
in the early 1980s by sociologists and demographers. Models were applied to analyze the
large-scale multilevel data of the United Nations’ World Fertility Survey (WFS) (Hermalin & Mason,
1980; Mason et al., 1983). Further methodological and substantive work in educational studies
and the user-friendly Windows-based computer programs by Bryk & Raudenbush (1992) and
Goldstein (1987, 1995) have popularized multilevel models. Multilevel models are now ap-
Copyright © 2011. Walter de Gruyter GmbH. All rights reserved.
A key concept in social sciences is that a society can be described in hierarchical structures. By
hierarchy, we mean that units at a lower level are nested within or grouped into units at a higher
level. People should be treated not as isolated individuals but as social beings. Individuals are
members of many different types of groups and are embedded in different social contexts. For
example, individuals belong to families, neighborhoods, organizations and communities. Awareness
has been mounting that individual behaviors and outcomes are affected not only by individual
Wang, Jichuan, et al. Multilevel Models : Applications Using SAS®, Walter de Gruyter GmbH, 2011. ProQuest Ebook Central,
http://ebookcentral.proquest.com/lib/pretoria-ebooks/detail.action?docID=835473.
Created from pretoria-ebooks on 2024-07-23 08:36:07.
characteristics but also by the social contexts in which they are embedded (Lazarsfeld,
1961; Merton, 1957; Bronfenbrenner, 1976; Blalock, 1984; Iversen, 1991). Davis’ so-called
“frog-pond” theory proposes that individual students evaluate personal ability relative to
in-groups and pay little attention to out-groups (Davis, 1966). A moderately intelligent student
(a medium size “frog”) in a highly intelligent school (a large “pond”) may become discouraged
and thus become an under-achiever, while the same student in a considerably less intelligent
school (a small “pond”) may gain confidence and become an over-achiever. Alternatively, a
moderately intelligent student might be motivated to study harder in a highly intelligent school
and become more successful. In other words, the effect of an individual student’s intelligence on
his/her achievement may be influenced by specific features in the school he/she attends. In addi-
tion to composite measures (e.g., average intelligence level), student academic achievement may
also be influenced by a variety of school level variables such as student/teacher ratio, teachers’
work experience, school facilities, budget, etc.
The relationships between academic achievement and individual-level variables may also vary across
schools. For example, differences in academic achievement among ethnic groups may be larger
in some schools and smaller in others. In such cases the extent of the effect of ethnicity on aca-
demic achievement may relate to identifiable school-level characteristics. On the other hand, the
school’s effect on student academic achievement may also vary among individuals. For exam-
ple, while students usually benefit from smaller student/teacher ratios, these ratios and other
school features are unlikely to influence all types of students equally. Cross-level interactions in
multilevel modeling enable us to assess the degree to which relationships between individual
explanatory and outcome variables are moderated by group level variables.
Good examples of this class of multilevel studies can be drawn from population studies. It is
well-known that fertility levels vary among countries. In general, fertility is low in developed
countries and high in developing countries. Fertility has multilevel determinants. Individual
fertility behavior is determined not only by individual characteristics at the micro level, such as
a couple’s preference for children, ethnicity, education, and income, but also by features of the
social contexts or environments where the individuals live at the macro level, such as culture or
subculture, GDP, average education level, and, in particular, the intensity/efficacy of family
planning programs (FPP). Assessing cross-level interactions
is very important in fertility studies. FPP analysts and officers are interested in knowing: What
individual characteristics influence individual fertility behaviors? Do family planning programs
work? How do differences in program implementation among various locations or macro-level
units affect individual fertility behaviors? And, for what classes of people are family planning
programs most effective? Multilevel modeling helps us to gauge how family planning programs
interact with individual characteristics to affect fertility behavior.
Public health studies indicate that individual health behaviors and outcomes are jointly
determined by individual and environmental factors (Von Korff et al., 1992; Duncan et al., 1996;
Diez-Roux, 1998; Wang et al., 1998). For example, initiation of smoking among adolescents
may be associated with gender, ethnicity, school achievement and family background, as well as
the social setting in which the individual is embedded, such as geographic location, prevalence
of smoking, and restrictions on smoking in public areas.
From these examples we can see that research interest in social science studies often centers
on questions like: 1) which explanatory variables measured at the individual level affect the
individual-level dependent variable, and how; 2) which variables measured at the context or
group level affect the individual-level dependent variable, and how; 3) how the relationships
between the individual-level explanatory and dependent variables vary across contexts or groups;
and 4) which group-level variables moderate the effects of individual-level variables on the
individual-level dependent variable, and how.
To answer these questions, both micro and macro data are needed. A common challenge in
multilevel data is within-group observation dependence. That is, individuals in the same group
tend to be alike and share similar attitudes and behaviors relative to individuals from other
groups. For example, people living in the same neighborhood may share similarities with each
other because they are influenced by the same neighborhood socio-economic characteristics.
This may be true even for groupings that are only recently established. For example, students
who are in the same school may not be associated with each other before they get into the same
school. Once students enter a school, they become members of the same group. Once groupings
are established, individuals in the group will tend to share traits that differentiate them from
members of other groups. In statistical terms we say there exist within-group homogeneity and
between-group heterogeneity in the hierarchically structured data.
Traditional analytical methods such as ordinary least squares (OLS) regression assume
that observations are independently and identically distributed (IID). The same assumption is
required for generalized linear models. Violation of this assumption will result in incorrect
statistical inference. Chapter 2 demonstrates how observation dependence can be
measured using an intraclass correlation coefficient (ICC). Studies show that even a small ICC
can substantially inflate the Type I error rate in statistical testing, that is, the probability of
falsely rejecting a true null hypothesis. Dealing with a nonzero ICC has been a challenge in
statistical analysis of multilevel data for many years.
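As a rough illustration of how observation dependence can be quantified (Chapter 2 gives the
book's own treatment), the sketch below estimates the ICC with the one-way ANOVA estimator for
equally sized groups and converts it into the design effect 1 + (n - 1) x ICC, the approximate
factor by which the sampling variance of a mean is inflated under clustering. The data values,
the function names `anova_icc` and `design_effect`, and the choice of the ANOVA estimator are
assumptions made for this example; they are not taken from the text.

```python
from statistics import mean


def anova_icc(groups):
    """One-way ANOVA estimate of the ICC; this sketch assumes equal group sizes."""
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    num_groups = len(groups)
    grand_mean = mean(v for g in groups for v in g)
    # Between-group mean square: variation of group means around the grand mean.
    msb = n * sum((mean(g) - grand_mean) ** 2 for g in groups) / (num_groups - 1)
    # Within-group mean square: variation of individuals around their group mean.
    msw = sum((v - mean(g)) ** 2 for g in groups for v in g) / (num_groups * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)


def design_effect(icc, n):
    """Approximate variance inflation for clusters of size n."""
    return 1 + (n - 1) * icc


# Hypothetical outcome values for four groups of three individuals each.
outcome_by_group = [[5, 3, 2], [6, 9, 10], [11, 15, 16], [4, 5, 6]]
icc = anova_icc(outcome_by_group)
deff = design_effect(icc, n=3)
```

With these hypothetical values the estimated ICC is about 0.85, so a test that treats the twelve
observations as independent would be working with far less information than it assumes.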
Multilevel models provide an appropriate analytical framework to deal with observation
dependence in multilevel data. More importantly, multilevel models permit us to explore the
nature and extent of the relationships at both micro and macro levels, as well as across levels.
Hierarchical social structures naturally give rise to hierarchical or multilevel data in which the
lower level units are nested or grouped in the next higher level units. Such hierarchically struc-
tured data exist in many real life situations. The simplest and the most often used multilevel data
are collected at two levels (i.e., one micro level and one macro level). For example, a study on
student academic achievement may collect information at the student level and at the school
level for multilevel modeling.
Multilevel designs can be readily extended to more than two levels. For example, students
are nested in classes, and classes nested in schools; thus observation units lie at three levels of a
hierarchy: the level-1 units are students; the level-2 units are classes; and the level-3 units are
schools. The lowest level units (e.g., students) are the micro-level units or individual units,
while the higher level units are the macro level units or context/group units.
Hierarchically structured data may arise in a variety of forms and from a variety of situa-
tions, either observed or by design. Survey data obtained from a complex sampling design are
hierarchically structured. Multi-stage or cluster sampling is conducted to take full advantage of
information from a hierarchy of study units. The first stage or “Primary Sampling Unit” (PSU)
is often a well-defined geographic unit (e.g., county in a state). Once the PSUs are randomly
selected, further stages of random selection are carried out within the PSUs (e.g., districts in a
county) until the final units (e.g., households or individuals) are selected (Kalton, 1983). As a
result, the survey data collected from a cluster sampling design have a hierarchical structure in
which individuals are nested within higher level sampling units.
Hierarchical data also frequently arise from experimental designs. For example, clinical
trials may be carried out in randomly selected clinics or medical centers, thus creating data that
have a hierarchical structure. However, in practice clinics and medical centers are often not ran-
domly selected. This is also true in many multi-site research projects. For example, a national
multi-site research project on public health is often conducted with many project sites located in
different regions, states, or cities. Very often, rather than being randomly selected, project sites
are selected based on the quality of the grant proposals, the level of seriousness of the health
problems under study, or the feasibility of conducting a successful study in a specific site.
Although the distribution of the project sites may be carefully taken into consideration, the sites
are not randomly selected and thus may not be representative of the corresponding higher level
units in the targeted population. As a result, inferences based on multilevel analysis of
non-randomly selected study sites should be interpreted with caution.
Hierarchical data structures are not confined to cross-sectional settings with multiple units.
Individuals may also be higher level units. For example, in longitudinal or panel studies indi-
viduals are followed up over time. Data are collected repeatedly from the same individuals.
Such longitudinal data can be considered hierarchically structured. The repeated measures for
each individual at different times are level-1 observation units, and individuals become the
level-2 units. A third level can be introduced into the data structure, if the higher level units
(e.g., clinics) in which individuals are nested are available.
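The nesting of repeated measures within individuals can be made concrete by reshaping
longitudinal records from a "wide" layout (one row per individual) into a "long" layout (one row
per measurement occasion). The following is a minimal sketch under stated assumptions: the
records, the column names (`id`, `clinic`, `y_t1`, ...) and the helper `to_long` are hypothetical
and serve only to show the level-1 (occasion), level-2 (individual) and optional level-3 (clinic)
identifiers side by side.

```python
# Hypothetical wide-format data: one row per individual, one outcome column per wave.
wide_rows = [
    {"id": 1, "clinic": "A", "y_t1": 120, "y_t2": 118, "y_t3": 115},
    {"id": 2, "clinic": "A", "y_t1": 135, "y_t2": None, "y_t3": 130},  # missed wave 2
    {"id": 3, "clinic": "B", "y_t1": 128, "y_t2": 126, "y_t3": None},  # left after wave 2
]


def to_long(rows, waves=(1, 2, 3)):
    """Return one record per observed measurement occasion (the level-1 unit)."""
    long_rows = []
    for row in rows:
        for wave in waves:
            value = row.get(f"y_t{wave}")
            if value is None:
                continue  # unobserved occasions simply contribute no level-1 record
            long_rows.append(
                {"clinic": row["clinic"], "id": row["id"], "time": wave, "y": value}
            )
    return long_rows


long_rows = to_long(wide_rows)
occasions_per_person = {}
for record in long_rows:
    occasions_per_person[record["id"]] = occasions_per_person.get(record["id"], 0) + 1
```

In the long layout each individual contributes as many level-1 records as he or she has observed
occasions, so individuals with different numbers of measurements can sit in the same data set.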
Depending on the situation, some individuals may be considered as level-1 units while other
individuals are higher level units. For example, patients and doctors can form a multilevel data
structure. As a doctor treats multiple patients, the doctor may be considered as a level-2 unit,
while patients are level-1 units. Similar situations include teachers and students, coaches and
athletes, as well as interviewers and interviewees.
Finally, a special type of hierarchical data arises from meta-analysis, in which results or
findings from a series of related studies are summarized quantitatively to assess consistency or
inconsistency of study results (Glass, 1976). In meta-analysis data, individuals are nested within
specific studies. However, it is usually difficult or impossible for researchers to obtain the raw
data from all the studies of interest. As such, a special approach is required for multilevel mod-
eling of meta-analysis data. Detailed examples of formulating multilevel models for meta-
analysis are available in many studies (Goldstein et al., 2000; Raudenbush & Bryk, 2002).
In addition to the format of multilevel data, choosing the variables that describe the features of
the distinct levels of the hierarchical structure is another important consideration. For multilevel
analysis, the dependent variable is measured at the individual level and explanatory variables are
measured at both individual and group level or at both micro and macro levels. As in regular sta-
studies of student school performance, contextual variables could include aggregate measures
such as student gender ratio or average enrollment test scores, and school feature measures such
as school ranking, student-teacher ratio or teacher’s level of experience. The former is an aggre-
gation of student data and can be generated from the sample; the latter represents contextual
aspects of the schools that must be collected from other sources at the school level.
Contextual variables can also be categorical measures. For example, in a multilevel study on
childhood obesity in which children are the level-1 units and neighborhoods are the level-2 units,
the researcher may include a dummy variable (1-yes; 0-no) to indicate whether there are
fast-food restaurants in the neighborhood, because easy access to fast food may have a significant
impact on children’s diet and thus on their obesity level.
Conceptually, one might use J-1 dummy variables to represent all the contextual features of
the J groups. This approach, however, is not feasible even with a moderately large number of
groups because too many dummy variables would be needed to represent the groups.
The following tables illustrate a fictional two-level data structure. Table 1.3.1 shows the
individual level outcome variable yij and independent variable xij for the ith individual in the jth
group. There are a total of nj individuals in the jth group, and individuals in all the groups sum up
to the total sample size ∑nj = N. zj is a contextual variable describing the group. The values of
the variable zj for specific groups (j = 1, 2, …, J groups) are shown in Table 1.3.2.
Table 1.3.1 Individual level data

Unit                     Variable
Group    Individual      yij    xij
1        1               5      11
1        2               3      8
1        n1              2      7
2        1               6      12
2        2               9      10
2        n2              10     15
3        1               11     15
3        2               15     18
3        n3              16     20
J        1               4      7
J        2               5      9
J        nJ              6      8

Note:
n1, n2, and nj: number of individuals in the first, second, and jth groups, respectively. ∑nj = N.
yij and xij: individual level outcome and independent variable, respectively.
Individual level data and group level data shown in Tables 1.3.1 and 1.3.2 are integrated into
a mixed data set and shown in Table 1.3.3. When merging the data sets, both individual ID and
group ID must be matched for every individual. As such, the same value of the contextual vari-
able zj of group j is assigned to each individual in this group. Consequently, the value of zj does
not vary across individuals within the same group (see Table 1.3.3).
Table 1.3.2 Group level data

Group    zj
1        8.7
2        12.3
3        17.6
J        8.0

Note:
zj: contextual variable at the group level.
Table 1.3.3 Mixed data set

Unit                     Variable
Group    Individual      yij    xij    zj*
1        1               5      11     8.7
1        2               3      8      8.7
1        n1              2      7      8.7
2        1               6      12     12.3
2        2               9      10     12.3
2        n2              10     15     12.3
3        1               11     15     17.6
3        2               15     18     17.6
3        n3              16     20     17.6
J        1               4      7      8.0
J        2               5      9      8.0
J        nJ              6      8      8.0

Note:
* The same value of the contextual variable zj of the jth group is assigned to each individual in the group.
The data format for multilevel modeling varies slightly by computer program. Some programs
require separate individual and group data sets while others work with mixed data formats
like the one shown in Table 1.3.3.
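The merge that produces the mixed data set can be sketched in a few lines. The values below are
those of Tables 1.3.1 and 1.3.2; the string labels for groups and individuals (including "J" and
"n1"), the variable names, and the helper `merge_levels` are assumptions introduced for
illustration, and a real analysis would use whatever ID variables the data sets actually carry.

```python
# Individual level records from Table 1.3.1: (group, individual, y, x).
individual_rows = [
    ("1", "1", 5, 11), ("1", "2", 3, 8), ("1", "n1", 2, 7),
    ("2", "1", 6, 12), ("2", "2", 9, 10), ("2", "n2", 10, 15),
    ("3", "1", 11, 15), ("3", "2", 15, 18), ("3", "n3", 16, 20),
    ("J", "1", 4, 7), ("J", "2", 5, 9), ("J", "nJ", 6, 8),
]

# Group level records from Table 1.3.2: group -> contextual variable z.
z_by_group = {"1": 8.7, "2": 12.3, "3": 17.6, "J": 8.0}


def merge_levels(rows, group_values):
    """Attach each group's z value to every individual in that group."""
    merged = []
    for group, individual, y, x in rows:
        if group not in group_values:
            # An unmatched group ID would silently lose the contextual variable.
            raise KeyError(f"no group level record for group {group!r}")
        merged.append(
            {"group": group, "individual": individual, "y": y, "x": x,
             "z": group_values[group]}
        )
    return merged


mixed_rows = merge_levels(individual_rows, z_by_group)
```

Every record in `mixed_rows` now carries the z value of its own group, which is why z is constant
within a group and varies only between groups.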
Prior to the availability of multilevel analytical techniques and computer programs, multilevel
data were analyzed separately at a single level, either the individual level or the group level¹:
Individual level model:
$$y_{ij} = \beta_0 + \beta_1 x_{ij} + \varepsilon_{ij} \tag{1.4.1}$$
In the past, heterogeneity of micro level relationships was often studied using the following
fixed-effect regression model:
$$y_{ij} = \beta_0 + \beta_1 x_{ij} + \beta_2 x_j + \beta_3 x_{ij} \cdot x_j + \varepsilon_{ij} \tag{1.4.3}$$
where yij denotes the performance score for the ith student in the jth school; xij is a dummy
variable (1-yes; 0-no) indicating minority status at the student level; and xj denotes the
proportion of minority students in the jth school. In this model the macro-level (e.g., school
level) variables are disaggregated to the micro level (e.g., student level): all students in the
same school are assigned the same value on each school-level variable (e.g., xj). The model is
then run at the student level. The slope coefficients β1 and β2 are the main effects of the
individual level variable xij and the school level variable xj, respectively. The slope coefficient
β3 is the effect of the interaction between these two variables. If the cross-level interaction is
statistically significant, we conclude that the relation between a student’s minority status and
the performance score is influenced or moderated by the proportion of minority students at the
school level. This kind of model takes into account the effects of contextual variables on the
relationships between individual explanatory variables and the dependent variable at the
individual level.
One serious problem with this model is that it treats observations as independent though they
are not, thus leading to biased standard error estimates. In addition, in this fixed-effect model,
the variation in the intercept and slope coefficients is assumed to be perfectly explained by
group level variables without error, which is highly unlikely.
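The second problem can be made explicit with a short derivation. The notation below (the
school-specific coefficients β0j and β1j, the γ coefficients, and the group level error terms u0j
and u1j) is an assumption borrowed from common multilevel notation and does not appear in this
section; the sketch only shows that model (1.4.3) is what results when the group level equations
carry no error terms.

```latex
\begin{aligned}
\text{Within school } j:\quad
  & y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij} \\
\text{No group level error:}\quad
  & \beta_{0j} = \gamma_{00} + \gamma_{01} x_j, \qquad
    \beta_{1j} = \gamma_{10} + \gamma_{11} x_j \\
\text{Substituting:}\quad
  & y_{ij} = \gamma_{00} + \gamma_{10} x_{ij} + \gamma_{01} x_j
           + \gamma_{11} x_{ij} \cdot x_j + \varepsilon_{ij} \\
\text{With group level error:}\quad
  & \beta_{0j} = \gamma_{00} + \gamma_{01} x_j + u_{0j}, \qquad
    \beta_{1j} = \gamma_{10} + \gamma_{11} x_j + u_{1j} \\
\text{Substituting:}\quad
  & y_{ij} = \gamma_{00} + \gamma_{10} x_{ij} + \gamma_{01} x_j
           + \gamma_{11} x_{ij} \cdot x_j
           + u_{0j} + u_{1j} x_{ij} + \varepsilon_{ij}
\end{aligned}
```

The third line is model (1.4.3) with β0 = γ00, β1 = γ10, β2 = γ01 and β3 = γ11. The last line
differs only by the terms u0j + u1j xij, which are shared by all students in school j; leaving
them out is what forces the observations to be treated as independent.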
Van de Eeden (1988) and others have examined the heterogeneity of relationships problem
using a two-step approach. In Step 1, they estimated the individual level regression models for
each group separately. The assumption of invariance in the intercept and slope coefficients was
tested by running multi-group regression models with and without equality restrictions on the
coefficients across groups, using structural equation modeling software such as LISREL. If the
coefficients showed significant variance across groups, then in Step 2 each of the regression
coefficients was regressed on the contextual variables at the group level.
Although this approach enables analysts to test the significance of variations in the regres-
sion coefficients estimated in Step 1, it has several limitations. OLS models are used at both
Steps 1 and 2, even though it is technically incorrect to use OLS to estimate the standard errors
in the second step (De Leeuw & Kreft, 1986, p. 61). It is also impractical to run a separate
regression for each group when the number of groups is large, and particularly when the number of
observations per group is small. This approach treats the groups as unrelated and ignores the
likelihood that the groups are drawn from a larger population of groups that share common
attributes.
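To make the two-step logic concrete, the sketch below applies it to the fictional values of
Tables 1.3.1 and 1.3.2. It is an illustration of the traditional procedure, not a recommended
analysis: it uses OLS at both steps (the very practice criticized above), it omits the
invariance test, and it treats the three listed individuals as the whole of each group. The
helper `simple_ols` and the string group labels are assumptions introduced for the example.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Group -> (x values, y values), taken from Table 1.3.1.
data_by_group = {
    "1": ([11, 8, 7], [5, 3, 2]),
    "2": ([12, 10, 15], [6, 9, 10]),
    "3": ([15, 18, 20], [11, 15, 16]),
    "J": ([7, 9, 8], [4, 5, 6]),
}
# Group -> contextual variable z, taken from Table 1.3.2.
z_by_group = {"1": 8.7, "2": 12.3, "3": 17.6, "J": 8.0}

# Step 1: a separate individual level regression within each group.
step1 = {g: simple_ols(x, y) for g, (x, y) in data_by_group.items()}

# Step 2: regress the Step 1 intercepts and slopes on the contextual variable.
labels = sorted(step1)
z_values = [z_by_group[g] for g in labels]
intercept_on_z = simple_ols(z_values, [step1[g][0] for g in labels])
slope_on_z = simple_ols(z_values, [step1[g][1] for g in labels])
```

Each Step 1 regression here rests on three observations, which illustrates why the separately
estimated coefficients are unstable when groups are small.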
Given the shortcomings of traditional methods, a new statistical method, called multilevel
modeling, is needed. Multilevel models are explicitly designed to analyze hierarchically
structured data, modeling variables at both micro and macro levels simultaneously without
aggregation or disaggregation. In the following section we will discuss the advantages as well as
the limitations of multilevel models for multilevel data analysis.
The problems encountered in traditional multilevel data analysis can be readily solved by
multilevel modeling. First, the Robinson effect is avoided because multilevel models simultaneously
analyze data obtained from both the individual level and the context/group level. The
ness is likely to be the case for some groups. Multilevel modeling is an effective method for
analysis of such data.
Finally, multilevel models can be readily extended to study growth or change trajectories of
outcome measures over time using longitudinal data. In this case the multilevel model becomes
a growth model (GM) which is clearly superior to traditional “repeated measure” analysis meth-
ods (Raudenbush & Bryk, 2002). The GM examines not only intra-individual changes over time,
but also inter-individual variations in these changes. Other benefits of GM include: 1) GM
requires neither balanced data (i.e., equal numbers of cases at each time point) nor equal intervals
between time points. 2) Missing values caused by attrition are allowed under the assumption of
“missing at random” (MAR). 3) Time-varying covariates can be readily included. 4) The
association between the rate of outcome change and the initial level of the outcome measure
can be assessed. 5) GM can be readily expanded to a model with three or more levels by including
more layers or higher level units to which individuals belong. We will provide a detailed
discussion of the GM in Chapter 4.
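For orientation only, a linear growth model of the kind described above can be written as a
two-level model with measurement occasions nested within individuals. The notation (π, β, r, τ
and the time variable T) is an assumption following common usage in the multilevel literature;
the formulation in Chapter 4 may differ.

```latex
\begin{aligned}
\text{Level 1 (occasion } t \text{ within individual } i\text{):}\quad
  & y_{ti} = \pi_{0i} + \pi_{1i}\, T_{ti} + e_{ti} \\
\text{Level 2 (individual } i\text{):}\quad
  & \pi_{0i} = \beta_{00} + r_{0i}, \qquad \pi_{1i} = \beta_{10} + r_{1i} \\
\text{Random effects:}\quad
  & \operatorname{Var}(r_{0i}) = \tau_{00}, \quad
    \operatorname{Var}(r_{1i}) = \tau_{11}, \quad
    \operatorname{Cov}(r_{0i}, r_{1i}) = \tau_{01}
\end{aligned}
```

Because the time variable carries both the occasion and the individual subscript, individuals may
differ in the number and spacing of their measurements. If time is coded as zero at the first
occasion, the covariance term describes the association between initial level and rate of change.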
Of course, no statistical method is a panacea for data analysis in social science studies. Limi-
tations of multilevel models include, but are not limited to:
1) Multilevel models are more complex than ordinary regression models because outcome
variation at both micro and macro levels is modeled simultaneously. As a result, the number of
model parameters can become large so that the model may become less parsimonious.
2) Very often the higher level units or groups are selected on the basis of convenience rather
than being randomly sampled from a well-defined population. In these cases it is potentially
incorrect to generalize model parameter estimates to other groups.
3) Very often, contextual variables are measured by aggregating individual data within
groups in the sample rather than the groups in the target population. If the number of observa-
tions in a group is not large enough, then the aggregated composition measures of the group
may be biased and result in misleading group information.
4) Individuals are mobile, and it is reasonable to expect that members of a group may not
have entered the group at the same time. Length of exposure to group influences may have sys-
tematic effects upon individuals. One possible solution to this problem is to control for duration
of group membership in data analysis. Unfortunately, information on membership duration is
frequently unavailable. We are then left with no alternative but to assume that everyone in a
group is affected by the group in the same way and to the same degree.
5) As in traditional regression analysis, all explanatory variables are typically assumed to
have no measurement error. In practice, measurements often contain error and vary from one
measurement occasion to the next. For example, measurements of a person’s blood pressure
are likely to vary, even under the same conditions. Some of this variation is simply random
measurement error. Structural equation models (SEM) are designed to assess underlying
conceptual relationships while controlling for measurement error (Jöreskog, 1971, 1977; Jöreskog
& Sörbom, 1979; Bentler, 1983; Bollen, 1989). However, the multilevel models we introduce in this
book assume that all explanatory variables are measured without error. Readers who are
interested in multilevel structural equation modeling, which deals with multilevel data and
measurement errors simultaneously, are referred to Muthén & Muthén (1998–2010).
6) In multilevel modeling, researchers often encounter data with a relatively small number
of higher level units or groups. As a result of this and/or non-normality of the residuals, model
parameter estimates, particularly the variance components and standard errors of parameter
estimates at the group level, may be biased (Bussing, 1993; Van der Leeden & Bussing, 1994;
Van der Leeden et al., 1997; Hox, 1998). We will discuss how to deal with this issue in Chapter 6.
7) The multilevel data considered in this book are assumed to be completely nested. That is,
each individual belongs to only one group. If individuals are nested within more than one group,
then mixed models with crossed random effects should be applied (Raudenbush, 1993). This topic
is beyond the scope of this book.
It was technically impossible to fit multilevel models until the early 1980s when Dr. William
Mason and colleagues at the University of Michigan Population Studies Center developed the
GENMOD computer program. This software was developed to run in a DOS environment and
was unfortunately never upgraded to MS Windows. However, a growing number of computer
programs have been developed to fit multilevel models in the past two decades. In recent years
the major statistical software companies, such as Statistical Analysis System (SAS) and Statistical
Package for Social Sciences (SPSS), have incorporated procedures or modules for multilevel
modeling. Each of the available programs has strengths and weaknesses. The choice of
computer software is a matter of personal preference. The major computer software currently
available for multilevel modeling includes, but is not limited to:
• HLM: This is the first commercial computer software designed for multilevel modeling.
This user-friendly program was initially developed in the mid-1980s and has been in active
development ever since. HLM was a leading package during the development of multilevel
modeling in the 1990s, and has been widely used since. The program is developed by Drs. S.
W. Raudenbush and A. S. Bryk and distributed by Scientific Software International (SSI)
in the U.S.A. (www.ssicentral.com).
• MLwiN: This is a popular special-purpose computer program for multilevel modeling developed
by Dr. Harvey Goldstein and his colleagues in the Centre for Multilevel Modelling and
other institutes in the UK. MLwiN was first released in 1997 and Version 2 was released in beta
form in 2003 (www.mlwin.com). The program provides a system for the specification and
analysis of a wide range of multilevel models. A graphical user interface (GUI) is available
for model specification, along with plotting, diagnostic and data manipulation tools.
• Mixed-Up Suite and SuperMix: This is a family of standalone programs including MIXREG,
MIXOR, MIXNO, MIXPREG, and MIXGSUR for multilevel modeling for continuous, bi-
nary, ordinal, nominal, count, or survival outcome measures. The software was developed by
Drs. Donald Hedeker and Robert D. Gibbons of the University of Illinois at Chicago,
U.S.A., and Version 1.0 was released by SSI in 2008 (www.ssicentral.com/supermix).
• aML: This computer software, developed by American economists Lee Lillard and Stan Panis,
became commercially available in 2000. The software offers a wide range of models for
multilevel data analysis. It extends multilevel modeling to fitting econometric models such as
simultaneous equation models with multilevel data. aML is a product of EconWare, a California
corporation, U.S.A. Full details on ordering aML can be found at www.applied-ml.com.
• EGRET: This computer software is designed for analyzing biomedical and epidemiological data.
It has been widely used by epidemiologists and biostatisticians for fitting generalized linear
models with and without random effects, as well as survival models. The software was originally
developed in 1999 at the School of Public Health of the University of Washington, U.S.A. The
current Windows version is developed by CYTEL Software Corporation of Cambridge,
U.S.A. (www.cytel.com/products/egret).
• LISREL: This is the earliest and still very popular statistical package for structural equation
modeling (SEM)². LISREL began implementing multilevel modeling in 1999, with version 8.30.
Its MULTILEV module fits multilevel linear and nonlinear models to multilevel data from simple
random and complex survey designs. It allows for models with continuous and categorical
response variables. LISREL is distributed by Scientific Software International (SSI) in
the U.S.A. (www.ssicentral.com).
² Structural equation modeling is a more generalized statistical analysis approach. It provides a powerful analytic
framework that allows for the simultaneous estimation of the relations between a set of observed variables and a smaller set
of underlying latent constructs, as well as the relations among the latent constructs (Bentler, 1980, 1983; Jöreskog, 1971a,
1971b; Jöreskog & Sörbom, 1979; Bollen, 1989).
• Mplus: Initially released in 1998, Mplus provides a generalized modeling framework for
structural equation modeling with continuous and categorical observed variables, as well as
continuous and categorical latent variables. Mplus Version 3 and subsequent versions include a
multilevel extension of the full modeling framework. Multilevel structural equation models
can be readily run in Mplus. Detailed information about Mplus is available at www.statmodel.com.
• STATA: This general-purpose statistical program has been used increasingly in recent years
(www.stata.com). STATA provides a broad statistical base for data analysis. Its treatment of
generalized linear mixed models supports multilevel modeling of continuous, binary, and count
outcomes, as well as crossed random effects models.
• SPSS: Statistical Package for Social Sciences (SPSS), another major statistical package, has
supported multilevel modeling with the Linear Mixed Models procedure in the Advanced Models
module since version 11.5. With SPSS, most commands are available either through the
graphical user interface or through the use of command syntax. SPSS is distributed by
IBM, which acquired SPSS Inc. in 2009 (www.spss.com).
• SAS: The internationally recognized Statistical Analysis System (SAS or SAS/STAT) has
provided several procedures adaptable to multilevel modeling since Release 6.0. PROC
MIXED has been improved and is now widely used for multilevel modeling of continuous
outcome measures. Two other procedures, PROC GLIMMIX and PROC NLMIXED, are
designed to fit multilevel models for categorical outcome measures and count data (e.g.,
multilevel logit, probit, ordered logit, multinomial logit, Poisson, and ZIP models).
Information about obtaining SAS is available at www.sas.com.
• A variety of other statistical packages, such as LIMDEP (www.limdep.com), GenStat
([email protected]), SYSTAT (www.systat.com), S-Plus (www.insightful.com),
WinBUGS (www.mrc-bsu.cam.ac.uk/bugs), and R (cran.r-project.org), also have functions
for conducting multilevel modeling.
Throughout this book, we use SAS, specifically PROC MIXED, PROC NLMIXED, and
PROC GLIMMIX, for the analysis of continuous and categorical data in multilevel models.
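To give a first impression of how these procedures are called, the following is a minimal sketch of a two-level random-intercept model for a continuous outcome (PROC MIXED) and for a binary outcome (PROC GLIMMIX). The data set name SCORES and the variables school, ses, math, and pass are hypothetical and serve only as an illustration; they are not taken from the examples analyzed in this book, and the options shown are one reasonable choice among several.

```sas
/* Hypothetical data set SCORES: one row per student (level 1),   */
/* nested within schools (level 2). All names are assumptions.    */
/*   school = school identifier                                   */
/*   ses    = student-level predictor                             */
/*   math   = continuous outcome                                  */
/*   pass   = binary (0/1) outcome                                */

/* Random-intercept linear model for a continuous outcome */
proc mixed data=scores method=reml covtest;
   class school;
   model math = ses / solution ddfm=satterthwaite;
   random intercept / subject=school;   /* school-level variance */
run;

/* Random-intercept logistic model for a binary outcome */
proc glimmix data=scores method=laplace;
   class school;
   model pass(event='1') = ses / dist=binary link=logit solution;
   random intercept / subject=school;
run;
```

In both calls the RANDOM statement with SUBJECT=school is what makes the model multilevel: it requests a separate intercept for each school and estimates the between-school variance of those intercepts.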