HLM7 PDF
Contents
1 Conceptual and Statistical Background for Two-Level Models
1.1 The general two-level model
1.1.1 Level-1 model
1.1.2 Level-2 model
1.2 Parameter estimation
1.3 Empirical Bayes ("EB") estimates of randomly varying level-1 coefficients, $\beta_{qj}$
1.4 Generalized least squares (GLS) estimates of the level-2 coefficients, $\gamma_{qs}$
1.5 Maximum likelihood estimates of variance and covariance components
1.6 Some other useful statistics
1.7 Hypothesis testing
1.8 Restricted versus full maximum likelihood
1.9 Generalized Estimating Equations
3 Conceptual and Statistical Background for Three-Level Models
3.1 The general three-level model
3.1.1 Level-1 model
3.1.2 Level-2 model
3.1.3 Level-3 model
3.1 Parameter estimation
3.2 Hypothesis testing
7.2.4 Level-2 and Level-3 and Level-4 models
7.3 The model for count data
7.3.1 Level-1 sampling model
7.3.2 Level-1 link function
7.3.3 Level-1 structural model
7.3.4 Level-2 model
7.4 The model for multinomial data
7.4.1 Level-1 sampling model
7.4.2 Level-1 link function
7.4.3 Level-1 structural model
7.4.4 Level-2 model
7.5 The model for ordinal data
7.5.1 Level-1 sampling model
7.5.2 Level-1 structural model
7.6 Parameter estimation
7.6.1 Estimation via PQL
7.6.2 Properties of the estimators
7.6.3 Parameter estimation: A high-order Laplace and adaptive Gaussian Quadrature approximation of maximum likelihood
7.7 Unit-specific and population-average models
7.8 Over-dispersion and under-dispersion
7.9 Restricted versus full PQL versus full ML
7.10 Hypothesis testing
8 Fitting HGLMs (Nonlinear Models)
8.1 Executing nonlinear analyses based on the MDM file
8.2 Case 1: a Bernoulli model
8.3 Case 2: a binomial model (number of trials, $m_{ij} \geq 1$)
8.4 Case 3: Poisson model with equal exposure
8.5 Case 4: Poisson model with variable exposure
8.6 Case 5: Multinomial model
8.7 Case 6: Ordinal model
8.8 Additional features
8.8.1 Over-dispersion
8.8.2 Adaptive Gauss-Hermite Quadrature and Laplace approximations for binary models
8.8.3 Printing variance-covariance matrices for fixed effects
8.9 Fitting HGLMs with three and four levels
9.5.3 Level-3 model
9.5.4 Level-2 model
11 Special Features
11.1 Latent variable analysis
11.1.1 A latent variable analysis using HMLM: Example 1
11.1.2 A latent variable analysis using HMLM: Example 2
11.2 Applying HLM to multiply-imputed data
11.2.1 Data with multiply-imputed values for the outcome or one covariate
11.2.2 Calculations performed
11.2.3 Working with plausible values in HLM
11.2.4 Data with multiply-imputed values for the outcome and covariates
11.3 "V-Known" models for HLM2
11.3.1 Data input format
11.3.2 Creating the MDM file
11.3.3 Estimating a V-known model
11.3.4 V-known analyses where Q = 1
11.3 Spatial dependence models for HLM2
11.3.5 A spatial analysis using HLM2
11.3.6 Other outcome variables
12 Conceptual and Statistical Background for Cross-classified Random Effect Models (HCM2)
12.1 The general cross-classified random effects models
12.1.1 Level-1 or "within-cell" model
12.1.2 Level-2 or "between-cell" model
12.2 Parameter estimation
12.3 Hypothesis testing
A.1.1 Example: constructing an MDM file for the HS&B data using SPSS file input
A.1.2 Example: constructing an MDM file for the HS&B data using ASCII file input
A.2 Rules for format statements
A.2.1 Example: Executing an analysis using HSB.MDM
A.3 Using HLM in batch and/or interactive mode
A.4 Using HLM2 in batch mode
A.5 Printing of variance and covariance matrices for fixed effects and level-2 variances
A.6 Preliminary exploratory analysis with HLM2
I.2 Example: Creating an HLMHCM HLM file and running the model
1 Conceptual and Statistical Background for Two-Level Models
Behavioral and social data commonly have a nested structure. For example, if repeated observations
are collected on a set of individuals and the measurement occasions are not identical for all persons,
the multiple observations are properly conceived as nested within persons. Each person might also
be nested within some organizational unit such as a school or workplace. These organizational units
may in turn be nested within a geographical location such as a community, state, or country. Within
the hierarchical linear model, each of the levels in the data structure (e.g., repeated observations
within persons, persons within communities, communities within states) is formally represented by
its own sub-model. Each sub-model represents the structural relations occurring at that level and the
residual variability at that level.
This manual describes the use of the HLM computer programs for the statistical modeling of two-,
three-, and four-level data structures. It should be used in conjunction with the text
Hierarchical Linear Models: Applications and Data Analysis Methods (Raudenbush, S.W. & Bryk,
A.S., 2002: Newbury Park, CA: Sage Publications)¹. The HLM programs have been tailored so that
the basic program structure, input specification, and output of results closely coordinate with this
textbook. This manual also cross-references the appropriate sections of the textbook for the reader
interested in a full discussion of the details of parameter estimation and hypothesis testing. Many of
the illustrative examples described in this manual are based on data distributed with the program and
analyzed in the Sage text.
We begin by discussing the two-level model below and the use of the HLM2 program in Chapter 2.
Building on this framework, Chapters 3 and 4 introduce the three-level model and the use of the
HLM3 program. The four-level model and the use of the HLM4 program are discussed in Chapters 5
and 6. Chapters 7 and 8 discuss the use of hierarchical modeling for non-normal level-1 errors. Chapters
9 and 10 consider multivariate models that can be estimated from incomplete data. Chapter 11
describes several special features of HLM2 and HLM3, including analyses involving latent variables,
multiply-imputed data, and known level-1 variances, as well as the procedure for graphing data and
equations. Chapters 12 and 13 introduce two-level cross-classified random effects models, which are
applicable to data that do not have a strictly hierarchical structure, and Chapters 14 and 15
discuss three-level cross-classified random effects models. Hierarchical linear models with
cross-classified random effects are considered in Chapters 16 and 17. Finally, Chapter 18 illustrates
HLM's ability to produce data- and model-based graphs.
1.1 The general two-level model
As the name implies, a two-level model consists of two submodels, one at level 1 and one at level 2.
For example, if the research problem consists of data on students nested within schools, the level-1
model would represent the relationships among the student-level variables and the level-2 model
would capture the influence of school-level factors. Formally, there are $i = 1, \ldots, n_j$ level-1 units
(e.g., students) nested within $j = 1, \ldots, J$ level-2 units (e.g., schools).
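The indexing above can be made concrete with a small sketch. It is not part of HLM: the school IDs and scores are made-up values, and `group_by_unit` is a hypothetical helper used only to show how J and the $n_j$ arise from nested records.

```python
from collections import defaultdict

# Hypothetical level-1 records: (level-2 unit ID, outcome). Values are invented.
records = [
    ("1224", 5.9), ("1224", 19.7), ("1224", 20.3),
    ("1288", 12.5), ("1288", 8.1),
    ("1296", 14.2),
]


def group_by_unit(rows):
    """Collect level-1 outcomes under their level-2 unit ID."""
    groups = defaultdict(list)
    for unit_id, y in rows:
        groups[unit_id].append(y)
    return dict(groups)


groups = group_by_unit(records)
J = len(groups)                                        # number of level-2 units
n_j = {unit: len(ys) for unit, ys in groups.items()}   # level-1 units per level-2 unit
print(J, n_j)
```

Here J counts the schools and each entry of `n_j` counts the students observed in one school.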
¹ Also available from SSI.
1.1.1 Level-1 model
We represent in the level-1 model the outcome for case i within unit j as:
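\[
\begin{aligned}
\beta_{qj} &= \gamma_{q0} + \gamma_{q1} W_{1j} + \gamma_{q2} W_{2j} + \cdots + \gamma_{qS_q} W_{S_q j} + u_{qj} \\
           &= \gamma_{q0} + \sum_{s=1}^{S_q} \gamma_{qs} W_{sj} + u_{qj},
\end{aligned}
\tag{1.2}
\]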
where
$\gamma_{qs}$ ($s = 0, 1, \ldots, S_q$) are level-2 coefficients;
$W_{sj}$ is a level-2 predictor; and
$u_{qj}$ is a level-2 random effect.
We assume that, for each unit j, the vector $(u_{0j}, u_{1j}, \ldots, u_{Qj})'$ is distributed as multivariate normal,
with each element $u_{qj}$ having a mean of zero and variance of
\[
\operatorname{Var}(u_{qj}) = \tau_{qq}. \tag{1.3}
\]
For any pair of random effects q and q′,
\[
\operatorname{Cov}(u_{qj}, u_{q'j}) = \tau_{qq'}. \tag{1.4}
\]
These level-2 variance and covariance components can be collected into a dispersion matrix, T,
whose maximum dimension is $(Q + 1) \times (Q + 1)$.
We note that each level-1 coefficient can be modeled at level-2 as one of three general forms:
1. a fixed level-1 coefficient; e.g.,
\[
\beta_{qj} = \gamma_{q0}, \tag{1.5}
\]
\[
\beta_{qj} = \gamma_{q0} + u_{qj} \tag{1.7}
\]
or a level-1 coefficient with both non-random and random sources of variation,
\[
\beta_{qj} = \gamma_{q0} + \sum_{s=1}^{S_q} \gamma_{qs} W_{sj} + u_{qj}. \tag{1.8}
\]
The actual dimension of T in any application depends on the number of level-2 coefficients
specified as randomly varying. We also note that a different set of level-2 predictors may be used in
each of the Q + 1 equations of the level-2 model.
These estimates of the level-1 coefficients for each unit j are optimal composites of an estimate
based on the data from that unit and an estimate based on data from other similar units. Intuitively,
we are borrowing strength from all of the information present in the ensemble of data to improve the
level-1 coefficient estimates for each of the J units. These "EB" estimates are also referred to as
"shrunken estimates" of the level-1 coefficients. They are produced by HLM as part of the residual
file output (see Section 2.5.4, Model checking based on the residual file). (For further discussion see
Hierarchical Linear Models, pp. 45-51; 85-95.)
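The idea of an "optimal composite" is easiest to see in the simplest case, a random-intercept model with no predictors. The sketch below assumes the standard textbook form, in which the unit mean is weighted by $\lambda_j = \tau_{00} / (\tau_{00} + \sigma^2 / n_j)$ and the grand mean by $1 - \lambda_j$. The function name and all numeric values are illustrative assumptions, and the variance components are treated as known; this is not HLM's internal routine.

```python
def eb_intercept(ybar_j, n_j, gamma00, tau00, sigma2):
    """Empirical Bayes ("shrunken") estimate of a random intercept.

    ybar_j  : sample mean of the outcome in unit j
    n_j     : number of level-1 cases in unit j
    gamma00 : estimated grand mean
    tau00   : level-2 variance of the intercept (assumed known here)
    sigma2  : level-1 residual variance (assumed known here)
    """
    weight = tau00 / (tau00 + sigma2 / n_j)   # weight on the unit's own data
    return weight * ybar_j + (1.0 - weight) * gamma00


# Two hypothetical units with the same sample mean but different sizes.
small = eb_intercept(ybar_j=18.0, n_j=5, gamma00=12.0, tau00=8.0, sigma2=40.0)
large = eb_intercept(ybar_j=18.0, n_j=50, gamma00=12.0, tau00=8.0, sigma2=40.0)
print(small, large)
```

The small unit is pulled halfway toward the grand mean, while the large unit stays close to its own sample mean, which is the "borrowing strength" behavior described above.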
1.4 Generalized least squares (GLS) estimates of the level-2 coefficients, $\gamma_{qs}$
Substitution of the level-2 equations for $\beta_{qj}$ into their corresponding level-1 terms yields a
single-equation linear model with a complex error structure. Proper estimation of the regression
coefficients of this model (i.e., the $\gamma$'s) requires that we take into account the differential precision
of the information provided by each of the J units. This is accomplished through generalized least
squares. (For further discussion see Hierarchical Linear Models, pp. 38-44.)
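As a minimal illustration of "differential precision", consider again an intercept-only model with known variance components. Under that assumption the GLS estimate of the grand mean is a precision-weighted average of the unit means, with weights $1 / (\tau_{00} + \sigma^2 / n_j)$. The function and the numbers below are hypothetical and only sketch the idea; HLM handles the general multi-coefficient case.

```python
import math


def gls_grand_mean(unit_means, unit_sizes, tau00, sigma2):
    """Precision-weighted (GLS) estimate of the grand mean and its standard error."""
    # Each unit's precision is the inverse of the variance of its sample mean.
    weights = [1.0 / (tau00 + sigma2 / n) for n in unit_sizes]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, unit_means)) / total
    std_error = math.sqrt(1.0 / total)
    return estimate, std_error


# Two hypothetical units: a small one (n = 5) and a large one (n = 50).
estimate, std_error = gls_grand_mean([10.0, 20.0], [5, 50], tau00=8.0, sigma2=40.0)
print(estimate, std_error)
```

The unweighted average of the two unit means is 15, but the estimate lands closer to the mean of the larger, more precisely measured unit.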
Based on the various parameter estimates discussed above, HLM2 and HLM3 also compute a number
of other useful statistics. These include:
1. Reliability of $\hat{\beta}_{qj}$.
The program computes an overall or average reliability for the least squares estimates of each
level-1 coefficient across the set of J level-2 units. These are denoted in the program output as
RELIABILITY ESTIMATES and are calculated according to Equation 3.58 in Hierarchical Linear
Models, p. 49.
\[
\hat{u}_{qj} = \hat{\beta}_{qj} - \Bigl( \hat{\gamma}_{q0} + \sum_{s=1}^{S_q} \hat{\gamma}_{qs} W_{sj} \Bigr). \tag{1.9}
\]
These ordinary least squares residuals are denoted in HLM residual files by the prefix OL before the
corresponding variable names.
3. Empirical Bayes residuals ($u^*_{qj}$)
These residuals are based on the deviation of the empirical Bayes estimates, $\beta^*_{qj}$, of a randomly
varying level-1 coefficient from its predicted or "fitted" value based on the level-2 model, i.e.,
\[
u^*_{qj} = \beta^*_{qj} - \Bigl( \hat{\gamma}_{q0} + \sum_{s=1}^{S_q} \hat{\gamma}_{qs} W_{sj} \Bigr). \tag{1.10}
\]
These are denoted in the HLM residual files by the prefix EB before the corresponding variable
names. (For a further discussion and illustration of OL and EB residuals see Hierarchical Linear
Models, pp. 47-48; and 76-95).
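The two kinds of residual can be compared with a short sketch for a random intercept that has a single level-2 predictor W. It assumes the simple case in which the EB estimate is a weighted average of the least squares estimate and the fitted value, with weight $\lambda_j = \tau_{00} / (\tau_{00} + \sigma^2 / n_j)$. The function name and all values are hypothetical.

```python
def ol_and_eb_residuals(beta_hat, w, gamma0, gamma1, n_j, tau00, sigma2):
    """Return (fitted value, OL residual, EB residual) for one level-2 unit."""
    fitted = gamma0 + gamma1 * w                       # level-2 prediction
    ol_resid = beta_hat - fitted                       # least squares estimate minus fitted value
    weight = tau00 / (tau00 + sigma2 / n_j)            # weight on the unit's own estimate
    eb_coef = weight * beta_hat + (1.0 - weight) * fitted
    eb_resid = eb_coef - fitted                        # EB estimate minus fitted value
    return fitted, ol_resid, eb_resid


fitted, ol_resid, eb_resid = ol_and_eb_residuals(
    beta_hat=18.0, w=1.0, gamma0=12.0, gamma1=2.0, n_j=5, tau00=8.0, sigma2=40.0
)
print(fitted, ol_resid, eb_resid)
```

In this simple case the EB residual is the OL residual multiplied by the weight, so it is never larger in magnitude than the OL residual.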
Table 1.1 Hypothesis tests for the level-2 fixed effects and the variance-covariance
components (continued)
Version 7 of HLM produces the following tables, often useful for comparative purposes:
standard errors that are consistent even when the OLS assumptions are incorrect.
• A table of HLM estimates of $\gamma_{qs}$, based on GLS, and standard errors based on the assumptions
underlying HLM.
• A table of the same HLM estimates, but now accompanied by robust standard errors, that is,
standard errors that are consistent even when the HLM assumptions are mistaken.
By comparing these four tables, it is possible a) to discern how different the HLM estimates and
standard errors are from those based on OLS; and b) to discern whether the HLM inferences are
plausibly distorted by incorrect assumptions about the distribution of the random effects at each
level. We illustrate the value of these comparisons in Chapter 2 (for further discussion, see
Hierarchical Linear Models, pp. 276-280). The GEE approach is very useful for strengthening
inferences about the fixed level-2 coefficients but does not provide a basis for inferences about the
random, level-1 coefficients or the variance-covariance components. Cheong, Fotiu, and
Raudenbush (2001) have intensively studied the properties of HLM and GEE estimators in the context
of three-level models. GEE results are also available for three-level data.
2 Working with HLM2
Data analysis by means of the HLM2 program will typically involve three stages:
1. construction of the "MDM file" (the multivariate data matrix);
2. execution of analyses based on the MDM file; and
3. evaluation of fitted models based on a residual file.
We describe each stage below and then illustrate a number of special options. Data collected from a
High School & Beyond (HS&B) survey on 7,185 students nested within 160 US high schools, as
described in Chapter 4 of Hierarchical Linear Models, will be used for demonstrations.
2.1 Constructing the MDM file from raw data
We assume that a user has employed a standard computing package to clean the data, make
necessary transformations, and conduct relevant exploratory and descriptive analyses. We also
recommend exploratory graphical analyses within HLM prior to model building as described in detail
in Section 18.1 of this manual.
The first task in using HLM2 is to construct the Multivariate Data Matrix (MDM) from raw data or
from a statistical package. We generally work with two raw data files: a level-1 file and a level-2
file. Both files must be sorted by the level-2 ID. (It is possible, however, to build the MDM file from
the level-1 file alone, though this option is not suggested when the level-1 file is very large. The
level-1 file must be sorted by level-2 ID. The level-1 file name will be selected as both the level-1
and level-2 file.)
For the HS&B example, the level-1 units are students and the level-2 units are schools. The two files
are linked by a common level-2 unit ID, school id in our example, which must appear on every level-1
record. In constructing the MDM file, the HLM program will compute summary statistics based on the
level-1 unit data and store these statistics together with level-2 data.
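Before building the MDM file it can help to verify the sorting and linking requirements outside HLM. The sketch below is one possible pre-check in Python; the records are invented, and `is_grouped` is a hypothetical helper rather than anything provided by HLM.

```python
from itertools import groupby


def is_grouped(ids):
    """True when all records sharing an ID sit next to each other."""
    runs = [key for key, _ in groupby(ids)]   # one entry per contiguous run
    return len(runs) == len(set(runs))        # an ID split across runs appears twice


# Hypothetical level-1 (student) and level-2 (school) records keyed by school ID.
level1 = [("1288", 8.1), ("1224", 5.9), ("1288", 12.5), ("1224", 19.7)]
level2 = [("1288", 1), ("1224", 0)]

grouped_before = is_grouped([row[0] for row in level1])

# Sort both files by the level-2 ID (character IDs sort as text).
level1_sorted = sorted(level1, key=lambda row: row[0])
level2_sorted = sorted(level2, key=lambda row: row[0])
grouped_after = is_grouped([row[0] for row in level1_sorted])

# Every level-1 record should link to a level-2 record.
unmatched = {row[0] for row in level1_sorted} - {row[0] for row in level2_sorted}
print(grouped_before, grouped_after, unmatched)
```

A non-empty `unmatched` set would point to level-1 records whose ID has no counterpart in the level-2 file.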
The procedure to create an MDM file consists of three major steps. The user needs to
Once the MDM file is constructed, all subsequent analyses will be computed using the MDM file as
input. It will therefore be unnecessary to read the larger student-level data file in computing these
analyses. The efficient summary of data in the MDM file leads to faster computation. The MDM file is
like a "system file" in a standard computing package in that it contains not only the summarized data
but also the names of all of the variables.
• Specifying the level-1 model, which defines a set of level-1 coefficients to be computed for
each level-2 unit.
• Specifying a level-2 structural model to predict each of the level-1 coefficients.
• Specifying the level-1 coefficients to be viewed as random or non-random.
• Ordinary least squares and generalized least squares results for the fixed coefficients defined
in the level-2 model.
• Estimates of variance and covariance components and approximate chi-square tests for the
variance components.
• A variety of auxiliary diagnostic statistics.
These questions and others can be addressed by means of analyses of the HLM residual files. A
level-1 residual file includes:
• The level-1 residuals (discrepancies between the observed and fitted values).
• Fitted values (FV) for each level-1 unit (that is, values predicted on the basis of the model).
• The observed values of all predictors included in the model.
• Selected level-2 predictors useful in exploring possible relationships between such predictors
and level-1 residuals.
• Fitted values for each level-1 coefficient (that is, values predicted on the basis of the level-2
model).
• Ordinary least squares (OL) and empirical Bayes (EB) estimates of level-2 residuals
(discrepancies between level-1 coefficients and fitted values).
• Empirical Bayes coefficients, which are the sum of the EB estimates and the fitted values.
• Dispersion estimates useful in exploring sources of variance heterogeneity at level 1.
• Expected and observed Mahalanobis distance measures useful in assessing the multivariate
normality assumption for the level-2 residuals.
• Selected level-2 predictors useful in exploring possible relationships between such predictors
and level-2 residuals.
• Posterior variances (PV).
For HLM2 FML analyses, there is an additional set of posterior variances. See Chapter 9 in
Hierarchical Linear Models for a full discussion of these methods.
2.4 Windows, interactive, and batch execution
Formulation and testing of models using HLM programs can be achieved via Windows, interactive,
or batch modes. Most PC users will find the Windows mode preferable. This draws on the visual
features of Windows while preserving the speed of use associated with a command-oriented (batch)
program. Non-PC users have the choice of interactive and batch modes only. Interactive execution
guides the user through the steps of the analysis by posing questions and providing a menu of
options. In this chapter, we employ the Windows mode for all the examples. Descriptions and
examples on how to use HLM2 in interactive and batch modes are given in Appendix A.
In order for the program(s) to correctly read the data, the IDs need to conform to the following rules:
1. For ASCII data the ID variables must be read in as character (alphanumeric). These IDs are
indicated by the A field(s) in the format statement. For all other types of data, the ID may be
character or numeric.
2. The level-1 cases must be grouped together by their respective level-2 unit ID. To assure
this, sort the level-1 file by the level-2 ID field prior to entering the data into HLM2.
example, imagine your data has IDs ranging from "1" to "100". You will need to recreate the
IDs as "001" to "100". In other words, all spaces (blank characters) should be coded as zeros.
5. For non-ASCII files, the program can only properly deal with numeric variables (with the
exception of character ID variables). Other data types, such as a "Date format", will not be
processed properly.
6. For non-ASCII files with missing data, one should only use the "standard" missing value
code. Some statistical packages (SAS, for example) allow for a number of missing value
codes. The HLM modules are incapable of understanding these correctly, thus these
additional missing codes need to be recoded to the more common "." (period) code.
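The recoding of IDs described above ("1" becoming "001") can be done in whatever package prepares the data. As one possible approach, the Python sketch below pads every ID with leading zeros to the width of the longest ID; `pad_ids` is a hypothetical helper name.

```python
def pad_ids(ids):
    """Left-pad character IDs with zeros so that all have the same width."""
    cleaned = [str(value).strip() for value in ids]   # drop surrounding blanks
    width = max(len(text) for text in cleaned)
    return [text.zfill(width) for text in cleaned]


padded = pad_ids(["1", " 27", "100"])
print(padded)   # ['001', '027', '100']
```

After padding, sorting the IDs as text gives the same order as sorting them as numbers.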
We first illustrate the use of SPSS file input and then consider input from ASCII data files. Data input
requires a level-1 file and a level-2 file.
Level-1 file. For our HS&B example data, the level-1 file (HSB1.SAV) has 7,185 cases and four
variables (not including the SCHOOL ID). The variables are:
Data for the first ten cases in HSB1.SAV are shown in Fig. 2.1.
Note: level-1 cases must be grouped together by their respective level-2 unit ID. To assure this, sort
the level-1 file by the level-2 unit ID field prior to entering the data into HLM2.
The data for the first ten schools are displayed in Fig. 2.2.
Figure 2.2 First ten cases in HSB2.SAV
As mentioned earlier, the construction of an MDM file consists of three major steps. This will now be
illustrated with the HS&B example.
Figure 2.4 Select MDM type dialog box
To supply HLM with appropriate information for the data, the command, and the MDM files:
1. Select SPSS/Windows from the Input File Type pull-down menu (see Figure 2.5).
2. Specify the structure of data. The three choices are cross-sectional, longitudinal, and
measures within groups. The data in HSB1.SAV are cross-sectional.
3. Click Browse in the Level-1 Specification section to open an Open Data File dialog
box.
4. Open a level-1 SPSS system file in the HLM folder (HSB1.SAV in our example). The
Choose Variables button will be activated.
5. Click Choose Variables to open the Choose Variables - HLM2 dialog box and choose
the ID and variables by clicking the appropriate check boxes (See Figure 2.6). To
deselect, click the box again.
6. Select the options for missing data in the level-1 file (there is no missing data in
HSB1.SAV; see Section 2.6 for details).
7. Click the selection button for measures within persons for the type of nesting of
input data if the level-1 data consist of repeated measures or item responses. With this
selection, WHLM will use in its displays and output model notations that match those
used in Hierarchical Linear Models for studies on individual change and latent
variables (Chapters 6 and 11). The default type is persons within groups. It is
generally used when the level-1 data are comprised of cross-sectional measures. With
this option, WHLM will use model notations that correspond to those used for
applications in organization research (Chapters 4 and 5).
8. Click Browse in the Level-2 specification section to open an Open Data File dialog
box.
9. Open a level-2 SPSS system file in the HLM folder (HSB2.SAV in our example). The
Choose Variables button below Browse will be activated.
10. Click Choose Variables to open the Choose Variables - HLM2 dialog box and choose
the ID and variables by clicking the appropriate check boxes (see Figure 2.7).
11. Check the box include spatial dependence matrix to specify spatial dependence, if
applicable (see Section 11.4 for details). The Spatial Dependence Specification box
should only be used if you have spatial dependence data and wish to run this kind of
model.
12. Enter a name for the MDM file in the MDM file name box (for example, HSB.MDM).
Figure 2.6 Choose Variables - HLM2 dialog box for the level-1 file, HSB1.SAV
Figure 2.7 Choose variables - HLM2 dialog box for the level-2 file, HSB2.SAV
13. Click Save mdmt file in the MDM template file section to open a Save MDM template
file dialog box. Enter a name for the MDMT file (for example, HSBSPSS.MDMT). Click
Save to save the file. The command file saves all the input information entered by the
user. It can be re-opened by clicking the Open mdmt file button (see Figure 2.5). To
make changes to an existing MDMT file, click the Edit mdmt file button.
14. Note that HLM will also save the input information into another file called
CREATMDM.MDMT when the MDM is created.
15. Click the Make MDM button. A screen displaying the prompts and responses for MDM
creation will appear.
1. When the screen disappears, the level-1 and level-2 descriptive statistics will automatically
be displayed (see Figure 2.8). Pay particular attention to the N column. Forgetting to sort by
the ID variable is a common mistake, and it can leave much (or most) of the data
unprocessed. Close the Notepad window when done. Use the Save As
option to give it a new name if later use of this file is anticipated. The file can also be opened
by clicking on the Display Stats button.
2. Click Done. The WHLM window displays the type and name on its title bar (hlm2 &
HSB.MDM) and the level-1 variables on a drop-down menu (See Figure 2.9).
Figure 2.9 WHLM: hlm2 MDM File window for HSB.MDM
To supply HLM with appropriate information for the data, the command, and the MDM files:
1. Click Browse in the Level-1 specification section to open an Open Data File dialog
box. Open a level-1 ASCII data file in the HLM examples folder (HSB1.DAT in our
example). The file name (HSB1.DAT) appears in the Level-1 File Name box.
2. Enter the number of variables into the Number of Variables box (4 in our example) and
the data entry format in the Data Format box (A4,4F12.3 in our example).
Note that the ID is included in the format statement, but excluded in the Number of Variables box.
Rules for input format statements are given in Section A.2 in Appendix A.
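To see what the format A4,4F12.3 implies for a record, the following Python sketch reads one line under a FORTRAN-style interpretation: a 4-character ID followed by four numeric fields, each 12 characters wide. It is an illustration only, not the HLM reader. The record contents are invented, `parse_record` is a hypothetical helper, and the sketch assumes each numeric field carries an explicit decimal point.

```python
def parse_record(line, id_width=4, n_vars=4, field_width=12):
    """Split one fixed-width record into a character ID and numeric values."""
    unit_id = line[:id_width]                      # the A4 field, kept as text
    values = []
    for k in range(n_vars):                        # the four F12.3 fields
        start = id_width + k * field_width
        values.append(float(line[start:start + field_width]))
    return unit_id, values


# Build a made-up record laid out as A4,4F12.3 and read it back.
line = "1224" + "".join(f"{value:12.3f}" for value in (-1.528, 0.0, 1.0, 5.876))
unit_id, values = parse_record(line)
print(len(line), unit_id, values)
```

The ID occupies the first field of the format but is not counted among the four variables, matching the note above.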
Figure 2.10 Make MDM – HLM2 dialog box
Figure 2.11 Enter Variable Labels dialog box for level-1 file, HSB1.DAT
Figure 2.12 Enter Variable Labels dialog box for level-2 file, HSB2.DAT
To check whether the data have been properly read into HLM
The procedure is the same as for SPSS file input (see Section 2.5.1.1 for a complete description).
2.5.1.3 SAS transport, SYSTAT, STATA file input and other formats for raw data
For SAS transport, SYSTAT or STATA file input, a user selects either SAS 5 transport, SYSTAT or
STATA from the Input File Type drop-down menu as appropriate to open the Open Data File dialog
box. With the third-party software module included in the current version, HLM will read data from
EXCEL, LOTUS and many other formats. Select Anything else from the Input File Type drop-down
menu before clicking on the Browse button in the input file specifications sections. If the data type
is set on the File, Preferences screen, the program will default to your selected type for both input
data and residual files.
The procedure for executing analyses based on the MDM file is described below.
Step 1: To specify the level-1 prediction model
Figure 2.13 Model window for the HS&B example
4. Click on the name of a predictor variable and click the type of centering (SES and add
variable group centered, see Figure 2.14). The predictor will appear on the equation
screen and each regression coefficient associated with it will become an outcome in the
Level-2 model (see Figure 2.15).
Figure 2.14 Specification of model predictor, SES, for the HS&B example
1. Select the equation containing the regression coefficient(s) to be modeled by clicking on the
equation (β0 (intercept) and β1 (SES slope) in our HS&B example). A listbox for level-2
variables (>>Level-2<<) will appear (see Figure 2.16).
2. Click to select the variable(s) to be entered as predictor(s) and the type of centering. For our
example, select SECTOR and add variable uncentered, and MEANSES and add variable
grand-mean centered to model β0 and β1 (see Figure 2.16).
3. HLM allows the model to be displayed in three alternative forms. Figure 2.17 displays the model
specified in the default notation familiar to users of previous versions of HLM.
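The two centering choices used in this example can be sketched in a few lines of Python. This is only an illustration of the arithmetic, with hypothetical toy data and hypothetical function names; it assumes group centering subtracts each school's own mean and grand-mean centering subtracts the overall mean of the values supplied.

```python
from collections import defaultdict

def group_mean_center(values, groups):
    """Subtract each group's own mean from its members (group centering)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

def grand_mean_center(values):
    """Subtract the overall mean from every value (grand-mean centering)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Hypothetical toy data: three students in school "A", two in school "B".
ses = [0.2, 0.4, 0.6, -0.5, -0.1]
school = ["A", "A", "A", "B", "B"]
ses_group_centered = group_mean_center(ses, school)

# Hypothetical school-level predictor, one value per school.
meanses = [0.4, -0.3]
meanses_grand_centered = grand_mean_center(meanses)
```

After group centering, the SES values within each school average to zero, so the intercept β0 can be read as the school mean of the outcome.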
Figure 2.16 Specification of the level-2 model
Preferences dialog box accessible via the File menu (see details in Section 2.8) both the mixed
model formulation and the model with subscripts for all coefficients can be displayed
automatically. The model can also be saved as an EMF file for later use in reports or papers.
Figure 2.18 Alternative model window for the HS&B example
Steps 1 to 3 are the three major steps for executing analyses based on the MDM file. Other analytic
options are described in Section 2.9. After specifying the model, a title can be given to the output
and the output file can be named by the following procedure:
1. Select Basic Settings to open the Basic Model Specifications – HLM2 dialog box.
Enter a title in the Title field (for example, Intercept and slopes-as-Outcomes Model) and an
output file name in Output file name field (see Figure 2.19). Click OK. See Section 2.8
for the definitions of entries and options in Basic Model Specifications – HLM2 dialog
box.
2. Open the File menu and choose Save As to open a Save command file dialog box.
Note: If you wish to terminate the computations early, press the Ctrl-C key combination once. This
will stop the analysis after the current iteration and provide a full presentation of results based on
that iteration. If you press Ctrl-C more than once, however, computation is terminated immediately
and all output is lost.
Figure 2.19 Basic Model Specifications – HLM2 dialog box for the HS&B example
2.5.3 Annotated HLM2 output
The output file will automatically be displayed in the format specified via the Preference menu. It
can also be opened by selecting the View Output option from the File menu. Here is the output
produced by the Windows session described above (see example HSB1.MDM).
Level-1 Model
MATHACHij = β0j + β1j*(SESij) + rij
Level-2 Model
β0j = γ00 + γ01*(SECTORj) + γ02*(MEANSESj) + u0j
β1j = γ10 + γ11*(SECTORj) + γ12*(MEANSESj) + u1j
Mixed Model
MATHACHij = γ00 + γ01*SECTORj + γ02*MEANSESj
+ γ10*SESij + γ11*SECTORj*SESij + γ12*MEANSESj*SESij
+ u0j + u1j*SESij + rij
The information presented on the first page or two of the HLM2 printout summarizes key details
about the MDM file (e.g., number of level-1 and level-2 units, whether weighting was specified), and
about both the fixed and random effects models specified for this run. In this particular case, we are
estimating the model specified by Equations 4.14 and 4.15 in Hierarchical Linear Models.
Level-1 OLS Regressions
When first analyzing a new data set, examining the OLS equations for all of the units may be helpful
in identifying possible outlying cases and bad data. By default, HLM2 does not print out the ordinary
least squares (OLS) regression equations, based on the level-1 model. The OLS regression equations
for the first 10 units, as shown here, were obtained using optional settings on the Other Settings
menu.
This is a simple average of the OLS coefficients across all units that had sufficient data to permit a
separate OLS estimation.
Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
    INTRCPT2, γ00       12.083837         0.106889   113.050           7179    <0.001
    SECTOR, γ01          1.280341         0.157845     8.111           7179    <0.001
    MEANSES, γ02         5.163791         0.190834    27.059           7179    <0.001
For SES slope, β1
    INTRCPT2, γ10        2.935664         0.155268    18.907           7179    <0.001
    SECTOR, γ11         -1.642102         0.240178    -6.837           7179    <0.001
    MEANSES, γ12         1.044120         0.299885     3.482           7179    <0.001
Least-squares estimates of fixed effects
(with robust standard errors)
Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
    INTRCPT2, γ00       12.083837         0.169507    71.288           7179    <0.001
    SECTOR, γ01          1.280341         0.299077     4.281           7179    <0.001
    MEANSES, γ02         5.163791         0.334078    15.457           7179    <0.001
For SES slope, β1
    INTRCPT2, γ10        2.935664         0.147576    19.893           7179    <0.001
    SECTOR, γ11         -1.642102         0.237223    -6.922           7179    <0.001
    MEANSES, γ12         1.044120         0.332897     3.136           7179     0.002
The first of the fixed effects tables is based on OLS estimation. The second table provides robust
standard errors. Note that the standard errors associated with γ00, γ01, γ02, and γ12 are smaller than
their robust counterparts.
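A quick way to compare the two sets of standard errors is to take their ratio coefficient by coefficient. The short Python sketch below does this with the values printed in the two least-squares tables; the variable names are illustrative only.

```python
# Standard errors copied from the two least-squares tables.
ols_se = {"γ00": 0.106889, "γ01": 0.157845, "γ02": 0.190834,
          "γ10": 0.155268, "γ11": 0.240178, "γ12": 0.299885}
robust_se = {"γ00": 0.169507, "γ01": 0.299077, "γ02": 0.334078,
             "γ10": 0.147576, "γ11": 0.237223, "γ12": 0.332897}

# A ratio above 1 means the robust standard error is the larger of the two.
ratios = {name: robust_se[name] / ols_se[name] for name in ols_se}
smaller_than_robust = [name for name, r in ratios.items() if r > 1.0]

for name, r in ratios.items():
    print(f"{name}: robust/OLS = {r:.2f}")
```

The largest ratios belong to the coefficients of the intercept equation, which is where ignoring the clustering of students within schools would be expected to matter most.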
The least-squares likelihood value = -2.336211E+004
Deviance = 46724.22267
Number of estimated parameters = 1
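The deviance is minus two times the log-likelihood, so the two figures above can be checked against each other. A minimal Python sketch, using only the numbers printed in the output:

```python
log_likelihood = -2.336211e4    # least-squares likelihood value printed above
reported_deviance = 46724.22267

deviance = -2.0 * log_likelihood

# The log-likelihood is printed to 7 significant digits, so agreement is
# only expected to about two decimal places.
difference = abs(deviance - reported_deviance)
```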
Starting Values
σ²(0) = 36.72025
τ(0)
INTRCPT1,β0      2.56964      0.28026
SES,β1           0.28026     -0.01614
New τ(0)
INTRCPT1,β0      2.56964      0.28026
SES,β1           0.28026     -0.01614
The initial starting values failed to produce an appropriate variance-covariance matrix (τ(0)). An
automatic fix-up was introduced to correct this problem (New τ(0)).
Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
    INTRCPT2, γ00       12.094864         0.204326    59.194            157    <0.001
    SECTOR, γ01          1.226266         0.315204     3.890            157    <0.001
    MEANSES, γ02         5.335184         0.379879    14.044            157    <0.001
For SES slope, β1
    INTRCPT2, γ10        2.935219         0.168674    17.402            157    <0.001
Above are the initial estimates of the fixed effects. These are not to be used in drawing substantive
conclusions.
The value of the log-likelihood function at iteration 1 = -2.325199E+004
The value of the log-likelihood function at iteration 2 = -2.325182E+004
The value of the log-likelihood function at iteration 3 = -2.325174E+004
The value of the log-likelihood function at iteration 4 = -2.325169E+004
The value of the log-likelihood function at iteration 5 = -2.325154E+004
...
The value of the log-likelihood function at iteration 57 = -2.325094E+004
The value of the log-likelihood function at iteration 58 = -2.325094E+004
The value of the log-likelihood function at iteration 59 = -2.325094E+004
The value of the log-likelihood function at iteration 60 = -2.325094E+004
Below are the estimates of the variance and covariance components from the final iteration and
selected other statistics based on them.
The next three tables present the final estimates for: the fixed effects with GLS and robust standard
errors, variance components at level-1 and level-2, and related test statistics.
Final estimation of fixed effects:
Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
    INTRCPT2, γ00       12.095006         0.198717    60.865            157    <0.001
    SECTOR, γ01          1.226384         0.306272     4.004            157    <0.001
    MEANSES, γ02         5.333056         0.369161    14.446            157    <0.001
For SES slope, β1
    INTRCPT2, γ10        2.937787         0.157119    18.698            157    <0.001
    SECTOR, γ11         -1.640954         0.242905    -6.756            157    <0.001
    MEANSES, γ12         1.034427         0.302566     3.419            157    <0.001
Final estimation of fixed effects
(with robust standard errors)
Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
    INTRCPT2, γ00       12.095006         0.173688    69.637            157    <0.001
    SECTOR, γ01          1.226384         0.308484     3.976            157    <0.001
    MEANSES, γ02         5.333056         0.334600    15.939            157    <0.001
For SES slope, β1
    INTRCPT2, γ10        2.937787         0.147615    19.902            157    <0.001
    SECTOR, γ11         -1.640954         0.237401    -6.912            157    <0.001
    MEANSES, γ12         1.034427         0.332785     3.108            157     0.002
The first table provides model-based estimates of the standard errors while the second table provides
robust estimates of the standard errors. Note that the two sets of standard errors are similar. If the
robust and model-based standard errors are substantively different, it is recommended that the
tenability of key assumptions should be investigated further (see Section 4.3 on examining
residuals).
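To see what the final fixed-effect estimates imply, the fixed part of the mixed model can be evaluated directly. The Python sketch below is illustrative only: the function names are hypothetical, the random effects are set to zero, and the arguments meanses and ses are assumed to be the centered values used in the model.

```python
# Final fixed-effect estimates (identical in the model-based and robust tables).
gamma = {"g00": 12.095006, "g01": 1.226384, "g02": 5.333056,
         "g10": 2.937787, "g11": -1.640954, "g12": 1.034427}

def ses_slope(sector, meanses):
    """Expected SES slope: γ10 + γ11*SECTOR + γ12*MEANSES."""
    return gamma["g10"] + gamma["g11"] * sector + gamma["g12"] * meanses

def expected_mathach(sector, meanses, ses):
    """Fixed part of the mixed model, with u0j, u1j and rij set to zero."""
    intercept = gamma["g00"] + gamma["g01"] * sector + gamma["g02"] * meanses
    return intercept + ses_slope(sector, meanses) * ses

# SES slopes for SECTOR = 0 and SECTOR = 1 at the grand mean of MEANSES.
slope_sector0 = ses_slope(sector=0, meanses=0.0)
slope_sector1 = ses_slope(sector=1, meanses=0.0)
```

With MEANSES at its grand mean, the expected SES slope drops from about 2.94 when SECTOR = 0 to about 1.30 when SECTOR = 1, which is the substantive meaning of the negative γ11 estimate.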
Final estimation of variance components
Random Effect     Standard Deviation   Variance Component   d.f.           χ²   p-value
INTRCPT1, u0                 1.54271              2.37996    157    605.29503    <0.001
SES slope, u1                0.38590              0.14892    157    162.30867     0.369
level-1, r                   6.05831             36.70313
Statistics for current covariance components model
Deviance = 46501.875643
Number of estimated parameters = 4
2.5.4 Model checking based on the residual file
HLM2 provides the data analyst with a means of checking the fit and distributional assumptions of
the model by producing residual files for the level-1 and level-2 models. These files may be
requested using the Basic Model Specifications – HLM2 dialog box (see Fig. 2.19). The level-1
and level-2 residual files will be written as SPSS, SAS, STATA, SYSTAT or ASCII data files. In the
case of SPSS and STATA, the residual files will be written out so that the respective packages may
use them immediately. The other forms of raw data will require submitting them as command
streams.
The level-1 residual file will contain level-1 residuals (the differences between the observed and
fitted values), the fitted values, the square root of σ², the values of the level-1 and level-2 predictors
entered in the model, and those of other level-1 and level-2 variables selected by the user. To
illustrate, we show how to prepare SPSS residual files.
1. Select Basic Settings to open the Basic Model Specifications – HLM2 dialog box.
2. Click Level-1 Residual File to open a Create Level-1 Residual File dialog box (see
Figure 2.21).
3. For the level-1 and level-2 variables, the box displays two columns of variables. The
predictor variables in the model are in the Variables in residual file column. Others are
listed in the Possible choices column. To include any of them in the residual file for
exploratory purposes, double-click on their labels.
4. Select SPSS residual file type (default).
5. Enter a name for the residual file in the Residual File Name box (for example,
RESFIL1.SAV, see Figure 2.21). Click OK.
Figure 2.22 Level-1 Residual File
Data for the first ten cases in RESFIL1.SAV are shown in Figure 2.22. The file consists of the level-2
ID, L2ID, and the following variables:
• L1RESID: the difference between the observed and fitted value for each level-1 unit.
• FITVAL: the fitted value for each level-1 unit.
• SIGMA: the square root of σ².
The variables SES, MATHACH, SECTOR, and MEANSES are described in Section 2.5.1.1.
We illustrate a possible use of a residual file in examining the tenability of the assumption of normal
distribution of level-1 errors, whose violations could adversely influence the estimated standard
errors for the estimates of the fixed effects and inferential statistics (see Hierarchical Linear Models
p. 266). Figure 2.23 displays a normal Q-Q plot of the level-1 residuals for the 7,185 students based
on the final fitted model. The plot is approximately linear, suggesting there is not a serious departure
from a normal distribution and that the assumption is tenable.
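The points of such a normal Q-Q plot can be computed outside HLM from the L1RESID column of the residual file. The Python sketch below is one possible way to do it, not the procedure used to produce Figure 2.23: the function names are hypothetical, the residuals are simulated stand-ins, and the plotting positions (i - 0.5)/n are one common convention chosen here as an assumption.

```python
import random
from statistics import NormalDist, mean, pstdev

def normal_qq_points(residuals):
    """Return (theoretical quantile, standardized residual) pairs for a Q-Q plot."""
    n = len(residuals)
    m, s = mean(residuals), pstdev(residuals)
    ordered = sorted((r - m) / s for r in residuals)
    # Plotting positions (i - 0.5) / n are an assumed convention.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

def correlation(pairs):
    """Pearson correlation of the paired quantiles; near 1 for a straight plot."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Simulated stand-in for L1RESID; real use would read the residual file instead.
rng = random.Random(0)
fake_residuals = [rng.gauss(0.0, 6.06) for _ in range(500)]
qq_points = normal_qq_points(fake_residuals)
qq_correlation = correlation(qq_points)
```

A correlation close to 1 between the two sets of quantiles corresponds to the approximately linear plot described above.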
Figure 2.23 Q-Q plot of level-1 residuals
2.5.4.2 The level-2 residual file
This file will contain the EB residuals (see Equation 1.10 above), OL residuals (see Equation 1.9
above), and fitted values, i.e.,
$$\gamma_{q0} + \sum_{s} \gamma_{qs} W_{sj}$$
for each level-1 coefficient. By adding the OL residuals to the corresponding fitted values, the
analyst can also obtain the OL estimate of the corresponding level-1 coefficient βqj. The file also
produces the EB estimate β*qj of each level-1 coefficient, βqj.
In addition, the file will contain Mahalanobis distances (which are discussed below), estimates of the
total and residual standard deviations (log metric) within each unit, the values of the predictors used
in the level-2 model, and any other level-2 prediction variables selected by the user.
To create the SPSS level-2 residual file type
1. Select Basic Settings to open the Basic Model Specifications – HLM2 dialog box.
2. Click Level-2 Residual File to open a Create Level-2 Residual File dialog box.
3. Double-click the variables to be entered into the residual file (for our example, select
DISCLIM, PRACAD, HIMINTY and SIZE, see Figure 2.24).
4. Select SPSS as Residual File Type. Note that SYSTAT, STATA or SAS file type can be
created as well, or the residuals written to file in free format. By default, a SYSTAT file
will be created. To set the default file type created to one of the other formats, the
Preference dialog box (see Section 2.8) can be used.
Figure 2.24 Create Residual File dialog box
5. Enter a name for the residual file in the Residual File Name box (for example,
RESFIL2.SPS, see Figure 2.24). Click OK.
An example of an SPSS version of a level-2 residual file is shown in Figure 2.25. Only the data from
the first ten units and the first 8 variables are reproduced here. This file can be used to construct
various diagnostic plots.
2.5.4.2.1 Structure of the level-2 residual file
The residual file contains a single record per unit. The first variable in this file contains the unit ID,
followed by the number of level-1 units within that level-2 unit (denoted by nj), and various
summary statistics (chipct through mdrsvar). These are followed by the two EB residuals; the two OL
residuals; and the fitted or predicted values of the level-1 coefficients based on the estimated level-2
models. Next are the EB coefficients ecintrcp and ecses, which are the sums of the fitted values and
the EB residuals. The posterior variances and covariances of the estimates of the intercept and the
SES slopes are given next (pv00 to pvc10). Finally, the level-2 predictors used in the analysis plus
those additional level-2 predictors requested by the user for inclusion in the file are given (not
shown in Figure 2.25).
While most of this is straightforward, the information contained in the first set of variables for each
unit merits elaboration. nj is the number of cases for level-2 unit j. It is followed by two variables,
chipct and mdist. If we model q level-1 coefficients, mdist would be the Mahalanobis distance (i.e.,
the standardized squared distance of a unit from the center of a v-dimensional distribution, where v
is the number of random effects per unit). Essentially, mdist provides a single, summary measure of
the distance of a unit's EB estimates, β*qj, from its "fitted value," γq0 + Σs γqs Wsj.
Figure 2.25 SPSS version of residual file
If the normality assumption is true, then the Mahalanobis distances should be distributed
approximately χ² with v degrees of freedom. Analogous to univariate normal probability plotting, we
can construct a Q-Q plot of mdist vs. chipct. chipct are the expected values of the order statistics for
a sample of size J selected from a population that is distributed χ² with v degrees of freedom. If the
Q-Q plot resembles a 45 degree line, we have
evidence that the random effects are distributed v-variate normal. In addition, the plot will help us
detect outlying units (i.e., units with large mdist values well above the 45 degree line). It should be
noted that such plots are good diagnostic tools only when the level-1 sample sizes, nj, are at least
moderately large. (For further discussion see Hierarchical Linear Models, pp. 274-280.)
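The idea behind the mdist versus chipct plot can be sketched for the case of two random effects (v = 2), where the chi-square quantile function has the closed form -2 ln(1 - p). The Python code below is a simplified illustration and not the computation HLM performs: the residual vectors and the dispersion matrix are hypothetical, and the expected order statistics are approximated with plotting positions (i - 0.5)/J.

```python
import math

def mahalanobis_2d(u, disp):
    """Squared Mahalanobis distance u' D^{-1} u for a 2-vector u and 2x2 matrix D."""
    (a, b), (c, d) = disp
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    return (u[0] * (inv[0][0] * u[0] + inv[0][1] * u[1])
            + u[1] * (inv[1][0] * u[0] + inv[1][1] * u[1]))

def chi2_quantiles_df2(n_units):
    """Approximate expected order statistics of chi-square(2) for n_units units."""
    # For 2 degrees of freedom the quantile function is -2 * ln(1 - p).
    return [-2.0 * math.log(1.0 - (i - 0.5) / n_units) for i in range(1, n_units + 1)]

# Hypothetical (intercept, slope) residuals and a hypothetical dispersion matrix.
dispersion = ((2.0, 0.2), (0.2, 0.5))
residuals = [(0.5, 0.1), (-1.2, 0.3), (2.0, -0.4), (0.1, 0.0)]

mdist = sorted(mahalanobis_2d(u, dispersion) for u in residuals)
chipct = chi2_quantiles_df2(len(mdist))
qq_pairs = list(zip(chipct, mdist))
```

Plotting the pairs in qq_pairs and comparing them with the 45 degree line reproduces the kind of display described above; units far above the line would be candidates for closer inspection.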
After mdist, three estimates of the level-1 variability are given:
• The natural logarithm of the total standard deviation within each unit, lntotvar.
• The natural logarithm of the residual standard deviation within each unit based on its least
squares regression, olsrsvar. Note, this estimate exists only for those units that have sufficient
data to compute level-1 OLS estimates.
• The mdrsvar, the natural logarithm of the residual standard deviation from the final fitted
fixed effects model.
The natural log of these three standard deviations (with the addition of a bias-correction factor for
varying degrees of freedom) is reported (see Hierarchical Linear Models, p. 219). We note that
these statistics can be used as input for the V-known option in HLM2 in research on group-level
correlates of diversity (Raudenbush & Bryk, 1987; also see Sections 2.8.9 and 9.3).
We illustrate below some of the possible uses of a level-2 residual file in examining the adequacy of
fitted models and in considering other possible level-2 predictor variables. (For a full discussion of
this topic see Chapter 9 of Hierarchical Linear Models.) Here are the basic statistics for each of the
variables created as part of the HLM2 residual file.
Figure 2.26 OL versus EB residuals for the SES slopes
Exploring the potential of other possible level-2 predictors. Figure 2.27 shows a plot of EB
residuals against a possible additional level-2 predictor, PRACAD, for the intercept model. Although
the relationship appears slight (a correlation of 0.15), PRACAD will enter this model as a significant
predictor. (For a further discussion of the use of residual plots in identifying possible level-2
predictors see Hierarchical Linear Models, pp. 267-270.)
Figure 2.27 EB residuals against a possible additional level-2 predictor, PRACAD, for the
intercept model
Next, in Figure 2.28, we see a plot of the OL vs EB residuals for the intercepts. Notice that while the
EB intercepts are "shrunk" as compared to the OL estimates, the amount of shrinkage for the
intercepts as shown in Figure 2.28 is far less than for the SES slopes as shown in Figure 2.26.
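Why some estimates are shrunk more than others can be illustrated with the reliability of a unit's estimate. The Python sketch below uses the textbook formula for a random-intercept model, reliability = τ00 / (τ00 + σ²/nj), as a deliberate simplification: the model fitted here has two correlated random effects, for which the actual EB computation is a matrix analogue. The function names and the unit sizes are hypothetical.

```python
def intercept_reliability(tau00, sigma2, n_j):
    """Reliability of a unit mean in a random-intercept model: τ00 / (τ00 + σ²/n_j)."""
    return tau00 / (tau00 + sigma2 / n_j)

def shrink(ols_estimate, fitted_value, reliability):
    """EB estimate as a reliability-weighted average of OLS estimate and fitted value."""
    return reliability * ols_estimate + (1.0 - reliability) * fitted_value

# Variance components from the final output; the unit sizes are hypothetical.
tau00, sigma2 = 2.37996, 36.70313
small_unit = intercept_reliability(tau00, sigma2, n_j=15)
large_unit = intercept_reliability(tau00, sigma2, n_j=60)
```

The lower the reliability, the more the OL estimate is pulled toward the fitted value. Because the estimated variance of the SES slopes (0.14892) is small relative to their sampling variability, the slopes would be expected to be less reliable than the intercepts, which is in line with the heavier shrinkage seen in Figure 2.26.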
Figure 2.28 OL versus EB residuals for the intercepts
HLM2 provides three options for handling missing data at level 1: listwise deletion of cases when the
MDM file is made, listwise deletion of cases when running the analysis (See Figure 2.3), and analysis
of multiply-imputed data (see Section 11.2). A set of level-1 variables to be used as basis for runtime
deletion for a series of models based on the same MDM can also be selected via the Other Settings,
Estimation Settings menu by using the Level-1 Deletion Variables option. These follow the
conventional routines used in standard statistical packages for regression analysis and the general
linear model. Listwise deletion of cases when the MDM file is made is based on the variables selected
for inclusion in the MDM file, while listwise deletion when running the analysis only takes the
variables included in the model into account.
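The difference between the two listwise deletion options can be illustrated with a few hypothetical level-1 records. In this Python sketch, None stands in for a missing value and the helper listwise_delete is illustrative only; it is not part of HLM.

```python
MISSING = None  # stand-in for a missing value

# Hypothetical level-1 records with three variables.
records = [
    {"MATHACH": 5.9, "SES": -1.5, "FEMALE": 1},
    {"MATHACH": 19.7, "SES": MISSING, "FEMALE": 1},
    {"MATHACH": 20.3, "SES": -0.5, "FEMALE": MISSING},
    {"MATHACH": MISSING, "SES": 0.3, "FEMALE": 0},
]

def listwise_delete(rows, variables):
    """Keep only rows with no missing value on any of the named variables."""
    return [row for row in rows if all(row[v] is not MISSING for v in variables)]

# Deletion when the MDM file is made: every variable placed in the MDM counts.
kept_at_mdm = listwise_delete(records, ["MATHACH", "SES", "FEMALE"])

# Deletion when running the analysis: only variables in the current model count.
kept_model_a = listwise_delete(records, ["MATHACH", "SES"])
kept_model_b = listwise_delete(records, ["MATHACH", "FEMALE"])
```

Here deletion at MDM creation keeps one record for every model, whereas deletion at run time keeps two records for each model, but not the same two. This is why models with different level-1 predictors can end up being compared on different sets of cases.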
At level 2, HLM2 assumes complete data. If you have missing data at level 2, you should either
impute a value for the missing information or delete the units in question, or preferably use methods
described in Section 11.2. Failure to do so will cause the automatic listwise deletion of level-2
units with missing data when the MDM file is created.
For ASCII file input, click Missing Data in the Make MDM – HLM2 dialog box. The dialog box
displayed in Fig. 2.30 will open.
Assuming you have missing data, you should click Yes in the Missing Data? box, and select
deletion when making the MDM file or when running analyses. Then, if you have coded all of your
missing values for all of the variables to the same number, click the Same button. When you specify
the variable names, enter this number in the box to the right of the first variable in the Enter Variable
Labels dialog box (see Fig. 2.31). If you have more than one missing value code, check the
Different button, and enter these codes for each respective variable on the Enter Variable Labels
screen.
Figure 2.31 Enter Variable Labels dialog box for missing ASCII data
For non-ASCII data at level 1, you should click Yes in the Missing Data? field, and select when you
want to implement the listwise deletion by selecting one of the two options in this groupbox. Then,
when HLM2 encounters values coded as missing, it will recognize these properly. It is important to
note that some statistics packages (e.g. SAS) allow for more than one kind of missing data code.
HLM2 (and HLM3, etc.) will recognize only the standard, "system-missing" code.
How HLM2 handles missing data differs a bit in the ASCII and non-ASCII cases. For ASCII data, it is
very important that you don't have any missing data codes or blanks in the level-2 file. HLM2 will
read these as valid data: missing data codes will be read as they are coded, and blanks will be read
as zeros. For non-ASCII data, the program will skip over cases that have missing data in them,
essentially performing listwise deletion on the level-2 data file. Note: For non-ASCII file input, the
user has to either prepare system-missing values or missing value codes for the missing data.
Figure 2.32 Basic Model Specifications - HLM2 dialog box
2.8 Other analytic options
The iterative procedure settings can be changed by opening the Iteration Control – HLM2 dialog
box. To do so, select the Iteration Settings option from the Other Settings menu. Table 2.1 lists
the definitions and options in the Iteration Control – HLM2 dialog box. See Fig. 2.33; note the
Table 2.1 Table of definitions and options in Iteration Control - HLM2 dialog box
The Estimation Settings – HLM2 dialog box, accessed via the Estimation Settings option on the
Other Settings menu, offers additional control over the iterative procedure.
HLM2 will use restricted maximum likelihood estimation by default. The type of likelihood used is
set in the Type of Likelihood group box (see Fig. 2.34), where full maximum likelihood estimation
may alternatively be requested (see Hierarchical Linear Models, pp. 52-53).
Adaptive Gaussian Quadrature, LaPlace and EM LaPlace iterations may be requested
when nonlinear (HGLM) models are fitted. The maximum number of iterations required, which has to
be a positive integer, should be entered in the LaPlace Iteration Control or EM LaPlace Iteration
Control group box (see Fig. 2.34).
The Estimation Settings – HLM2 dialog box may also be used to access dialog boxes used in
defining special analyses, e.g. latent variable regression, applying HLM to multiply-imputed data, and
plausible value analysis.
These special features, associated with the Plausible values, Multiple imputation and Latent
Variable Regression buttons in the Estimation Settings – HLM2 dialog box, are discussed in
Chapter 11.
where X1ij is an indicator for females, X2ij is an indicator for males, and rij is a measurement error.
Hence β1j is the "true score" for females and β2j is the "true score" for males. At level 2, these true
scores are modeled as a function of predictor variables, one of which was marital role quality, Wj, a
measure of one's satisfaction with one's marriage. (Note that this is also a model without a level-1
intercept.) A simple level-2 model is then:
$$\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}$$
$$\beta_{2j} = \gamma_{20} + \gamma_{21} W_j + u_{2j}.$$
The four coefficients to be considered are γ10, γ11, γ20, and γ21. We may, for instance, wish to specify
some constraints on the fixed effects.
Coefficients with 0s are not constrained, and those with 1s are. A user is allowed to impose up to
five constraints. Each set of constrained coefficients shares the same value, from 1 to 5.
Figure 2.35 Constrain Gammas dialog box for the Barnett et al. (1993) example
2.8.5 Modeling heterogeneity of level-1 variances
Users may wish to estimate models that allow for heterogeneous level-1 variances. A simple
example (see HSB3.HLM) using the HS&B data would be a model that postulates that the two
genders differ in both the mean and the variance of their math achievement scores. To specify a
model that hypothesizes different central tendency and variability in math achievement for the two
genders, the model displayed in Fig. 2.36 must first be set up.
1. Open the Other Settings menu and select the Estimation Settings option to open the
Estimation Settings – HLM2 dialog box.
2. Click the Heterogeneous sigma^2 button to open the Heterogeneous sigma^2:
Predictors of level-1 variance dialog box. Double-click FEMALE to enter as a variable
in the Predictors of level-1 variance box (see Figure 2.37 for an example). Click OK.
Figure 2.36 Model window for the modeling heterogeneity of level-1 variances example
Figure 2.37 Heterogeneous sigma^2: Predictors of level-1 variance dialog box
The model estimated is a log-linear model for the level-1 variances, which can be generally stated
as:
Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
    INTRCPT2, γ00       13.345271         0.253915    52.558            159    <0.001
For FEMALE slope, β1
    INTRCPT2, γ10       -1.359401         0.171411    -7.931           7024    <0.001
Final estimation of fixed effects
(with robust standard errors)
Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
    INTRCPT2, γ00       13.345271         0.260426    51.244            159    <0.001
For FEMALE slope, β1
    INTRCPT2, γ10       -1.359401         0.185181    -7.341           7024    <0.001
Final estimation of variance components
Random Effect     Standard Deviation   Variance Component   d.f.           χ²   p-value
INTRCPT1, u0                 2.84757              8.10864    159   1601.08000    <0.001
level-1, r                   6.23256             38.84483
Statistics for the current model
Deviance = 47051.483085
Number of estimated parameters = 4
Parameter         Coefficient   Standard Error   Z-ratio   p-value
INTRCPT1, α0          3.70771         0.024645   150.444     0.000
FEMALE, α1           -0.09307         0.034023    -2.736     0.007
The t-ratio for γ10 (t = -7.341) and the Z-ratio for α1 (Z = -2.736) for FEMALE indicate that the math
achievement scores of males are on average higher as well as more variable than those for females.
Furthermore, a comparison of the fits of the models suggests that the model with heterogeneous
within-school variances appears appropriate (χ² = 7.45604, df = 1). See Chapter 10 in this manual
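The estimates of the variance model can be turned into implied level-1 variances for the two genders. The Python sketch below assumes the log-linear form ln(σ²) = α0 + α1*FEMALE with FEMALE coded 0 or 1, which is an assumption about the exact parameterization; the function names are illustrative. It also evaluates the upper-tail probability of the reported chi-square statistic on 1 degree of freedom.

```python
import math

# Estimates copied from the table of variance-model parameters.
alpha0, alpha1 = 3.70771, -0.09307

def level1_variance(female):
    """Implied level-1 variance under the assumed form ln(σ²) = α0 + α1*FEMALE."""
    return math.exp(alpha0 + alpha1 * female)

var_male = level1_variance(0)
var_female = level1_variance(1)

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

p_value = chi2_sf_df1(7.45604)
```

Under this assumption the implied variance for females is exp(α1) times that for males, that is, roughly 9 percent smaller, and the p-value of the model comparison falls below 0.01.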
2.8.6 Specifying level-1 deletion variables
If, when making the MDM file, "Delete missing data when running analyses" was specified, this
feature may be used to alter the default behavior of the programs. By default, the programs will
delete missing data on the basis of the level-1 variables actually in the model. While in many cases
this is the desired behavior, in other situations it may not be. For instance, one might be running and
comparing analyses that have different level-1 models. With many datasets, this can lead to
comparing results that have a different number of level-1 records used. To solve this problem, check
the option to delete missing data "when making the MDM file" (see Figure 2.30).
Suppose, for instance, that in a pre-election poll, ethnic minority voters are over-sampled to ensure
that various ethnic groups are represented in the sample. Without weighting, the over-sampled
groups would exert undue influence on estimates of the proportion of voters in the population
favoring a specific candidate. Use of design weights can yield unbiased estimates of the population
parameters.
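The effect of over-sampling on an unweighted estimate, and its correction by design weights, can be shown with a small made-up poll. All numbers in this Python sketch are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical poll: group B is 10% of the population but 50% of the sample.
population_share = {"A": 0.90, "B": 0.10}
sample = ([("A", 1)] * 30 + [("A", 0)] * 20      # 60% of group A favors the candidate
          + [("B", 1)] * 10 + [("B", 0)] * 40)   # 20% of group B favors the candidate

n = len(sample)
sample_share = {g: sum(1 for grp, _ in sample if grp == g) / n
                for g in population_share}

# Design weight: inversely proportional to the group's selection probability,
# which here is proportional to sample_share / population_share.
weight = {g: population_share[g] / sample_share[g] for g in population_share}

unweighted = sum(y for _, y in sample) / n
weighted = (sum(weight[g] * y for g, y in sample)
            / sum(weight[g] for g, _ in sample))
```

The unweighted proportion is 0.40, whereas the weighted proportion is 0.56, which equals the population value 0.90 × 0.60 + 0.10 × 0.20 implied by the made-up group rates.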
Design weights are also commonly used to correct for differential non-response of sub-groups.
Response rates are estimated for relevant sub-groups, and information from each respondent is
weighted inversely proportional to the probability of response. That way, respondents who are
over-represented in a sample as a function of non-response are appropriately weighted down.
Hierarchical data can be described as arising from a multi-stage sampling procedure. For example,
schools might be sampled from a national frame of schools and then, within each school, students
might be sampled from a list of all students attending the school. Probabilities at each level
might be known but unequal. For example, one might over-sample private schools and then
over-sample minority students within each school. Weights might be constructed at each level to be
inversely proportional to the probability of selection at that level. In some cases, weights might be
available at only one level. For example, in a two-level design with students nested within schools,
one might compute the marginal probability that a student is selected as the product of the
probability that the student's school is selected and the conditional probability that the student
is selected given that his or her school is selected. In another context, suppose persons are selected
with known probability and then followed longitudinally over time. In this case, we have occasions
at level 1 nested within persons at level 2. The only weight may be a level-2 weight, inversely
proportional to the probability of selection of that person. It is, of course, possible to include level-1
weights as well, but it is common to have weights only at level 2 in such longitudinal studies.
HLM 7 uses a method of computation devised by Pfeffermann et al. (1998) for hierarchical data. This
method, based on weighting the information of each case in the framework of maximum likelihood,
is more appropriate than the method of weighting in earlier versions of HLM, which used a more
conventional approach of weighting observations.
2.8.7.2 Weighting in two-level designs
In the two-level context, weights might be available at level 1, at level 2 or at both levels. If weights
are available at level 1 only, the methodology used in HLM 7 assumes that these weights are
inversely proportional to Pij, the marginal probability that student i in school j is selected into the
sample. HLM 7 will then normalize the weight to have a mean of 1.0. Thus we have
$$w_{ij} = \frac{N / P_{ij}}{\sum_{j=1}^{J} \sum_{i=1}^{n_j} 1 / P_{ij}} \qquad (2.1)$$
in which case
$$\sum_{j=1}^{J} \sum_{i=1}^{n_j} w_{ij} = N \qquad (2.2)$$
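Equations 2.1 and 2.2 can be checked numerically. The Python sketch below applies the normalization to a handful of hypothetical selection probabilities; the function name is illustrative and the code is not taken from HLM.

```python
def normalize_level1_weights(probabilities):
    """Return w_ij = (N / P_ij) / sum(1 / P_ij), so that the weights sum to N."""
    inverse = [1.0 / p for p in probabilities]
    total = sum(inverse)
    n = len(probabilities)
    return [n * inv / total for inv in inverse]

# Hypothetical marginal selection probabilities P_ij for five students.
p_ij = [0.10, 0.10, 0.05, 0.20, 0.05]
w_ij = normalize_level1_weights(p_ij)
```

The normalized weights sum to N = 5, so their mean is 1.0, and a student with half the selection probability of another receives twice the weight.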
where N is the total sample size of level-1 units. In contrast, if weights are available only at level 2,
the methodology assumes that these weights are inversely proportional to Pj the probability of
selection of the level-2 unit. In this case, HLM 7 will again normalize the weight to have a mean of
1.0, yielding
$$w_{j} = \frac{J / P_{j}}{\sum_{j=1}^{J} 1 / P_{j}}, \qquad (2.3)$$
in which case
$$\sum_{j=1}^{J} w_{j} = J, \qquad (2.4)$$
where J is the total number of level-2 units. If weights are available at both level 1 and level 2, the
methodology assumes that the level-1 weight is inversely proportional to Pi|j, the conditional
probability of selection of unit i given that unit j was selected, so that Pi|j = Pij / Pj. The level-2
weight is assumed to be inversely proportional to Pj. In this case, HLM will normalize the level-1
weight within level-2 units:
$$w_{i|j} = \frac{n_j / P_{i|j}}{\sum_{i=1}^{n_j} 1 / P_{i|j}} \qquad (2.5)$$
$$\sum_{i=1}^{n_j} w_{i|j} = n_j \qquad (2.6)$$
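The within-unit normalization of Equations 2.5 and 2.6 can be sketched in the same way. In this Python illustration the probabilities and unit labels are hypothetical, and the conditional probability is formed as Pi|j = Pij / Pj before the weights are normalized separately within each level-2 unit.

```python
def normalize_within_units(cond_prob_by_unit):
    """Apply w_{i|j} = (n_j / P_{i|j}) / sum_i(1 / P_{i|j}) within each level-2 unit."""
    weights = {}
    for unit, probs in cond_prob_by_unit.items():
        inverse = [1.0 / p for p in probs]
        total = sum(inverse)
        weights[unit] = [len(probs) * inv / total for inv in inverse]
    return weights

# Hypothetical P_ij and P_j; the conditional probability is P_{i|j} = P_ij / P_j.
p_j = {"school_1": 0.5, "school_2": 0.25}
p_ij = {"school_1": [0.10, 0.10, 0.05], "school_2": [0.05, 0.025]}
p_i_given_j = {j: [p / p_j[j] for p in p_ij[j]] for j in p_ij}
w_i_given_j = normalize_within_units(p_i_given_j)
```

Within each school the normalized level-1 weights sum to that school's sample size nj, so they average 1.0 inside every level-2 unit.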
In HLM 7, weights are selected at the time of analysis, not when the MDM file is made:
1. Select the Estimation Settings option from the Other Settings menu.
2. Click the Weighting button to access the pull-down menus used to select the weighting
variables at any level.
Note that the cover sheet of each HLM output reminds the user of the weighting specification chosen.
HLM allows multivariate hypothesis tests for the fixed effects. For instance, for the model displayed
in Fig. 2.39, a user can test the following composite null hypothesis:
$$H_0 : \gamma_{01} = \gamma_{11} = 0,$$
where γ01 is the effect of SECTOR on the intercept and γ11 is the effect of SECTOR on the SES slope.
Figure 2.39 Model window
1. Open the Other Settings menu and select the Hypothesis Settings option to open the
Hypothesis Testing – HLM2 dialog box (see Figure 2.40).
2. Click "1" to open the General Linear Hypothesis: Hypothesis 1 dialog box and specify
the first hypothesis. Fig. 2.41 shows the contrasts for testing the null hypothesis that the
effects of SECTOR on the intercept and on the SES slope are both zero (see Hierarchical
Linear Models, p. 82). Enter a 1 on the γ01 line in the first column. Then click the "2"
button for the second column and enter a 1 on the γ11 line in the second column. Click OK.
Figure 2.40 Optional Hypothesis Testing/Estimation dialog box
Figure 2.41 General Linear Hypothesis: Hypothesis 1 dialog box
The HLM2 output associated with this test appears in Section 2.8.8.3 below. (For a further discussion
of this multivariate hypothesis test for fixed effects see Hierarchical Linear Models, pp. 58-61, 81-
85).
HLM2 also provides, as an option, a multi-parameter test for the variance-covariance components.
This likelihood-ratio test compares the deviance statistic of a restricted model with a more general
alternative. The user must input the value of the deviance statistic and related degrees of freedom for
the alternative specification. Below we compare the variance-covariance components of two
Intercept-and-Slope-as-Outcome models. One treats β1 as random and the other does not.
Enter the deviance and the number of parameters in the Deviance Statistics box and in the Number
of Parameters box (see Fig. 2.40), respectively (the two numbers for our example are 46512.978000
and 4, obtained in Section 2.5.3).
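The comparison that this option performs can be sketched as follows. The statistic is the difference between the two deviances, referred to a chi-square distribution whose degrees of freedom equal the difference in the number of estimated parameters. The function names and the deviance values in the usage line are hypothetical, and the chi-square tail probability is computed with a standard recurrence for the incomplete gamma function rather than with any HLM routine.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df.

    Uses Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1),
    starting from Q(1, y) = exp(-y) or Q(1/2, y) = erfc(sqrt(y)).
    """
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(dev_restricted, k_restricted, dev_general, k_general):
    """Compare a restricted model with a more general nested alternative."""
    df = k_general - k_restricted
    if df <= 0:
        raise ValueError("the general model must estimate more parameters")
    # A nested restricted model cannot fit better; guard against rounding.
    stat = max(dev_restricted - dev_general, 0.0)
    return stat, df, chi2_sf(stat, df)


# Hypothetical deviances: restricted model (2 parameters) vs. general (4).
stat, df, p = likelihood_ratio_test(1010.0, 2, 1004.0, 4)
print(stat, df, p)
```

A small p-value suggests that the restricted model fits noticeably worse than the general alternative; a large one suggests the simpler variance-covariance structure is adequate.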
The HLM2 output associated with this test appears in the section below. (For a further discussion of
this multi-parameter test see Hierarchical Linear Models, pp. 63-65, 83-85). Below is an example of
a selected HLM2 output that illustrates optional hypothesis testing procedures.
The outcome variable is MATHACH
Level-1 Model
Mixed Model
Note, the middle section of output has been deleted. We proceed directly to the final results page.
Final estimation of fixed effects
(with robust standard errors)
Fixed Effect           Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
   INTRCPT2, γ00         12.095250         0.173679    69.641            157    <0.001
   SECTOR, γ01            1.224401         0.308507     3.969            157    <0.001
   MEANSES, γ02           5.336698         0.334617    15.949            157    <0.001
For SES slope, β1
   INTRCPT2, γ10          2.935664         0.147576    19.893           7022    <0.001
   SECTOR, γ11           -1.642102         0.237223    -6.922           7022    <0.001
   MEANSES, γ12           1.044120         0.332897     3.136           7022     0.002
Random Effect    Standard Deviation   Variance Component   d.f.   χ2          p-value
INTRCPT1, u0                1.54118              2.37524    157   604.29895    <0.001
level-1, r                  6.06351             36.76611
Deviance = 46502.952743
Number of estimated parameters = 2
For the likelihood ratio test, the deviance statistic reported above is compared with the value from
the alternative model manually. The result of this test appears below.
A model that constrains the residual variance for the SES slopes, β1 , to zero appears appropriate.
(For a further discussion of this application see Hierarchical Linear Models, pp. 83-85.)
χ2 statistic = 244.08638
degrees of freedom = 159
p-value = 0.000
These results indicate that there is variability among the (J = 160) level-2 units in terms of the
residual within-school (i.e., level-1) variance. (For a full discussion of these results see Hierarchical
Linear Models, pp. 263-267.)
Results of General Linear Hypothesis Testing - Test 1
                              Coefficients    Contrast
For INTRCPT1, β0
   INTRCPT2, γ00                 12.095250      0.0000     0.0000
   SECTOR, γ01                    1.224401      1.0000     0.0000
   MEANSES, γ02                   5.336698      0.0000     0.0000
For SES slope, β1
   INTRCPT2, γ10                  2.935664      0.0000     0.0000
   SECTOR, γ11                   -1.642102      0.0000     1.0000
   MEANSES, γ12                   1.044120      0.0000     0.0000
Estimate                                        1.2244    -1.6421
Standard error of estimate                      0.3085     0.2372
χ2 statistic = 60.527852
Degrees of freedom = 2
p-value = <0.001
The table above is a reminder of the multivariate contrast specified. The chi-square statistic and
associated p-value indicate that it is highly unlikely that the observed estimates for γ 01 and γ 11 could
have occurred under the specified null hypothesis.
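The form of this test can be sketched as a Wald statistic: with contrast vectors c, the estimated contrasts are c'γ, and the chi-square statistic is the quadratic form of those contrasts in the inverse of their covariance matrix. The sketch below is an illustration under a loud assumption: the output prints only standard errors, so the covariance matrix is taken to be diagonal (all covariances between coefficients set to zero). For that reason it gives roughly 63.7 rather than the 60.527852 reported above. The function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def wald_chi2(gamma, cov, contrasts):
    """Return (contrast estimates, chi-square statistic, degrees of freedom)."""
    n = len(gamma)
    est = [sum(c[i] * gamma[i] for i in range(n)) for c in contrasts]
    # Covariance matrix of the contrasts: element (a, b) is c_a' V c_b.
    v = [[sum(ca[i] * cov[i][j] * cb[j] for i in range(n) for j in range(n))
          for cb in contrasts] for ca in contrasts]
    weighted = solve(v, est)
    chi2 = sum(e * w for e, w in zip(est, weighted))
    return est, chi2, len(contrasts)


# Coefficients and robust standard errors from the fixed-effects table.
gamma = [12.095250, 1.224401, 5.336698, 2.935664, -1.642102, 1.044120]
se = [0.173679, 0.308507, 0.334617, 0.147576, 0.237223, 0.332897]
# ASSUMPTION: zero covariances, because the output does not print them.
cov = [[se[i] ** 2 if i == j else 0.0 for j in range(6)] for i in range(6)]
contrasts = [[0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 0]]  # gamma01 and gamma11
est, chi2, df = wald_chi2(gamma, cov, contrasts)
print(est, round(chi2, 2), df)
```

Supplying the full variance-covariance matrix of the fixed effects (see the Print variance-covariance matrices output option) in place of the diagonal matrix would be needed to reproduce the printed statistic.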
2.9 Output options
There are a few options relating to the output that can be selected on the Other Settings, Output
Settings menu:
• # of OLS estimates shown (HLM2 only) – this controls the number of OLS estimates
printed in the output. See the output in Section 2.5.3.
• Print variance-covariance matrices – see Section A.5.
• Print reduced output – if this is checked, only the header page and the final results are
printed. Starting values, OLS estimates (if present), etc. will not be printed.
2.10 Models without a level-1 intercept
In some circumstances, users may wish to estimate models without a level-1 intercept. Consider, for
example, a hypothetical study in which three alternative treatments are implemented within each of
J hospitals. One might estimate the following level-1 (within-hospital) model:
An example of a no-intercept model appears on page 174 of Hierarchical Linear Models. The
vocabulary growth of young children is of interest. Both common sense and the data indicated that
children could be expected to have no vocabulary at 12 months of age. Hence, the level-1 model
contained no intercept:
Y_{ti} = \pi_{1i} (AGE_{ti} - 12) + \pi_{2i} (AGE_{ti} - 12)^2 + e_{ti}
where AGE_{ti} is the age of child i at time t in months and Y_{ti} is the size of that child's vocabulary at
that time.
Click INTRCPT1 on the >>Level-1<< drop-down list. Click delete variable from model.
2.11 Coefficients having a random effect with no corresponding fixed effect
A user may find it useful at times to model a level-1 predictor as having a random effect but no fixed
effect. For example, it might be that gender differences in educational achievement are, on average,
null across a set of schools; yet, in some schools females outperform males while in other schools
males outperform females. In this case, the fixed effect of gender could be set to zero while the
variance of the gender effect across schools would be estimated.
The vocabulary analysis in Hierarchical Linear Models supplies an example of a level-1 predictor
having a random effect without a corresponding fixed effect. For the age interval under study, it was
found that, on average, the linear effect of age was zero. Yet this effect varied significantly across
children. The level-1 model estimated was:
Y_{ti} = \pi_{1i} (AGE_{ti} - 12) + \pi_{2i} (AGE_{ti} - 12)^2 + e_{ti}
\pi_{1i} = r_{1i}
\pi_{2i} = \beta_{20} + r_{2i}
Notice that AGE – 12 has a random effect but no fixed effect.
The user may be interested in computing "t-to-enter statistics" for potential level-2 predictors to
guide specification of subsequent HLM2 models. The implementation procedure is as follows.
To implement exploratory analysis of potential level-2 predictors
1. Open the Other Settings menu and choose Exploratory Analysis (level 2). A Select
Variables For Exploratory Analysis dialog box appears.
2. Click the equation associated with a regression coefficient to model the corresponding
coefficient. Click to select variables for exploratory analysis. (Figure 2.43 displays the
level-2 predictors chosen for our HS&B example).
3. Click Return to Model Mode to return to the model window.
The following contains a selected HLM2 output to illustrate exploratory analysis of potential level-2
predictors.
Figure 2.43 Select Variables For Exploratory Analysis dialog box for the HS&B example
Exploratory Analysis: estimated level-2 coefficients and their standard errors obtained by
regressing EB residuals on level-2 predictors selected for possible inclusion in subsequent HLM
runs
INTRCPT1,β0
                   SIZE    PRACAD   DISCLIM   HIMINTY
Coefficient       0.000     0.690    -0.161    -0.543
Standard Error    0.000     0.404     0.106     0.229
t-value           1.569     1.707    -1.515    -2.372
SES,β1
                   SIZE    PRACAD   DISCLIM   HIMINTY
Coefficient       0.000     0.039    -0.005    -0.058
Standard Error    0.000     0.044     0.012     0.025
t-value           1.297     0.899    -0.425    -2.339
The results of this exploratory analysis suggest that HIMINTY might be a good candidate to include in
the INTRCPT1 model. The t-values represent the approximate result that will be obtained when one
additional predictor is added to any of the level-2 equations. This means that if HIMINTY is added to
the model for the INTRCPT1, for example, the apparent relationship suggested above for HIMINTY in
the SES slope model might disappear. (For a further discussion of the use of these statistics see
Hierarchical Linear Models, p. 270, on "Approximate t-to-Enter Statistics.")
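The idea behind a t-to-enter statistic can be sketched as a simple regression of the EB residuals for one coefficient on a single candidate level-2 predictor. The sketch below assumes one predictor at a time and an ordinary least squares fit with an intercept; HLM's exact computation may differ. The function name and the data are hypothetical.

```python
import math


def t_to_enter(eb_residuals, predictor):
    """Regress EB residuals on one candidate predictor.

    Returns (coefficient, standard error, t-value) for the slope.
    """
    n = len(eb_residuals)
    mean_x = sum(predictor) / n
    mean_y = sum(eb_residuals) / n
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(predictor, eb_residuals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - intercept - slope * x) ** 2
              for x, y in zip(predictor, eb_residuals))
    se = math.sqrt(sse / (n - 2) / sxx)
    return slope, se, slope / se


# Hypothetical EB residuals for six schools and a 0/1 candidate predictor.
residuals = [0.8, -0.3, 0.1, -0.9, 0.4, -0.2]
candidate = [0, 0, 0, 1, 0, 1]
print(t_to_enter(residuals, candidate))
```

As the text notes, such values are only approximate guides: once one predictor enters the model, the residuals change, and the t-values for the remaining candidates can change with them.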
3 Conceptual and Statistical Background for Three-Level Models
The models estimated by HLM3 are applicable to a hierarchical data structure with three levels of
random variation in which the errors of prediction at each level can be assumed to be approximately
normally distributed. Consider, for example, a study in which achievement test scores are collected
from a sample of children nested within classrooms that are in turn nested within schools. This data
structure is hierarchical (each child belongs to one and only one classroom and each classroom
belongs to one and only one school); and there are three levels of random variation: variation among
children within classrooms, variation among classrooms within schools, and variation among
schools. The outcome (achievement test scores) makes the normality assumption at level 1
reasonable, and the normality assumption at the classroom and school levels will often also be a
sensible one.
Chapter 8 of Hierarchical Linear Models discusses several applications of a three-level model. The
first is a three-level cross-sectional study as described above. A second case involves time-series
data collected on each subject where the subjects are nested within organizations. This latter
example is from the Sustaining Effects Study, where achievement data were collected at five time
points for each child. Here the time-series data are nested within children and the children are nested
within schools. A third example in Chapter 8 involves measures taken on each of the multiple
classes taught by secondary school teachers. The classes are nested within teachers and the teachers
within schools. A final example involves multiple items from a questionnaire administered to
teachers. The items vary "within teachers" at level 1, the teachers vary within schools at level 2, and
the schools vary at level 3. In effect, the level-1 model is a model for the measurement error
associated with the questionnaire. Clearly, there are many interesting applications of a three-level
model.
3.1 The general three-level model
The three-level model consists of three submodels, one for each level. For example, if the research
problem consists of data on students nested within classrooms and classrooms within schools, the
level-1 model will represent the relationships among the student-level variables, the level-2 model
will capture the influence of class-level factors, and the level-3 model will incorporate school-level
effects. Formally, there are i = 1, ..., n_{jk} level-1 units (e.g., students), which are nested within each of
j = 1, ..., J_k level-2 units (e.g., classrooms), which in turn are nested within each of k = 1, ..., K
level-3 units (e.g., schools).
3.1.1 Level-1 model
Y_{ijk} = \pi_{0jk} + \pi_{1jk} a_{1jk} + \pi_{2jk} a_{2jk} + \cdots + \pi_{Pjk} a_{Pjk} + e_{ijk}
        = \pi_{0jk} + \sum_{p=1}^{P} \pi_{pjk} a_{pjk} + e_{ijk}    (3.1)
where
π_{pjk} (p = 0, 1, ..., P) are level-1 coefficients,
a_{pjk} is a level-1 predictor p for case i in level-2 unit j and level-3 unit k,
3.1.2 Level-2 model
\pi_{pjk} = \beta_{p0k} + \beta_{p1k} X_{1jk} + \beta_{p2k} X_{2jk} + \cdots + \beta_{pQ_p k} X_{Q_p jk} + r_{pjk}
          = \beta_{p0k} + \sum_{q=1}^{Q_p} \beta_{pqk} X_{qjk} + r_{pjk},    (3.2)
where
We assume that, for each unit j, the vector (r_{0jk}, r_{1jk}, ..., r_{Pjk})′ is distributed as multivariate normal,
where each element has a mean of zero and the variance of r_{pjk} is:
Var(r_{pjk}) = \tau_{\pi pp}.    (3.3)
These level-2 variance and covariance components can be collected into a dispersion matrix, T_π,
whose maximum dimension is (P + 1) × (P + 1).
We note that each level-1 coefficient can be modeled at level 2 as one of three general forms:
1. a level-1 coefficient that is fixed at the same value for all level-2 units; e.g.,
\pi_{pjk} = \beta_{p0k},    (3.5)
2. a level-1 coefficient that varies non-randomly among level-2 units, e.g.,
\pi_{pjk} = \beta_{p0k} + \sum_{q=1}^{Q_p} \beta_{pqk} X_{qjk},    (3.6)
3. a level-1 coefficient that varies randomly among level-2 units, e.g.,
\pi_{pjk} = \beta_{p0k} + r_{pjk}    (3.7)
or
\pi_{pjk} = \beta_{p0k} + \sum_{q=1}^{Q_p} \beta_{pqk} X_{qjk} + r_{pjk}.    (3.8)
The actual dimension of Tπ in any application depends on the number of level-1 coefficients
specified as randomly varying. We also note that a different set of level-2 predictors may be used in
each of the P + 1 equations that form the level-2 model.
3.1.3 Level-3 model
Each of the level-2 coefficients, β_{pqk}, defined in the level-2 model becomes an outcome variable in
the level-3 model:
\beta_{pqk} = \gamma_{pq0} + \gamma_{pq1} W_{1k} + \gamma_{pq2} W_{2k} + \cdots + \gamma_{pqS_{pq}} W_{S_{pq} k} + u_{pqk}
            = \gamma_{pq0} + \sum_{s=1}^{S_{pq}} \gamma_{pqs} W_{sk} + u_{pqk},    (3.9)
where
γ_{pqs} (s = 0, 1, ..., S_{pq}) are level-3 coefficients,
We assume that, for each level-3 unit, the vector of level-3 random effects (the u_{pqk} terms) is
distributed as multivariate normal, with each having a mean of zero and with covariance matrix T_β,
whose maximum dimension is:
\sum_{p=0}^{P} (Q_p + 1) \times \sum_{p=0}^{P} (Q_p + 1).    (3.10)
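We note that each level-2 coefficient can be modeled at level 3 as one of three general forms:
1. a level-2 coefficient that is fixed at the same value for all level-3 units; e.g.,
\beta_{pqk} = \gamma_{pq0},    (3.11)
2. a level-2 coefficient that varies non-randomly among level-3 units, e.g.,
\beta_{pqk} = \gamma_{pq0} + \sum_{s=1}^{S_{pq}} \gamma_{pqs} W_{sk},    (3.12)
3. a level-2 coefficient that varies randomly among level-3 units, e.g.,
\beta_{pqk} = \gamma_{pq0} + u_{pqk}    (3.13)
or
\beta_{pqk} = \gamma_{pq0} + \sum_{s=1}^{S_{pq}} \gamma_{pqs} W_{sk} + u_{pqk}.    (3.14)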
The actual dimension of T_β in any application depends on the number of level-2 coefficients
specified as randomly varying. We also note that a different set of level-3 predictors may be used in
each equation of the level-3 model.
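To make the three submodels concrete, the sketch below simulates data from the simplest case: one level-1 predictor (P = 1) and no predictors at levels 2 and 3, so that (3.1), (3.7), and (3.13) apply. It is an illustration only. The function name is hypothetical, and the random effects are drawn independently, which ignores the covariances that T_π and T_β would normally carry.

```python
import random


def simulate_three_level(n_schools, n_children, occasions,
                         gamma000, gamma100,
                         sd_u0, sd_u1, sd_r0, sd_r1, sd_e, seed=0):
    """Simulate rows (school k, child j, predictor a, outcome Y)."""
    rng = random.Random(seed)
    rows = []
    for k in range(n_schools):
        # Level 3, eq. (3.13): school-specific intercept and slope.
        beta00 = gamma000 + rng.gauss(0.0, sd_u0)
        beta10 = gamma100 + rng.gauss(0.0, sd_u1)
        for j in range(n_children):
            # Level 2, eq. (3.7): child-specific intercept and slope.
            pi0 = beta00 + rng.gauss(0.0, sd_r0)
            pi1 = beta10 + rng.gauss(0.0, sd_r1)
            for a in occasions:
                # Level 1, eq. (3.1) with P = 1.
                y = pi0 + pi1 * a + rng.gauss(0.0, sd_e)
                rows.append((k, j, a, y))
    return rows


# Hypothetical parameter values chosen only for illustration.
rows = simulate_three_level(
    n_schools=3, n_children=4, occasions=[-2.5, -1.5, -0.5, 0.5, 1.5, 2.5],
    gamma000=-0.8, gamma100=0.75,
    sd_u0=0.4, sd_u1=0.1, sd_r0=0.8, sd_r1=0.1, sd_e=0.55)
print(len(rows), rows[0])
```

Adding level-2 or level-3 predictors would amount to replacing the two intercept-only lines with the corresponding sums in (3.8) and (3.14).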
3.1 Parameter estimation
Three kinds of parameter estimates are available in a three-level model: empirical Bayes estimates
of randomly varying level-1 and level-2 coefficients; maximum-likelihood estimates of the level-3
coefficients (note: these are also generalized least squares estimates); and maximum-likelihood
estimates of the variance-covariance components. The maximum-likelihood estimates of the level-3
coefficients and the variance-covariance components are printed on the output for every run. The
empirical Bayes estimates for the level-1 and level-2 coefficients may optionally be saved in the
"residual files" at levels 2 and 3, respectively. Reliability estimates for each random level-1 and
level-2 coefficient are always produced. The actual estimation procedure for the three-level model
differs a bit from the default two-level model. By default, HLM2 uses a "restricted maximum
likelihood" approach in which the variance-covariance components are estimated by means of
maximum likelihood and then the fixed effects (level-2 coefficients) are estimated via generalized
least squares given those variance-covariance estimates. In HLM3, not only the variance-covariance
components, but also the fixed effects (level-3 coefficients) are estimated by means of maximum
likelihood. This procedure is referred to as "full" as opposed to "restricted" maximum likelihood.
(For a further discussion of this see Hierarchical Linear Models, pp. 52-53.) Note that full maximum
likelihood is also available as an option for HLM2.
3.2 Hypothesis testing
As in the case of the two-level program, the three-level program routinely prints standard errors and
t-tests for each of the level-3 coefficients ("the fixed effects") as well as a chi-square test of
homogeneity for each random effect. In addition, optional "multivariate hypothesis tests" are
available in the three-level program. Multivariate tests for the level-3 coefficients enable both
omnibus tests and specific comparisons of the parameter estimates just as described in the section
Multivariate hypothesis tests for fixed effects in this chapter. Multivariate tests regarding alternative
variance-covariance structures at level 2 or level 3 proceed just as in the section Multivariate tests of
variance-covariance components specification in this chapter.
The use of full maximum likelihood for parameter estimation in HLM3 has a consequence for
hypothesis testing. For both restricted and full maximum likelihood, one can test alternative
variance-covariance structures by means of the likelihood-ratio test as described in the section
Multivariate tests of variance-covariance components specification. However, in the case of full
maximum likelihood, it is also possible to test alternative specifications of the fixed coefficients by
means of a likelihood-ratio test. In fact, any pair of nested models can be compared using the
likelihood-ratio test under full maximum likelihood. By nested models, we refer to a pair of models
in which the simpler model can be derived by imposing constraints on the parameters of the more
complex model. Any pair of nested two-level models can be compared using a likelihood ratio test.
4 Working with HLM3
As in the case of the two-level program, data analysis by means of the HLM3 program will typically
involve three stages:
• Construction of an MDM file (the multivariate data matrix)
• Execution of analyses based on the MDM file
• Evaluation of fitted models based on residual files
As in HLM2, HLM3 analyses can be executed in Windows, interactive, and batch modes. We describe
a Windows execution below. We consider interactive and batch execution in Appendix B. A number
of special options are presented at the end of the chapter.
Data input requires a level-1 file (in our illustration a time-series data file), a level-2 file (child-level
file), and a level-3 (school-level) file.
Level-1 file. The level-1 file, EG1.SAV, has 7242 observations collected on 1721 children
beginning at the end of grade one and followed up annually thereafter until grade six. There are four
level-1 variables (not including the schoolid and the childid). Time-series data for the first two
children are shown in Figure 4.1.
There are eight records listed, three for the first child and five for the second. (Typically there are
four or five observations per child with a maximum of six.) The first ID is the level-3 (i.e., school) ID
and the second ID is the level-2 (i.e., child) ID. We see that the first record comes from school 2020
and child 273026452 within that school. Notice that this child has three records, one for each of
three measurement occasions. Following the two ID fields are that child's values on four variables:
We see that the first child, child 273026452 in school 2020, had values of 0.5, 1.5, and 2.5 on year.
Clearly, that child had no data at the first three data collection waves (because we see no values of
−2.5, −1.5, or −0.5 on year), but did have data at the last three waves. We see also that this child
was not retained in grade during this period since the values for GRADE increase by 1 each year and
since RETAINED takes on a value of 0 for each year. The three MATH scores of that child (1.15, 1.13,
2.30) show no growth in time period 1.5. Oddly enough, the time-series record for the second child
(child 273030991 in school 2020) displays a similar pattern in the same testing.
Note: The level-1 and level-2 files must also be sorted in the same order of level-2 ID nested
within level-3 ID, e.g., children within schools. If this nested sorting is not performed, an
incorrect multivariate data matrix file will result.
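Before making the MDM file, the nested sort order can be checked, and imposed if necessary, outside HLM. The sketch below is one way to do this in Python, assuming the records have been read into dictionaries keyed by the ID names schoolid and childid used in this example. The function names are hypothetical, and the ID values for the second school are invented for illustration.

```python
def nested_key(record):
    """Sort key: level-3 ID first, then level-2 ID within it."""
    return (record["schoolid"], record["childid"])


def sort_nested(records):
    """Return records ordered by level-2 ID nested within level-3 ID."""
    # sorted() is stable, so multiple records for one child keep their order.
    return sorted(records, key=nested_key)


def is_nested_sorted(records):
    """True if the records are already in nested ID order."""
    keys = [nested_key(r) for r in records]
    return all(a <= b for a, b in zip(keys, keys[1:]))


# Hypothetical level-1 records that are out of order.
level1 = [
    {"schoolid": 2040, "childid": 300000001, "year": 0.5},
    {"schoolid": 2020, "childid": 273030991, "year": -1.5},
    {"schoolid": 2020, "childid": 273026452, "year": 0.5},
    {"schoolid": 2020, "childid": 273026452, "year": 1.5},
]
print(is_nested_sorted(level1))
level1 = sort_nested(level1)
print(is_nested_sorted(level1))
```

Applying the same key to the level-2 file (and sorting the level-3 file by schoolid) keeps all three files in a consistent order.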
Level-2 file. The level-2 units in the illustration are 1721 children. The data are stored in the file
EG2.SAV. The level-2 data for the first eight children are listed below. The first field is the schoolid
and the second is the childid. Note that each of the first ten children is in school 2020.
We see, for example, that child 273026452 is a Hispanic male (FEMALE = 0, BLACK = 0,
HISPANIC = 1).
Figure 4.2 First eight children in EG2.SAV
Level-3 file. The level-3 units in the illustration are 60 schools. Level-3 data for the first seven
schools are printed below. The full data are in the file EG3.SAV. The first field on the left is the
schoolid. There are three level-3 variables:
• SIZE, number of students enrolled in the school
• LOWINC, the percent of students from low income families
• MOBILE, the percent of students moving during the course of a single academic year
We see that the first school, school 2020, has 380 students, 40.3% of whom are low income. The
school mobility rate is 12.5%.
Figure 4.3 First seven schools in EG3.SAV
In sum, there are four variables at level 1, three at level 2, and three at level 3. Note that the ID
variables do not count as variables. Once the user has identified the two sets of IDs, the number of
variables in each file, the variable names, and the filenames, creation of the MDM file is exactly
analogous to the three major steps described in Section 2.5.1.1. The user first informs HLM that
the input files are SPSS system files and the MDM is a three-level file. Then HLM is supplied with the
appropriate information for the data. Note that the three files are linked by level-2 and level-3 IDs
here.
Figure 4.5 Choose Variables – HLM3 dialog box for level-1 file, EG1.SAV
Note: In addition, the program can handle missing data at level 1 only, with the same options
available as discussed in HLM2. HLM3 will listwise delete cases with missing data at levels two and
three. The three-level program handles design weights at all three levels.
The response file, EGSPSS.MDMT, contains a log of the input responses used to create the MDM file,
EG.MDM, using EG1.SAV, EG2.SAV, and EG3.SAV. Figure 4.4 displays the dialog box used to create
the MDM file. Figure 4.5 shows the dialog box for the level-1 file, EG1.SAV.
Note: As in the case of HLM2, after constructing the MDM file, you should check whether the data
have been properly read into HLM by examining the descriptive statistics of the MDM file.
Figure 4.6 Make MDM – HLM3 dialog box for EGASCII.MDM
Once the MDM file is constructed, it is used as input for the analysis. Model specification via the
Windows mode has five steps:
1. Specification of the level-1 model. In our case we shall model mathematics achievement (MATH)
as the outcome, to be predicted by YEAR in the study. Hence, the level-1 model will have two
coefficients for each child: the intercept and the YEAR slope.
2. Specification of the level-2 prediction model. Here each level-1 coefficient – the intercept and
the YEAR slope in our example – becomes an outcome variable. We may select certain child
characteristics to predict each of these level-1 coefficients. In principle, the level-2 parameters
then describe the distribution of growth curves within each school.
Figure 4.7 Model Window for the public school example
Following the five steps above, we first specify a model with no child- or school-level predictors.
The Windows execution is very similar to the one for HLM2 as described in Section 2.5.2. The
command file, EG1.HLM, contains the model specification input responses. To open the command
file, open the File menu and choose Edit/Run old command file. Figure 4.7 displays the model
specified in both standard and mixed model notation.
Here is the output produced by the model described above. The first page of the output gives the
specification of the model.
Problem Title: UNCONDITIONAL LINEAR GROWTH MODEL
The data source for this run = EG.MDM Name of the MDM file
The command file for this run = eg1.mlm Name of the command file
Output file name = hlm3.html Name of this output file
The maximum number of level-1 units = 7230 There are 7230 observations
The maximum number of level-2 units = 1721 There are 1721 children
The maximum number of level-3 units = 60 There are 60 schools
The maximum number of iterations = 100
Method of estimation: full maximum likelihood
Level-1 Model
Level-2 Model
π0jk = β00k + r0jk
π1jk = β10k + r1jk
Level-3 Model
Mixed Model
Standard Approx.
Fixed Effect Coefficient t-ratio p-value
error d.f.
For INTRCPT1, π0
Least-squares estimates of fixed effects (with robust standard errors)
Fixed Effect              Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
   For INTRCPT2, β00
      INTRCPT3, γ000        -0.827685         0.072631   -11.396           7228    <0.001
For YEAR slope, π1
   For INTRCPT2, β10
      INTRCPT3, γ100         0.765828         0.018892    40.537           7228    <0.001
For starting values, data from 7230 level-1 and 1721 level-2 records were used
Starting Values
σ2(0) = 0.29710
τπ(0)
INTRCPT1,π0    0.71125    0.05143
YEAR,π1        0.05143    0.01582
τβ(0)
INTRCPT1           YEAR
INTRCPT2,β00       INTRCPT2,β10
0.14930            0.01473
0.01473            0.01196
σ2 = 0.30148
Standard error of σ2 = 0.00660
τπ
INTRCPT1,π0 0.64049 0.04676
YEAR,π1 0.04676 0.01122
Standard errors of τπ
INTRCPT1,π0 0.02515 0.00499
YEAR,π1 0.00499 0.00196
τπ (as correlations)
INTRCPT1,π0 1.000 0.551
YEAR,π1 0.551 1.000
Note that the correlation between true status at YEAR = 3.5 (halfway through third grade)
and true rate of change is estimated to be 0.551 for children in the same school.
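The correlation form of a variance-covariance matrix is obtained by dividing each covariance by the product of the two corresponding standard deviations. The short sketch below applies this to the τπ estimates printed above; the function name is hypothetical. Because the inputs are the rounded values shown in the output, the result agrees with the printed 0.551 only up to rounding.

```python
import math


def cov_to_corr(cov):
    """Convert a covariance matrix (list of lists) to a correlation matrix."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
            for i in range(len(cov))]


# tau_pi as printed in the output: intercept and YEAR slope.
tau_pi = [[0.64049, 0.04676],
          [0.04676, 0.01122]]
corr = cov_to_corr(tau_pi)
print(round(corr[0][1], 3))
```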
Standard errors of τβ
INTRCPT1           YEAR
INTRCPT2,β00       INTRCPT2,β10
0.03641            0.00720
0.00720            0.00252
τβ (as correlations)
INTRCPT1/INTRCPT2,β00 1.000 0.399
YEAR/INTRCPT2,β10 0.399 1.000
Notice that the estimated correlation between true school mean status at YEAR = 3.5 and true
school-mean rate of change is 0.399.
Reliabilities of school-level parameter estimates. These indicate the reliability with which we can
discriminate among level-2 units using their least-squares estimates of β0 and β1. Low reliabilities
do not invalidate the HLM analysis. Very low reliabilities (e.g., < 0.10) often indicate that a random
coefficient might be considered fixed in subsequent analyses.
Final estimation of fixed effects:
Fixed Effect              Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
   For INTRCPT2, β00
      INTRCPT3, γ000        -0.779309         0.057829   -13.476             59    <0.001
For YEAR slope, π1
   For INTRCPT2, β10
      INTRCPT3, γ100         0.763029         0.015263    49.993             59    <0.001
The above table indicates that the average growth rate is significantly positive at 0.763 logits per
year, t = 49.993.
Note that the results with and without robust standard errors are nearly identical. If the robust and
model-based standard errors are substantially different, further investigation of the tenability of key
assumptions (see Section 4.3 on examining residuals) is recommended.
Final estimation of level-1 and level-2 variance components
Random Effect            Standard Deviation   Variance Component   d.f.   χ2            p-value
INTRCPT1,r0                         0.80030              0.64049   1661   13679.62589    <0.001
YEAR slope,r1                       0.10595              0.01122   1661    2132.50756    <0.001
level-1, e                          0.54907              0.30148
Random Effect            Standard Deviation   Variance Component   d.f.   χ2            p-value
INTRCPT1/INTRCPT2,u00               0.40658              0.16531     59     488.30922    <0.001
The results above indicate significant variability among schools in terms of mean status at
YEAR = 3.5 (χ2 = 488.30922, df = 59) and in terms of school-mean rates of change (χ2 = 377.40852,
df = 59).
Deviance = 16326.231407
Number of estimated parameters = 9
Exploratory Analysis: estimated level-2 coefficients and their standard errors obtained by
regressing EB residuals on level-2 predictors selected for possible inclusion in subsequent HLM
runs
Exploratory Analysis: estimated level-3 coefficients and their standard errors obtained by
regressing EB residuals on level-3 predictors selected for possible inclusion in subsequent HLM
runs
HLM3 produces three residual files, one each at levels 1 and 2 (see Chapter 2 for a discussion of
these files) and one at level 3 (containing estimates of the βs). These files will contain the EB
residuals defined at the various levels, fitted values, OLS residuals, and EB coefficients. In
addition, level-2 predictors can be included in the level-2 residual file and level-3 predictors in the
level-3 residual file. However, other statistics provided in the residual file of HLM2, for example the
Mahalanobis distance measures, are not available in the residual files produced by HLM3. The
procedures for requesting level-3 residual files are similar to those for HLM2 as described in Section
2.5.4.
The files in this example are structured as SPSS data files and can be directly opened in SPSS. As
with HLM2, the user can also specify STATA, SYSTAT or SAS command file format for the residual
file. The result will be STATA, SYSTAT or SAS data files. (For more details see Section 2.5.4.)
Alternatively, the data can be obtained in free form (i.e., as a text file) by selecting the Free Format
option on the Create Level-3 Residual File dialog box. These residual files can then be read into
any other computing package. The list of variables in the level-3 residual file and their attributes are
shown in Figure 4.8, while the first 10 records contained in this file are shown in Figure 4.10.
Figure 4.8 List of variables and attributes for level-3 residual file
An example of the level-2 residual file produced in the above analysis is shown in Figure 4.9. Only
data from school 2020 are given.
We see that the level-3 ID (l3id) is the first variable and the level-2 ID (l2id) is the second. The third
variable is njk, the number of observations associated with child j in school k. The empirical Bayes
estimates of the residuals, r_{pjk}, are given next, including, respectively, the intercept (ebintrcpt1) and
the year effect (ebyear). The ordinary least squares estimates of the same quantities (olintrcpt1 and
olyear) follow, and then the fitted values (fvintrcpt1 and fvyear), that is, the predicted values of the
π_{pjk}s for a given child based on the fixed effects and random school effect. These are followed by the EB
coefficients. Finally, the posterior variances and covariances (pv2_0_0, pv2_1_0, and pv2_1_1) of the
empirical Bayes estimates are given.
We see that the first child in the data set has schoolid 2020 and childid 273026452. That child has 3
time-series observations. The predicted growth rate for that child (the YEAR effect) is the fitted value
.953. That child's empirical Bayes residual YEAR effect is .004. Thus, the EB coefficient ("ebyear") is
computed as:
EB coefficient = fitted value + EB residual = .953 + .004 = .957.    (4.1)
The level-3 residual file, printed below, has a similar structure. Only the data for the first 10 schools
are given. We see that the level-3 ID (l3id) is the first value given, and is followed by nk, the number
of children in school k. This is followed by the empirical Bayes estimates of the βs, including,
respectively, the intercept (eb00) and the year effect (eb10). The ordinary least squares estimates of
the same quantities (ol00 and ol10) follow, and then the fitted values (fv0_0 and fv1_0), that is, the
predicted values of the βs for a given school based on that school's effect and the fixed effects. The EB
coefficients are given next. Finally, the posterior variances and covariances (pv3_0_0_0_0,
pv3_1_0_0_0, and pv3_1_0_1_0) of the estimates are given.
We see that the first unit, school 2020, has nk = 21 children. The predicted YEAR effect for school
2020 is the fitted value .763, that is, the maximum-likelihood estimate of the school mean growth
rate in the case of this unconditional model. That school's empirical Bayes residual YEAR effect is
.190. Thus HLM3 constructs the empirical Bayes estimate of that school's YEAR effect (mean rate of
growth, "ec1_0") as
\beta^*_{10k} = \gamma^*_{100} + u^*_{10k} = fv1_0 + eb10 = .763 + .190 = .953.    (4.2)
Similarly, HLM3 constructs the empirical Bayes estimate for the school's intercept, β*_{00k} ("ec0_0"),
using fv0_0 + eb00.
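The same arithmetic can be applied to every record once a residual file has been read into another package. The sketch below assumes each record is a dictionary holding a fitted value and an EB residual; the function name is hypothetical, and only the values quoted in the text are used.

```python
def eb_coefficient(fitted_value, eb_residual):
    """EB coefficient = fitted value + empirical Bayes residual."""
    return fitted_value + eb_residual


# School 2020, YEAR effect: fitted value and EB residual quoted in the text.
school_2020 = {"fv1_0": 0.763, "eb10": 0.190}
school_year = eb_coefficient(school_2020["fv1_0"], school_2020["eb10"])
print(round(school_year, 3))

# First child in school 2020, YEAR effect: fitted value .953, residual .004.
child_year = eb_coefficient(0.953, 0.004)
print(round(child_year, 3))
```

Note how the school-level EB coefficient (.953) reappears as the child-level fitted value, which is what one would expect in a model with no child-level predictors.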
Note that the empirical Bayes estimate of the school YEAR effect, 0.953, is the fitted value for each
child in that school (in the level-2 residual file). This will be true in any model that is unconditional
at level 2, that is, any model with no child-level predictors such as race, ethnicity or female. When
level-2 predictors are in the model, the level-2 fitted values will also depend on those predictors.
4.4 Specification of a conditional model
The above example involves a model that is "unconditional" at levels 2 and 3; that is, no predictors
are specified at either of those levels. Such a model is useful for partitioning variation in intercepts
and growth rates into components that lie within and between schools (see Hierarchical Linear
Models, Chapter 8), but provides no information on how child or school characteristics relate to the
growth curves. Figure 4.11 shows a model that incorporates information about a child's race and
ethnicity and a school's percent low income. Moreover, we explore the possibility that several other
predictors (gender, school enrollment, and percent mobility) might help account for variation in
subsequent models.
The maximum number of level-2 units = 1721
The maximum number of level-3 units = 60
The maximum number of iterations = 100
Method of estimation: full maximum likelihood
The outcome variable is MATH
Level-1 Model
MATHijk = π0jk + π1jk*(YEARijk) + eijk
Level-2 Model
Level-3 Model
β00k = γ000 + γ001(LOWINCk) + u00k
β01k = γ010
β02k = γ020
β10k = γ100 + γ101(LOWINCk) + u10k
β11k = γ110
β12k = γ120
Mixed Model
Fixed Effect    Coefficient    Standard error    t-ratio    Approx. d.f.    p-value
For INTRCPT1, π0
For INTRCPT2, β00
INTRCPT3, γ000 0.187343 0.040175 4.663 7222 <0.001
LOWINC, γ001 -0.008941 0.000568 -15.733 7222 <0.001
For BLACK, β01
INTRCPT3, γ010 -0.405550 0.041045 -9.881 7222 <0.001
For HISPANIC, β02
Least-squares estimates of fixed effects (with robust standard errors)
Fixed Effect    Coefficient    Standard error    t-ratio    Approx. d.f.    p-value
For INTRCPT1, π0
For INTRCPT2, β00
INTRCPT3, γ000 0.187343 0.106837 1.754 7222 0.080
LOWINC, γ001 -0.008941 0.001287 -6.948 7222 <0.001
For BLACK, β01
INTRCPT3, γ010 -0.405550 0.106437 -3.810 7222 <0.001
For HISPANIC, β02
INTRCPT3, γ020 -0.285918 0.089893 -3.181 7222 0.001
For YEAR slope, π1
For INTRCPT2, β10
INTRCPT3, γ100 0.906001 0.031606 28.665 7222 <0.001
LOWINC, γ101 -0.001768 0.000446 -3.968 7222 <0.001
For BLACK, β11
INTRCPT3, γ110 -0.015548 0.030859 -0.504 7222 0.614
For HISPANIC, β12
INTRCPT3, γ120 0.032732 0.037194 0.880 7222 0.379
Starting Values
σ2(0) = 0.29710
τπ(0)
INTRCPT1,π0 0.69259 0.04914
YEAR,π1 0.04914 0.01481
τβ(0)
INTRCPT1 YEAR
INTRCPT2,β00 INTRCPT2,β10
0.05922 0.00290
0.00290 0.01057
σ2 = 0.30162
Standard error of σ2 = 0.00660
τπ
INTRCPT1,π0 0.62231 0.04657
YEAR,π1 0.04657 0.01106
Standard errors of τπ
INTRCPT1,π0 0.02451 0.00491
YEAR,π1 0.00491 0.00196
τπ (as correlations)
INTRCPT1,π0 1.000 0.561
YEAR,π1 0.561 1.000
Standard errors of τβ
INTRCPT1 YEAR
INTRCPT2,β00 INTRCPT2,β10
0.01991 0.00441
0.00441 0.00194
τβ (as correlations)
INTRCPT1/INTRCPT2,β00 1.000 0.033
YEAR/INTRCPT2,β10 0.033 1.000
Random level-2 coefficient Reliability estimate
INTRCPT1/INTRCPT2,β00 0.702
YEAR/INTRCPT2,β10 0.735
Final estimation of fixed effects:
Fixed Effect    Coefficient    Standard error    t-ratio    Approx. d.f.    p-value
For INTRCPT1, π0
For INTRCPT2, β00
INTRCPT3, γ000 0.140628 0.127486 1.103 58 0.275
LOWINC, γ001 -0.007578 0.001691 -4.482 58 <0.001
For BLACK, β01
INTRCPT3, γ010 -0.502091 0.077879 -6.447 1597 <0.001
For HISPANIC, β02
INTRCPT3, γ020 -0.319381 0.086099 -3.709 1597 <0.001
For YEAR slope, π1
For INTRCPT2, β10
INTRCPT3, γ100 0.874501 0.039144 22.340 58 <0.001
LOWINC, γ101 -0.001369 0.000523 -2.619 58 0.011
For BLACK, β11
INTRCPT3, γ110 -0.030918 0.022453 -1.377 1597 0.169
For HISPANIC, β12
INTRCPT3, γ120 0.043085 0.024652 1.748 1597 0.081
Fixed Effect    Coefficient    Standard error    t-ratio    Approx. d.f.    p-value
For INTRCPT1, π0
For INTRCPT2, β00
INTRCPT3, γ000 0.140628 0.113814 1.236 58 0.222
LOWINC, γ001 -0.007578 0.001396 -5.428 58 <0.001
For BLACK, β01
INTRCPT3, γ010 -0.502091 0.076842 -6.534 1597 <0.001
For HISPANIC, β02
INTRCPT3, γ020 -0.319381 0.081918 -3.899 1597 <0.001
For YEAR slope, π1
For INTRCPT2, β10
INTRCPT3, γ100 0.874501 0.037287 23.453 58 <0.001
LOWINC, γ101 -0.001369 0.000499 -2.744 58 0.008
For BLACK, β11
INTRCPT3, γ110 -0.030918 0.022274 -1.388 1597 0.165
For HISPANIC, β12
INTRCPT3, γ120 0.043085 0.024368 1.768 1597 0.077
Random Effect    Standard Deviation    Variance Component    d.f.    χ2    p-value
Statistics for the current model
Deviance = 16239.207347
Number of estimated parameters = 15
Exploratory Analysis: estimated level-2 coefficients and their standard errors obtained by
regressing EB residuals on level-2 predictors selected for possible inclusion in subsequent HLM
runs
Exploratory Analysis: estimated level-3 coefficients and their standard errors obtained by
regressing EB residuals on level-3 predictors selected for possible inclusion in subsequent HLM
runs
data (particularly with large ratios of level-1 to level-2 data) may find the 1st derivative Fisher
useful, although this will make the standard errors of σ 2 and the τ matrices more crude. If the third
option, No accelerator, is selected, no Fisher iterations will be performed. This will
make large MDMs run faster, but will have the side effect of not producing standard errors of σ 2 and
the tau matrices. If you want to suppress any Fisher iterations, but do want to have the above
mentioned standard errors, choose 1st or 2nd derivative Fisher, and set the value in the Frequency of
5 Conceptual and Statistical Background for Four-Level Models
HLM4 handles models with data that have a four-level nested structure. A four-level hierarchy would
arise in the HLM3 illustrative example described in the last chapter, for example, if the students who
were repeatedly observed while attending a given school were also nested within classrooms. With
an additional clustering unit of classrooms, the achievement data would be triply nested: the
time-series data are nested within students, the students within classrooms, and the classrooms
within schools. In a different scenario, incorporating a measurement model for the
repeated measures on mathematics achievement in the same example would also call for
four-level analyses. Hough, Bryk, Pinnell, Kerbow, Fountas, and Scharer (2008), for example, used this
approach with four-level models to study the effect of school-based coaching on the growth in
teacher expertise in literacy practices. The level-1 model in their study was a measurement error
model associated with repeated measures on teacher expertise, the level-2 model studied the growth
trajectories of the "true scores" on the expertise, and the level-3 and level-4 models investigated the
associations of the growth trajectory parameters with teacher- and school-level correlates,
respectively. For examples of similar level-1 measurement error models (in three-level analyses),
see pp. 248-249 in Chapter 8 and Chapter 11 of Hierarchical Linear Models.
Formally, there are $i = 1, \ldots, n_{jkl}$ level-1 units (e.g., students), which are nested within each of
$j = 1, \ldots, J_{kl}$ level-2 units (e.g., classrooms), nested within each of $k = 1, \ldots, K_l$ level-3 units (e.g.,
schools), nested within each of $l = 1, \ldots, L$ level-4 units (e.g., school districts).

$$
\begin{aligned}
Y_{ijkl} &= \pi_{0jkl} + \pi_{1jkl} a_{1ijkl} + \pi_{2jkl} a_{2ijkl} + \cdots + \pi_{Pjkl} a_{Pijkl} + e_{ijkl} \\
&= \pi_{0jkl} + \sum_{p=1}^{P} \pi_{pjkl} a_{pijkl} + e_{ijkl},
\end{aligned}
\qquad (5.1)
$$
where
$$
\begin{aligned}
\pi_{pjkl} &= \beta_{p0kl} + \beta_{p1kl} X_{1jkl} + \beta_{p2kl} X_{2jkl} + \cdots + \beta_{pQ_pkl} X_{Q_pjkl} + r_{pjkl} \\
&= \beta_{p0kl} + \sum_{q=1}^{Q_p} \beta_{pqkl} X_{qjkl} + r_{pjkl},
\end{aligned}
\qquad (5.2)
$$
where
We assume that, for each level-2 unit, the vector of level-2 random effects (the $r_{pjkl}$ terms) is
distributed as multivariate normal, with each having a mean of zero and with covariance matrix $T_\pi$,
with a maximum dimension $(P + 1) \times (P + 1)$.
Each of the level-2 coefficients, $\beta_{pqkl}$, defined in the level-2 model, becomes an outcome variable in
the level-3 model:

$$\beta_{pqkl} = \gamma_{pq0l} + \sum_{s=1}^{S_{pq}} \gamma_{pqsl} W_{skl} + u_{pqkl}, \qquad (5.3)$$
where
$$\sum_{p=0}^{P} (Q_p + 1) \times \sum_{p=0}^{P} (Q_p + 1), \qquad (5.4)$$
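$$
\begin{aligned}
\gamma_{pqsl} &= \delta_{pqs0} + \delta_{pqs1} Z_{1l} + \delta_{pqs2} Z_{2l} + \cdots + \delta_{pqsG_{pqs}} Z_{G_{pqs}l} + \upsilon_{pqsl} \\
&= \delta_{pqs0} + \sum_{g=1}^{G_{pqs}} \delta_{pqsg} Z_{gl} + \upsilon_{pqsl},
\end{aligned}
\qquad (5.5)
$$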
where
$$\sum_{pq=0}^{PQ} (S_{pq} + 1) \times \sum_{pq=0}^{PQ} (S_{pq} + 1), \qquad (5.6)$$
As in the case of the three-level program, the four-level program routinely prints standard errors
and t-tests for each of the level-4 coefficients ("the fixed effects") as well as a chi-square test of
homogeneity for each random effect. In addition, optional "multivariate hypothesis tests" and
residual files are available in the four-level program.
6 Working with HLM4
Data analysis by means of the HLM4 program involves similar stages regarding MDM creation,
analyses, and fit evaluation as in the case of the two- and three-level programs. HLM4 analyses can
be executed in Windows, interactive, and batch modes. We describe a Windows execution below.
We consider interactive and batch execution in Appendix D.
The example illustrates the use of a level-1 model in HLM as a measurement model. In brief,

$$Y_{mtij} = \psi_{0tij} + \varepsilon_{mtij}, \qquad \varepsilon_{mtij} \sim N(0, \sigma^2_{mtij})$$
where
$Y_{mtij}$ is the observed measure on occasion t for teacher i in school j,
$\psi_{0tij}$ is the true or latent value for teacher expertise, and
$\varepsilon_{mtij}$ is the error of measurement associated with the observed rating m on occasion t for
teacher i in school j.
(Note that in this data set there is only one observed rating per occasion. As a result, the numbers of
level-1 and level-2 units are identical.)
In most applications, $\varepsilon_{mtij}$ is unknown and assumed normally distributed with constant variance. In
contrast, in this application the Rasch measurement model for the observed outcomes, $Y_{mtij}$, also
provides a standard error estimate for each observed measure, $s_{mtij}$. We explicitly represent this by
multiplying both sides of the level-1 model by the inverse of the standard error, $a_{mtij} = s_{mtij}^{-1}$, yielding

$$Y^*_{mtij} = a_{mtij} \psi_{0tij} + e^*_{mtij}, \qquad e^*_{mtij} \sim N(0, 1).$$
The variance at level-1 is now assumed known and fixed at a value of 1.0.
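
A small simulation may help show why the rescaled level-1 variance can be treated as known. The sketch below is an illustration only: the latent value, the range of standard errors, and the helper name `rescale_by_inverse_se` are all assumptions made for the example, not values or routines from the study or from HLM.

```python
import random
import statistics

def rescale_by_inverse_se(y, s):
    """Multiply an observed measure by a = 1/s; return the rescaled outcome and a."""
    a = 1.0 / s
    return a * y, a

random.seed(0)
true_score = 0.4                                # hypothetical latent value (psi)
rescaled_errors = []
for _ in range(20000):
    s = random.uniform(0.3, 1.2)                # hypothetical standard error of measurement
    y = true_score + random.gauss(0.0, s)       # observed measure with error SD equal to s
    y_star, a = rescale_by_inverse_se(y, s)
    rescaled_errors.append(y_star - a * true_score)   # e* = a * epsilon

rescaled_variance = statistics.pvariance(rescaled_errors)
print(round(rescaled_variance, 2))              # should be close to 1
```

Although the raw errors have different variances from one observation to the next, the rescaled errors share a variance of about 1, which is what allows sigma-squared to be fixed at 1.0.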
Data input requires a level-1 file (in our illustration a measurement data file), a level-2 file ("true
scores" file), a level-3 (teacher-level) file, and a level-4 (school-level) file.
Level-1 file. The level-1 file, MEASURE.SAV, has 1317 observations collected on 219 teachers on up
to 9 different occasions. Data for the first three teachers are shown in Fig. 6.1. Each of these teachers
was observed on three occasions. (Some teachers in the study were observed on as many as nine
occasions over three years.)
The first column contains the level-4 (i.e., school) ID, next is the level-3 (i.e., teacher) ID, and this is
followed by the level-2 (i.e., occasion) ID. We see that the first record comes from school 1100,
teacher 1100002, and occasion 11000026. Following the teacher ID fields are that teacher's values on
two variables:
• expertis
A composite Rasch measure of teachers' classroom literacy practice rated on some
particular occasion (weighted by the inverse of its standard error of measurement.)
• invstder
The inverse of the standard error of measurement associated with that individual rating
(the standard errors are generated as part of the Rasch rating scale model.)
Level-2 file. The level-2 units consisted of the 1317 occasions when measurements on classroom
literacy practice were made. The data are stored in the file OCCAS.SAV. The level-2 data for the first
nine records are listed below. The file has the same three IDs as the level-1 file. Two occasion-level
variables are included in the file:
• occasion
This variable identifies the specific data collection time point, counted up from the first
study occasion in the fall of year1 (a value of 0) through the end of the study in the
spring of year 3 (a value of 8).
• artifact
A dummy variable introduced into the analysis to adjust for a measurement artifact that
occurred with the first-year spring scores (at occasion = 2).
Figure 6.2 First nine cases in OCCAS.SAV
The first teacher in this data file, Teacher 1100002 in school 1100, was observed on three occasions
during the second year of the study (i.e. occasions 3 through 5). The same was true for the next two
teachers. In general, the data collection patterns vary among teachers in this study depending upon
their employment history at the school and when they first became eligible for classroom coaching.
Level-3 file. The level-3 units are the 219 teachers. The data are stored in the TCHR.SAV file. The
first field is the school ID and the second is the teacher ID. Note that each of the first ten teachers is in
school 1100. There are six variables in this file:
• coach
The average number of one-on-one coaching sessions per month that each teacher
received over the course of the study
• newtchr
A dummy variable indicating that the teacher had three or fewer years of classroom
teaching experience at onset of study participation
• pdpart
A composite measure of teachers' exposure to literacy professional development prior to
the onset of the study
• scmt
A scale score on the teacher's commitment to the school measured at study onset
• y2ent
A dummy variable indicating the teacher began work at the school during the second year
of the study
• y3ent
A dummy variable indicating the teacher began work at the school during the third year of
the study
Figure 6.3 First ten teachers in TCHR.SAV
Level-4 file. The school level data from 17 schools appear in SCH.SAV. The first field is the school
ID. This is followed by:
• chgcoach
A dummy variable indicating that a coaching change occurred during the course of the
study. This happened with only one school in the sample.
The response file, LITERACY.MDMT, contains a log of the input responses used to create the MDM
file, LITERACY.MDM, using MEASURE.SAV, OCCAS.SAV, TCHR.SAV, and SCH.SAV. Figure 6.5 shows
the dialog box used to create the MDM file. Note that the model notation selected is longitudinal with
measurement model data. Choosing this option affects the notation used for subscripts and model
parameters in the Windows interface and program output.
Figure 6.5 Make MDM – HLM4 dialog box for LITERACY.MDM
both their initial status and growth rate on the expertise measure over time. We also include a
fixed effect in the level-2 model for the measurement artifact that occurred at the third time
point, ARTIFACT.
3. The "true score" level-2 outcomes are specified as randomly varying between teachers.
4. Specification of the level-3 prediction model. In general, one may select different level-3
predictors for each level-3 equation. In the example below, we illustrate this with four of the
8. Finally, to specify the level-1 variance as fixed at a value of 1.0, per the measurement model
described above, open the Other Settings menu, select Estimation Settings, and enter 1.0 in the text
box for Fix Sigma^2 to specific value.
Figure 6.7 Model window for the conditional model for the literacy program example
6.3 An annotated example of HLM4 output
Level-1 Model
Level-2 Model
ψ1tij = π10ij + π11ij*(OCCASIONtij) + π12ij*(ARTIFACTtij) + e1tij
Level-3 Model
π10ij = β100j + β101j*(NEWTCHRij) + β102j*(PDPARTij) + β103j*(SCMTij) + r10ij
π11ij = β110j + β111j*(COACHij) + β112j*(NEWTCHRij) + β113j*(PDPARTij) + β114j*(SCMTij) + r11ij
π12ij = β120j
Level-4 Model
β100j = γ1000 + u100j
β101j = γ1010
β102j = γ1020
β103j = γ1030
β110j = γ1100 + u110j
β111j = γ1110
β112j = γ1120
β113j = γ1130
β114j = γ1140
β120j = γ1200 + u120j
COACH NEWTCHR PDPART SCMT have been centered around the level-4 mean.
For starting values, data from 1317 level-1, 1312 level-2, 214 level-3 and 17 level-4 records were used
σ2e
INVSTDER,ψ1 0.31788
τπ
INVSTDER INVSTDER
INTRCPT2,π10 OCCASION,π11
0.93753 0.01861
0.01861 0.00113
τπ (as correlations)
INVSTDER/INTRCPT2,π10 1.000 0.571
INVSTDER/OCCASION,π11 0.571 1.000
Note: The reliability estimates reported above are based on only 214 of 219 units that had sufficient data
for computation. Fixed effects and variance components are based on all the data.
Note, among teachers within schools, there is a positive correlation of 0.571 between their initial
status and expertise development.
τβ
INVSTDER INVSTDER INVSTDER
INTRCPT2 OCCASION ARTIFACT
INTRCPT3,β100 INTRCPT3,β110 INTRCPT3,β120
0.28840 -0.03214 0.16341
-0.03214 0.03798 -0.05972
0.16341 -0.05972 0.22678
τβ (as correlations)
INVSTDER/INTRCPT2/INTRCPT3,β100 1.000 -0.307 0.639
INVSTDER/OCCASION/INTRCPT3,β110 -0.307 1.000 -0.643
INVSTDER/ARTIFACT/INTRCPT3,β120 0.639 -0.643 1.000
In contrast, at the school level a negative correlation, -.307, exists between school mean initial status
on teachers' expertise and school-level growth rates.
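
The "as correlations" panels can be checked by hand from the printed τ matrices, since each correlation is a covariance divided by the product of the two standard deviations. The sketch below uses the rounded values shown in the output, so results may differ from the printed correlations in the last digit; `cov_to_corr` is a hypothetical helper name, not an HLM routine.

```python
import math

def cov_to_corr(tau):
    """Convert a covariance matrix (list of lists) into a correlation matrix."""
    sd = [math.sqrt(tau[i][i]) for i in range(len(tau))]
    return [[tau[i][j] / (sd[i] * sd[j]) for j in range(len(tau))]
            for i in range(len(tau))]

# Rounded tau matrices copied from the output above.
tau_pi = [[0.93753, 0.01861],
          [0.01861, 0.00113]]
tau_beta = [[0.28840, -0.03214, 0.16341],
            [-0.03214, 0.03798, -0.05972],
            [0.16341, -0.05972, 0.22678]]

corr_pi = cov_to_corr(tau_pi)
corr_beta = cov_to_corr(tau_beta)
print(round(corr_pi[0][1], 2), round(corr_beta[0][1], 2))
```

The off-diagonal entries reproduce, up to rounding, the teacher-level correlation of about 0.57 and the school-level correlation of about -0.31 discussed in the text.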
Random level-3 coefficient Reliability estimate
INVSTDER/INTRCPT2/INTRCPT3 0.727
INVSTDER/OCCASION/INTRCPT3 0.965
INVSTDER/ARTIFACT/INTRCPT3 0.747
The value of the log-likelihood function at iteration 61 = -3.447675E+003
Fixed Effect    Coefficient    Standard error    t-ratio    Approx. d.f.    p-value
For INVSTDER, ψ1
For INTRCPT2, π1 0
For INTRCPT3, β1 0 0
INTRCPT4, γ1 0 0 0 -0.042320 0.152308 -0.278 32 0.783
For NEWTCHR, β1 0 1
INTRCPT4, γ1 0 1 0 -0.520219 0.226444 -2.297 178 0.022
For PDPART, β1 0 2
INTRCPT4, γ1 0 2 0 0.167179 0.092189 1.813 178 0.069
For SCMT, β1 0 3
INTRCPT4, γ1 0 3 0 0.137797 0.085591 1.610 178 0.107
For OCCASION, π1 1
For INTRCPT3, β1 1 0
INTRCPT4, γ1 1 0 0 0.208296 0.048144 4.327 32 <0.001
For COACH, β1 1 1
INTRCPT4, γ1 1 1 0 0.261937 0.078204 3.349 178 0.001
For NEWTCHR, β1 1 2
New teachers scored considerably lower on initial status than more experienced teachers (γ1010 =
-0.520, t = -2.297, p-value = 0.022). As hypothesized by the study, both prior professional
development experience, PDPART, and commitment to school improvement, SCMT, were positively
related to differences among schools in initial expertise ratings (p-values of 0.069 and 0.107,
respectively).
In terms of teachers' growth in expertise over the course of the study, OCCASION, the study
hypothesized that this would be related to differential exposure to coaching, COACH.
A highly significant relationship was found (γ1110 = 0.262, with an associated t-value of 3.349 and a
p-value of 0.001). A significant measurement artifact also occurred; see the results for γ1200.
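
The t-ratios quoted above are simply each coefficient divided by its standard error. The following sketch recomputes them from the rounded values in the fixed-effects table; the dictionary labels and the helper name `t_ratio` are made up for the illustration, and p-values are not computed here because they depend on the approximate degrees of freedom reported by the program.

```python
def t_ratio(coefficient, standard_error):
    """t-ratio for a fixed effect: the estimate divided by its standard error."""
    return coefficient / standard_error

# (coefficient, standard error) pairs copied from the output above.
fixed_effects = {
    "NEWTCHR on initial status": (-0.520219, 0.226444),
    "COACH on growth rate": (0.261937, 0.078204),
}

t_values = {name: t_ratio(coef, se) for name, (coef, se) in fixed_effects.items()}
for name, t in t_values.items():
    print(name, round(t, 3))
```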
Final estimation of level-1 and level-2 variance components
Random Effect    Standard Deviation    Variance Component    d.f.    χ2    p-value
INVSTDER, e1 0.56381 0.31788 1078 4729.76970 <0.001
Note: The chi-square statistics reported above are based on only 1312 of 1317 units that had sufficient
data for computation. Fixed effects and variance components are based on all the data.
Note: The chi-square statistics reported above are based on only 214 of 219 units that had sufficient data
for computation. Fixed effects and variance components are based on all the data.
The variation among teachers within schools in expertise ratings at the study onset, var(r10), is
0.937 and the variation within schools in teachers' rate of growth in expertise, var(r11), is 0.001.
Both variance components are statistically significant.
We see evidence of considerable variability among schools in teachers' initial expertise ratings, u100
(χ2 = 65.906, p-value < 0.001). Significant variation was also found in school growth rates, u110,
and in the magnitude of the measurement artifact at each school, u120.
Statistics for the current model
Deviance = 6895.349602
Number of estimated parameters = 20
7 Conceptual and Statistical Background for Hierarchical Generalized Linear Models (HGLM)
The hierarchical linear model (HLM) as described in the previous six chapters is appropriate for two-
and three-level data where the random effects at each level are normally distributed. The assumption
of normality at level-1 is quite widely applicable when the outcome variable is continuous. Even
when a continuous outcome is highly skewed, a transformation can often be found that will make the
distribution of level-1 random effects (residuals) at least roughly normal. Methods for assessing the
normality of random effects at higher levels are discussed on page 38 and on page 274 of
Hierarchical Linear Models.
There are important cases, however, where the assumption of normality at level-1 is clearly not
realistic and no transformation can make it so. Examples of a binary outcome, Y, are: the presence of
a disease (Y = 1 if the disease is present, Y = 0 if the disease is absent), graduation from high school
(Y = 1 if a student graduates on time, Y = 0 if not), or the commission of a crime (Y = 1 if a person
commits a crime during a given time interval, Y = 0 if not). The use of the standard level-1 model in
this case would be inappropriate for three reasons:
• Given the predicted value of the outcome, the level-1 random effect can take on only one of
two values, and therefore cannot be normally distributed.
• The level-1 random effect cannot have homogeneous variance. Instead, the variance of this
random effect depends on the predicted value as specified below.
• Finally, there are no restrictions on the predicted values of the level-1 outcome in the
standard model: they can legitimately take on any real value. In contrast, the predicted value
of a binary outcome Y, if viewed as the predicted probability that Y = 1, cannot meaningfully
be less than zero or greater than unity. Thus, an appropriate model for predicting Y ought to
constrain the predicted values to lie in the interval (0, 1). Without this constraint the effect
sizes estimated by the model are, in general, uninterpretable.
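
The first two points can be made concrete with a short calculation. For a binary outcome with predicted probability φ, the level-1 residual Y − φ can only equal 1 − φ or −φ, and its variance works out to φ(1 − φ). The sketch below is an illustration under that standard Bernoulli setup; the function names are hypothetical.

```python
def bernoulli_residuals(phi):
    """Return the two possible residuals Y - phi with their probabilities."""
    return [(1.0 - phi, phi),        # Y = 1 occurs with probability phi
            (-phi, 1.0 - phi)]       # Y = 0 occurs with probability 1 - phi

def residual_variance(phi):
    """Variance of the residual; its mean is zero, so this is E[(Y - phi)^2]."""
    return sum(prob * resid ** 2 for resid, prob in bernoulli_residuals(phi))

for phi in (0.1, 0.5, 0.9):
    print(phi, bernoulli_residuals(phi), round(residual_variance(phi), 4))
```

The variance changes with the predicted value and is largest at φ = 0.5, so it cannot be homogeneous across cases with different predicted values.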
Another example involves count data, where Y is the number of crimes a person commits during a
year or Y is the number of questions a child asks during the course of a one-hour class period. In
these cases, the possible values of Y are non-negative integers 0, 1, 2, .... Such data will typically be
positively skewed. If there are very few zeros in the data, a transformation, e.g., Y * = log(1 + Y ) ,
may solve this problem and allow sensible use of the standard HLM. However, in the cases
mentioned above, there will typically be many zeros (many persons will not commit a crime during
a given year and many children will not raise a question during a one-hour class). When there are
many zeros, the normality assumption cannot be approximated by a transformation. Also, as in the
case of the binary outcome, the variance of the level-1 random effects will depend on the predicted
value (higher predicted values will have larger variance). Similarly, the predicted values ought to be
constrained to be positive.
Another example involves multi-category (≥ 2) data, where the outcome consists of responses
tapping teachers' commitment to their career choice. Teachers are asked if they would choose the
teaching profession if they could go back to college and start over again. The three response
categories are:
The level-1 model in the HGLM may be viewed as consisting of three parts: a sampling model, a link
function, and a structural model. In fact, the standard HLM can be viewed as a special case of the
HGLM where the sampling model is normal and the link function is the identity link.
7.1.1 Level-1 sampling model
The sampling model for a two-level HLM might be written as
$$Y_{ij} \mid \mu_{ij} \sim \mathrm{NID}(\mu_{ij}, \sigma^2) \qquad (7.1)$$
meaning that the level-one outcome Y i j , given the predicted value, μ i j , is normally and
independently distributed with an expected value of μ i j and a constant variance, σ 2 . The level-1
expected value and variance may alternatively be written as
In general it is possible to transform the level-1 predicted value, μ i j , to η i j to ensure that the
predictions are constrained to lie within a given interval. Such a transformation is called a link
function. In the normal case, no transformation is necessary. However, this decision not to transform
may be made explicit by writing
$$\eta_{ij} = \mu_{ij}. \qquad (7.3)$$
The link function in this case is viewed as the "identity link function."
$$\eta_{ij} = \beta_{0j} + \beta_{1j} X_{1ij} + \beta_{2j} X_{2ij} + \cdots + \beta_{Qj} X_{Qij}. \qquad (7.4)$$
It is clear that combining the level-1 sampling model (7.1), the level-1 link function (7.3), and the
level-1 structural model (7.4) reproduces the level-1 model of HLM (1.1). In the context of a standard
HLM, it seems silly to write three equations where only one is needed, but the value of the extra
equations becomes apparent in the case of binary, count, and multi-categorical data.
7.2 Two-, three-, and four-level models for binary outcomes
While the standard HLM uses a normal sampling model and an identity link function, the binary
outcome model uses a binomial sampling model and a logit link. Only the level-1 models differ from
the linear case.
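$$Y_{ij} \mid \phi_{ij} \sim B(m_{ij}, \phi_{ij}), \qquad (7.5)$$

to denote that $Y_{ij}$ has a binomial distribution with $m_{ij}$ trials and probability of success $\phi_{ij}$.
According to the binomial distribution, the expected value and variance of $Y_{ij}$ are then

$$E(Y_{ij} \mid \phi_{ij}) = m_{ij}\phi_{ij}, \qquad \mathrm{Var}(Y_{ij} \mid \phi_{ij}) = m_{ij}\phi_{ij}(1 - \phi_{ij}). \qquad (7.6)$$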
When mi j = 1, Yi j may take on values of either zero or unity. This is a special case of the binomial
distribution known as the Bernoulli distribution. HGLM allows estimation of models in which mi j = 1
(Bernoulli case) or mi j > 1 (other binomial cases). The case with mi j >1 will be treated later.
For the Bernoulli case, the predicted value of the binary $Y_{ij}$ is equal to the probability of a success, $\phi_{ij}$.
7.2.2 Level-1 link function
When the level-1 sampling model is binomial, HGLM uses the logit link function

$$\eta_{ij} = \log\left(\frac{\phi_{ij}}{1 - \phi_{ij}}\right). \qquad (7.7)$$
In words, ηi j is the log of the odds of success. Thus if the probability of success, φi j , is 0.5, the odds
of success is 1.0 and the log-odds or "logit" is zero. When the probability of success is less than 0.5,
the odds are less than one and the logit is negative; when the probability is greater than 0.5, the odds
are greater than unity and the logit is positive. Thus, while φi j is constrained to be in the interval
(0,1) , ηi j can take on any real value.
7.2.3 Level-1 structural model
This will have exactly the same form as (7.4). Note that estimates of the βs in (7.4) make it possible
to generate a predicted log-odds ($\eta_{ij}$) for any case. Such a predicted log-odds can be converted to an
odds by computing odds = exp($\eta_{ij}$). Similarly, predicted log-odds can be converted to a
predicted probability by computing

$$\phi_{ij} = \frac{1}{1 + \exp(-\eta_{ij})}. \qquad (7.8)$$
Clearly, whatever the value of ηi j , applying (7.8) will produce a φi j between zero and unity.
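
The logit link (7.7) and its inverse (7.8) are easy to try out numerically. The sketch below is a plain illustration of those two formulas; the function names `logit`, `odds`, and `inverse_logit` are chosen for the example and are not HLM commands.

```python
import math

def logit(phi):
    """Log-odds of success for a probability strictly between 0 and 1 (7.7)."""
    return math.log(phi / (1.0 - phi))

def odds(eta):
    """Convert a predicted log-odds into an odds."""
    return math.exp(eta)

def inverse_logit(eta):
    """Convert a predicted log-odds into a predicted probability (7.8)."""
    return 1.0 / (1.0 + math.exp(-eta))

for eta in (-2.0, 0.0, 2.0):
    print(eta, round(odds(eta), 3), round(inverse_logit(eta), 3))
```

A log-odds of zero corresponds to a probability of 0.5, negative values map below 0.5, and positive values map above it, always staying inside the interval (0, 1).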
For count data, we use a Poisson sampling model and a log link function.
Let Yi j be the number of events occurring during an interval of time having length mi j . For example,
Yi j could be the number of crimes a person i from group j commits during five years, so that mi j =
5. The time-interval of mi j units may be termed the "exposure." Then we write that
$$Y_{ij} \mid \lambda_{ij} \sim P(m_{ij}, \lambda_{ij}) \qquad (7.9)$$
to denote that Yi j has a Poisson distribution with exposure mi j and event rate λi j . According to the
Poisson distribution, the expected value and variance of Yi j are then
$$E(Y_{ij} \mid \lambda_{ij}) = m_{ij}\lambda_{ij}, \qquad \mathrm{Var}(Y_{ij} \mid \lambda_{ij}) = m_{ij}\lambda_{ij}. \qquad (7.10)$$
The exposure $m_{ij}$ need not be a measure of time. For example, if $Y_{ij}$ is the number of bombs
dropping on neighborhood i of city j during a war, $m_{ij}$ could be the area of that neighborhood. A
common case arises when, for each i and j, the exposure is the same (e.g., $Y_{ij}$ is the number of
crimes committed during one year for each person i within each neighborhood j). In this case, we set
$m_{ij} = 1$ for simplicity. HGLM allows estimation of models in which $m_{ij} = 1$ or $m_{ij} > 1$. (The case
with $m_{ij} > 1$ will be treated later.)
According to our level-1 model, the predicted value of Yi j when mi j = 1 will be the event rate λi j .
In words, ηi j is the log of the event rate. Thus, if the event rate, λi j , is one, the log is zero. When the
event rate is less than one, the log is negative; when the event rate is greater than one, the log is
positive. Thus, while λi j is constrained to be non-negative, ηi j can take on any real value.
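
Because the link is the natural log, exponentiating a predicted value recovers the event rate, and multiplying by the exposure gives the expected count in (7.10). The sketch below illustrates this; the function names and the example values are assumptions for the illustration only.

```python
import math

def event_rate(eta):
    """Event rate implied by a log-event rate: the inverse of the log link."""
    return math.exp(eta)

def expected_count(eta, exposure=1.0):
    """Expected count m * lambda, as in (7.10), for a given exposure m."""
    return exposure * event_rate(eta)

# A log-event rate of zero corresponds to one event per unit of exposure.
print(event_rate(0.0))
# Hypothetical case: two events per year on average, observed over five years.
print(round(expected_count(math.log(2.0), exposure=5.0), 6))
```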
This will have exactly the same form as (7.4). Note that estimates of the βs in (7.4) make it possible
to generate a predicted log-event rate ($\eta_{ij}$) for any case. Such a predicted log-event rate can be
converted to an event rate by computing

$$\lambda_{ij} = \exp(\eta_{ij}).$$
7.3.4 Level-2 model
The level-2 model has the same form as the level-2 model for HLM2 (equations 1.2, 1.3, and 1.4),
and the level-2 and level-3 models have the same form in the three- and four-level case as in HLM3
and HLM4, respectively.
$$\mathrm{Prob}(R_{ij} = m) = \phi_{mij},$$

that is, the probability that person i in group j lands in category m is $\phi_{mij}$, for categories m = 1, ..., M,
there being M possible categories.
For example, $R_{ij} = 1$ if high school student i in school j goes on to college; $R_{ij} = 2$ if that student
goes on to a job; $R_{ij} = 3$ if that student becomes unemployed. Here M = 3. The analysis is facilitated
by constructing dummy variables $Y_1, Y_2, \ldots, Y_M$, where $Y_{mij} = 1$ if $R_{ij} = m$, 0 otherwise. For example,
if student ij goes to college, $R_{ij} = 1$, so $Y_{1ij} = 1$, $Y_{2ij} = 0$, $Y_{3ij} = 0$; if student ij goes to work, $R_{ij} = 2$,
so $Y_{1ij} = 0$, $Y_{2ij} = 1$, $Y_{3ij} = 0$; if that student becomes unemployed, $R_{ij} = 3$, so $Y_{1ij} = 0$, $Y_{2ij} = 0$,
$Y_{3ij} = 1$. This leads to a definition of the probabilities as $\mathrm{Prob}(Y_{mij} = 1) = \phi_{mij}$. For example, for M = 3,
According to the multinomial distribution, the expected value and variance of $Y_{mij}$, given $\phi_{mij}$, are then
The covariance between outcomes Ymi j and Ym′i j is
$$\eta_{mij} = \log\left(\frac{\phi_{mij}}{\phi_{Mij}}\right) \qquad (7.15)$$

where

$$\phi_{Mij} = 1 - \sum_{m=1}^{M-1} \phi_{mij}. \qquad (7.16)$$
In words, η mi j is the log odds of being in m-th category relative to the M-th category, which is
known as the "reference category."
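
Given the M − 1 log-odds relative to the reference category, the category probabilities can be recovered by inverting (7.15) subject to (7.16). The sketch below shows one way to do this; the function name and the example log-odds are hypothetical and chosen only for illustration.

```python
import math

def multinomial_probabilities(etas):
    """Probabilities for categories 1..M from the M-1 log-odds in etas.

    etas[m-1] is the log-odds of category m relative to reference category M.
    """
    denominator = 1.0 + sum(math.exp(eta) for eta in etas)
    probabilities = [math.exp(eta) / denominator for eta in etas]
    probabilities.append(1.0 / denominator)   # reference category M
    return probabilities

# Hypothetical log-odds for college and job, each relative to unemployment.
probs = multinomial_probabilities([1.2, 0.4])
print([round(p, 3) for p in probs], round(sum(probs), 6))
```

The probabilities sum to one by construction, and taking log(probs[m] / probs[-1]) returns the original log-odds for each non-reference category.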
$$\beta_{qj(m)} = \gamma_{q0(m)} + \sum_{s=1}^{S_q} \gamma_{qs(m)} W_{sj} + u_{qj(m)}. \qquad (7.18)$$
7.5 The model for ordinal data
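The probabilities $\mathrm{Prob}(Y_{mij} = 1)$ are thus cumulative probabilities. For example, with M = 3,

$$
\begin{aligned}
\mathrm{Prob}(Y_{1ij} = 1) &= \mathrm{Prob}(R_{ij} = 1) = \phi_{1ij} \\
\mathrm{Prob}(Y_{2ij} = 1) &= \mathrm{Prob}(R_{ij} = 1) + \mathrm{Prob}(R_{ij} = 2) = \phi_{2ij} \\
\mathrm{Prob}(Y_{3ij} = 1) &= \mathrm{Prob}(R_{ij} = 1) + \mathrm{Prob}(R_{ij} = 2) + \mathrm{Prob}(R_{ij} = 3) = 1
\end{aligned}
\qquad (7.21)
$$

$$\eta_{mij} = \log\left(\frac{\mathrm{Prob}(R_{ij} \le m)}{\mathrm{Prob}(R_{ij} > m)}\right) = \log\left(\frac{\phi_{mij}}{1 - \phi_{mij}}\right). \qquad (7.22)$$

$$\eta_{mij} = \beta_{0j} + \sum_{q=1}^{Q} \beta_{qj} X_{qij} + \sum_{m=2}^{M} \delta_m. \qquad (7.23)$$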
Under the proportional odds assumption, the relative odds that R i j ≤ m , associated with a unit
increase in the predictor, does not depend on m.
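$$
\begin{aligned}
\eta_{1ij} &= \beta_{0j} + \sum_{q=1}^{Q} \beta_{qj} X_{qij} \\
\eta_{2ij} &= \beta_{0j} + \sum_{q=1}^{Q} \beta_{qj} X_{qij} + \delta_2 \\
\eta_{3ij} &= \beta_{0j} + \sum_{q=1}^{Q} \beta_{qj} X_{qij} + \delta_2 + \delta_3
\end{aligned}
\qquad (7.24)
$$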
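
Following the pattern of (7.24), each successive cumulative logit adds one more threshold difference δ to the same linear predictor. The sketch below turns such cumulative logits into cumulative and then category probabilities, using the cumulative logit definition in (7.22). It assumes the δ values are positive so that the cumulative probabilities increase; the function names and numbers are hypothetical.

```python
import math

def cumulative_probabilities(linear_predictor, deltas):
    """Cumulative probabilities Prob(R <= m) for m = 1..M.

    linear_predictor plays the role of beta_0j + sum_q beta_qj * X_qij;
    deltas holds the positive threshold differences delta_2, delta_3, ...
    """
    etas = [linear_predictor]
    for delta in deltas:
        etas.append(etas[-1] + delta)             # eta_m = eta_{m-1} + delta_m
    cumulative = [1.0 / (1.0 + math.exp(-eta)) for eta in etas]
    cumulative.append(1.0)                        # the last category closes at 1
    return cumulative

def category_probabilities(linear_predictor, deltas):
    """Probability of each category, as differences of cumulative probabilities."""
    cumulative = cumulative_probabilities(linear_predictor, deltas)
    return [cumulative[0]] + [cumulative[m] - cumulative[m - 1]
                              for m in range(1, len(cumulative))]

ordinal_probs = category_probabilities(0.0, [1.0])   # three categories
print([round(p, 3) for p in ordinal_probs])
```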
7.6 Parameter estimation
HLM2 and HLM3 use three approaches to estimation for HGLM. The first method bases inference on
the joint posterior modes of the level-1 and level-2 (and level-3) regression coefficients given the
variance-covariance estimates. The variance-covariance estimates are based on a normal
approximation to the restricted likelihood. Stiratelli, Laird, & Ware (1984) and Wong & Mason
(1985) developed this approach for the binary case. Schall (1991) discusses the extension of this
approach to the wider class of generalized linear models. Breslow & Clayton (1993) refer to this
estimation approach as "penalized quasi-likelihood" or PQL. Extending HLM to HGLM requires a
doubly iterative algorithm, significantly increasing computational time. Related approaches are
described by Goldstein (1991), Longford (1993), and Hedeker & Gibbons (1994).
The second and third methods of estimation ("Laplace" and "adaptive Gaussian quadrature") involve
somewhat more computationally intensive algorithms but provide accurate approximations to
maximum likelihood (ML). These two approaches are currently available for two-level and
three-level Bernoulli models and for two-level Poisson models with $m_{ij} = 1$. We consider PQL below in
some detail, followed by a brief discussion of Laplace and adaptive Gaussian quadrature.
of a standard HLM model with the introduction of special weighting at level-1. However, after this
standard HLM analysis has converged, the linearized dependent variable and the weights must be
recomputed. Then, the standard HLM analysis is re-computed. This iterative process of analyses and
recomputing weights and linearized dependent variable continues until estimates converge.
We term the standard HLM iterations "micro-iterations." The recomputation of the linearized
dependent variable and the weights constitutes a "macro iteration." The approach is outlined below
for four cases: Bernoulli (binomial with mi j = 1 ), Poisson with mi j = 1 , binomial with mi j > 1 , and
Poisson with mi j > 1 .
7.6.1.1 Bernoulli (binomial with mi j = 1 )
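\[
Y_{ij} = \phi_{ij} + \varepsilon_{ij} \tag{7.25}
\]
\[
\phi_{ij} \approx \phi_{ij}^{(0)} + \left.\frac{\partial \phi_{ij}}{\partial \eta_{ij}}\right|_{\eta_{ij} = \eta_{ij}^{(0)}} \left(\eta_{ij} - \eta_{ij}^{(0)}\right) \tag{7.27}
\]
\[
\eta_{ij}^{(0)} = \log\left(\frac{\phi_{ij}^{(0)}}{1 - \phi_{ij}^{(0)}}\right), \tag{7.28}
\]
where $\phi_{ij}^{(0)}$ is an initial estimate and
\[
\frac{\partial \phi_{ij}}{\partial \eta_{ij}} = w_{ij} = \phi_{ij}(1 - \phi_{ij}) . \tag{7.29}
\]
If we evaluate $w_{ij}$ at its initial estimate,
\[
w_{ij}^{(0)} = \phi_{ij}^{(0)}\left(1 - \phi_{ij}^{(0)}\right), \tag{7.30}
\]
and substitute the approximation (7.27) into (7.25), we obtain
\[
Y_{ij} = \phi_{ij}^{(0)} + w_{ij}^{(0)}\left(\eta_{ij} - \eta_{ij}^{(0)}\right) + \varepsilon_{ij} . \tag{7.31}
\]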
Algebraically rearranging the equation so that all observables are on the left-hand side yields
\[
\begin{aligned}
Z_{ij}^{(0)} &= \eta_{ij} + \frac{\varepsilon_{ij}}{w_{ij}^{(0)}} \\
&= \beta_{0j} + \beta_{1j} X_{1ij} + \beta_{2j} X_{2ij} + \cdots + \beta_{Qj} X_{Qij} + e_{ij},
\end{aligned}
\tag{7.32}
\]
where
\[
Z_{ij}^{(0)} = \frac{Y_{ij} - \phi_{ij}^{(0)}}{w_{ij}^{(0)}} + \eta_{ij}^{(0)} \tag{7.33}
\]
is the linearized dependent variable and
\[
\text{Var}(e_{ij}) = \text{Var}\left(\frac{\varepsilon_{ij}}{w_{ij}^{(0)}}\right) \approx \frac{1}{w_{ij}^{(0)}} . \tag{7.34}
\]
Thus, (7.32) is a standard HLM level-1 model with outcome $Z_{ij}^{(0)}$ and level-1 weighting variable $w_{ij}^{(0)}$.
The algorithm works as follows.
1. Given initial estimates of the predicted value, φi j , and therefore of the linearized
dependent variable, Z i j , and the weight, wi j , compute a weighted HLM analysis with
(7.32) as the level-1 model.
2. The HLM analysis from step 1 will produce new predicted values and thus new linearized
dependent variables and weights. HLM will now compute a new, re-weighted MDM file
with the appropriate linearized dependent variable and weights.
3. Based on the new linearized dependent variable and weights, re-compute step 1.
This process goes on until the linearized dependent variable, the weights, and therefore, the
parameter estimates, converge to a pre-specified tolerance. The program then stops.
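The loop described above can be illustrated with a deliberately simplified sketch. It is not the HLM implementation: the weighted HLM analysis of step 1 is swapped for ordinary weighted least squares on a single-level model with one predictor, so there are no random effects and no micro iterations. All function names and the toy data are hypothetical; only the linearized dependent variable (7.33) and the weight (7.30) follow the text.

```python
import math

def linearize(y, eta):
    """Bernoulli case: linearized dependent variable (7.33) and weight (7.30)."""
    phi = 1.0 / (1.0 + math.exp(-eta))
    w = phi * (1.0 - phi)
    z = (y - phi) / w + eta
    return z, w

def weighted_ls(x, z, w):
    """Closed-form weighted least squares for z = b0 + b1 * x.
    Stands in for the weighted HLM analysis (an assumption of this sketch)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    zbar = sum(wi * zi for wi, zi in zip(w, z)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxz = sum(wi * (xi - xbar) * (zi - zbar) for wi, xi, zi in zip(w, x, z))
    b1 = sxz / sxx
    return zbar - b1 * xbar, b1

def fit(x, y, tol=1e-8, max_macro=50):
    """Repeat 'linearize, then refit' until the estimates stop changing."""
    b0, b1 = 0.0, 0.0
    for iteration in range(1, max_macro + 1):
        pairs = [linearize(yi, b0 + b1 * xi) for xi, yi in zip(x, y)]
        z = [p[0] for p in pairs]
        w = [p[1] for p in pairs]
        new_b0, new_b1 = weighted_ls(x, z, w)
        change = max(abs(new_b0 - b0), abs(new_b1 - b1))
        b0, b1 = new_b0, new_b1
        if change < tol:  # pre-specified tolerance
            break
    return b0, b1, iteration

# Toy data: 1 of 4 successes when x = 0, 3 of 4 successes when x = 1.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1, n_macro = fit(x, y)
```

In this toy case the converged intercept equals the log-odds of success in the x = 0 group, which is a convenient check that the recomputation of weights and linearized outcomes behaves as described.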
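For the Poisson case with mij = 1, the procedure is exactly the same as in the binomial case with mij = 1 except that
\[
\text{Var}(\varepsilon_{ij}) = w_{ij} = \frac{\partial \lambda_{ij}}{\partial \eta_{ij}} = \lambda_{ij} . \tag{7.35}
\]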
7.6.1.3 Binomial with mi j > 1
In the previous example, Yi j was formally the number of successes in one trial and therefore could
take on a value of 0 or 1. We now consider the case where Yi j is the number of successes in mi j
trials, where Yi j and mi j are non-negative integers, Yi j ≤ mi j .
Suppose that a researcher is interested in examining the relationship between pre-school experience
(yes or no) and grade retention and wonders whether this relationship is similar for males and
females. The design involves students at level 1 nested within schools at level 2. In this case, each
school would have four "cell counts" (boys with and without pre-school and girls with and without
pre-school). Thus, the data could be organized so that every school had four observations (except
possibly schools without variation on pre-school or sex), where each observation was a cell having a
cell size mi j and a cell count Yi j of students in that cell who were, in fact, retained. One could then
re-conceptualize the study as having up to four level-1 units (cells); the outcome Yi j , given the cell
probability φi j , would be distributed as B ( mi j , φi j ) . There would be three level-1 predictors (a
contrast for pre-school experience, a contrast for sex, and an interaction contrast). This problem then
has the structure of a 2 × 2 × J contingency table (pre-school experience by sex by school) with the
last factor viewed as random.
For example, n12 is the number of girls in school 2 with pre-school and Y12 is the number of those
girls who were retained. The predictor X 1i j is a contrast coefficient to assess the effect of sex (0.5 if
female, –0.5 if male); X 2 i j is a contrast for pre-school experience (0.5 if yes, –0.5 if no), and
X 3i j = X 1i j × X 2 i j is the interaction contrast.
Estimation works the same in this case as in the binomial case except that
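\[
Z_{ij} = \frac{Y_{ij} - m_{ij}\phi_{ij}}{w_{ij}} + \eta_{ij} \tag{7.36}
\]
with
\[
w_{ij} = m_{ij}\phi_{ij}(1 - \phi_{ij}) . \tag{7.37}
\]
For the Poisson case with $m_{ij} > 1$, the corresponding weight is
\[
w_{ij} = m_{ij}\lambda_{ij} . \tag{7.39}
\]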
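The four cases differ only in the expected count and the weight, which a small helper can make explicit. This is a hedged sketch rather than HLM code: the function name and arguments are hypothetical, and the Poisson form of the linearized dependent variable is written by analogy with (7.36), which is an assumption of this example.

```python
import math

def linearize(y, eta, m=1, family="binomial"):
    """Return (Z, w) for one level-1 unit.

    y      -- observed count of successes or events
    eta    -- current linear predictor (logit or log scale)
    m      -- number of trials (binomial) or exposure (Poisson)
    family -- "binomial" or "poisson"
    """
    if family == "binomial":
        phi = 1.0 / (1.0 + math.exp(-eta))   # inverse logit
        mean = m * phi
        w = m * phi * (1.0 - phi)            # (7.37); reduces to (7.29) when m = 1
    elif family == "poisson":
        lam = math.exp(eta)                  # inverse log link
        mean = m * lam
        w = m * lam                          # (7.39); reduces to (7.35) when m = 1
    else:
        raise ValueError("family must be 'binomial' or 'poisson'")
    # Binomial: (7.36). Poisson: assumed to take the same form, with m * lambda
    # as the expected count.
    z = (y - mean) / w + eta
    return z, w

bernoulli = linearize(1, 0.0)                         # m = 1
binomial = linearize(3, 0.0, m=10)                    # 3 successes in 10 trials
poisson = linearize(2, 0.0, m=1, family="poisson")    # 2 events, unit exposure
```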
7.6.2 Properties of the estimators
Using PQL, HGLM produces approximate empirical Bayes estimates of the randomly varying level-1
coefficients, generalized least squares estimators of the level-2 (and level-3 or level-4) coefficients,
and approximate maximum-likelihood estimators of the variance and covariance parameters. Yang
(1995) has conducted a simulation study of these estimators in comparison with an alternative
approach used by some programs that sets the level-2 random coefficients to zero in computing the
linearized dependent variables. Breslow & Clayton (1993) refer to this alternative approach as
"marginal quasi-likelihood" or MQL. Rodríguez & Goldman (1995) found that MQL
produced biased estimates of the level-2 variance and the level-2 regression coefficients. Yang's
results showed a substantial improvement (reduction in bias and mean squared error) in using the
approach of HGLM. In particular, the bias in estimation of the level-2 coefficients was never more
than 10 percent for HGLM, while the MQL approach commonly produced a bias between 10 and 20
percent. HGLM performed better than the alternative approach in estimating a level-2 variance
component as well. However, a negative bias was found in estimating this variance component,
ranging between two percent and 21 percent. The bias was most severe when the true variance was
very large and the typical "probability of success" was very small (or, equivalently, very large).
Initial simulation results under the Poisson model appear somewhat more favorable than this.
Breslow & Clayton (1993) suggest that the estimation will be more efficient as the level-1 sample
size increases.
7.6.3 Parameter estimation: A high-order Laplace and adaptive Gaussian
Quadrature approximation of maximum likelihood
For two- and three-level models with binary and count outcomes, HGLM provides two alternatives to
estimation via PQL: a high-order Laplace and an adaptive Gaussian Quadrature approximation.
Figure 7.1 displays the dialog box for the estimation settings for two-level models.
Figure 7.1 Estimation settings for two-level hierarchical generalized linear models
One alternative for two- and three-level Bernoulli and Poisson models with constant and variable
exposure uses a high-order Laplace approximation to the likelihood. The
adaptive Gauss-Hermite quadrature (AGQ) technique (Pinheiro & Bates, 1995) is another
approximation option available for two- and three-level binomial and Poisson models with constant
and variable exposure. For AGQ, users can specify the number of quadrature points
and choose between a first and a second derivative approximation. Both the accuracy of the
approximation and the computational demands increase as the number of quadrature points increases and
when the second derivative option is used.
For two-level Bernoulli models, Yang (1998), Raudenbush, Yang, and Yosef (2000) and Yosef
(2001) found that both the Laplace and AGQ techniques yielded accurate estimates. Results of Yosef
(2001) suggested AGQ performed better for models with small cluster size (nij = 2) in terms of
smaller mean squared errors and biases. The Laplace method, on the other hand, gave more
accurate approximation in models with bivariate random effects. Johnson (2006) showed in his
simulation study that for two-level Poisson models with equal exposure, the Laplace and AGQ
estimates in general displayed less bias than those of PQL. However, AGQ gave more accurate
approximation when the event rate was low and the level-2 variance was large (τ00 = 1). Based on
his results, he recommended AGQ be used with small event rate and small cluster size (nij = 2).
The models described above have been termed "unit-specific" models. They model the expected
outcome for a level-2 unit conditional on a given set of random effects. For example, in the
Bernoulli case ( mi j = 1 ), we might have a level-1 (within-school) model
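\[
\eta_{ij} = \beta_{0j} + \beta_{1j} X_{ij}, \tag{7.40}
\]
which, combined with a level-2 (between-school) model containing a school-level predictor $W_j$ and a random intercept effect $u_{0j}$, gives
\[
\eta_{ij} = \gamma_{00} + \gamma_{01} W_j + \gamma_{10} X_{ij} + u_{0j} . \tag{7.42}
\]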
Under this model, the predicted probability for case ij, given u0 j , would be
\[
E(Y_{ij} \mid u_{0j}) = \frac{1}{1 + \exp\left\{-\left(\gamma_{00} + \gamma_{01} W_j + \gamma_{10} X_{ij} + u_{0j}\right)\right\}} . \tag{7.43}
\]
In this model γ 10 is the expected difference in the log-odds of "success" between two students who
attend the same school but differ by one unit on X (holding u0 j constant); γ 01 is the expected
difference in the log-odds of success between two students who have the same value on X but
attend schools differing by one unit on W (holding u0 j constant). These definitions parallel
However, one might also want to know the average difference between log-odds of success of
students having the same X but attending schools differing by one unit on W, that is, the difference
of interest averaging over all possible values of u0 j . In this case, the unit-specific model would not
be appropriate. The model that would be appropriate would be a "population-average" model (Zeger,
Liang, & Albert, 1988). The distinction is tricky in part because it does not arise in the standard HLM
(with an identity link function). It arises only in the case of a non-linear link function.
Using the same example as above, the population average model would be
\[
E(Y_{ij}) = \frac{1}{1 + \exp\left\{-\left(\gamma_{00}^{*} + \gamma_{01}^{*} W_j + \gamma_{10}^{*} X_{ij}\right)\right\}} . \tag{7.44}
\]
Notice that (7.44) does not condition on (or "hold constant") the random effect u0 j . Thus, $\gamma_{01}^{*}$ gives
the expected difference in log-odds of success between two students with the same X who attend
schools differing by one unit on W, without respect to the random effect, u0 j . If one had a
nationally representative sample and could validly assign a causal inference to W, $\gamma_{01}^{*}$ would be the
change in the log-odds of success in the whole society associated with boosting W by one unit, while
γ 01 would be the change in log-odds associated with boosting W one unit for those schools sharing
the same value of u0 j .
HGLM produces estimates for both the unit-specific and population-average models. The population-
average results are based on generalized least squares given the variance-covariance estimates from
the unit-specific model. Moreover, HGLM produces robust standard error estimates for the
population-average model (Zeger, et al., 1988). These standard errors are relatively insensitive to
misspecification of the variances and covariances at the two levels and to the distributional
assumptions at each level. The method of estimation used in HGLM for the population-average model
is equivalent to the "generalized estimating equation" (GEE) approach popularized by Zeger, et al.
(1988).
The following differences between unit-specific and population-average results are to be expected:
• If all predictors are held constant at their means, and if their means are zero, the population-
average intercept can be used to estimate the average probability of success across the entire
population, that is
\[
\phi_{ij} = \frac{1}{1 + \exp\left(-\gamma_{00}^{*}\right)} . \tag{7.45}
\]
This will not be true of unit-specific intercepts unless the average probability of
success is very close to .5.
• Coefficient estimates (other than the intercept) based on the population-average model will
often tend to be similar to those based on the unit-specific model but will tend to be smaller
in absolute value.
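The difference between the two kinds of intercept can be seen numerically by averaging the unit-specific probability over the distribution of the random effect. The sketch below uses a simple midpoint rule and assumes a normally distributed random intercept; it is not the GEE-type estimation that HGLM uses, and the function name and the values of the intercept and level-2 variance are hypothetical.

```python
import math

def population_average_probability(gamma00, tau00, steps=4000, width=8.0):
    """Average 1 / (1 + exp(-(gamma00 + u))) over u ~ N(0, tau00).

    Midpoint-rule integration over +/- `width` standard deviations.
    """
    sd = math.sqrt(tau00)
    du = 2.0 * width * sd / steps
    total = 0.0
    for k in range(steps):
        u = -width * sd + (k + 0.5) * du
        density = math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += density * du / (1.0 + math.exp(-(gamma00 + u)))
    return total

# Illustrative values only: unit-specific intercept -2.0, level-2 variance 1.0.
unit_specific = 1.0 / (1.0 + math.exp(2.0))
pop_average = population_average_probability(-2.0, 1.0)
```

With these values the averaged probability lies between the unit-specific probability and .5, which matches the pattern described in the bullets above: the nonlinear link pulls the population-average probability toward .5.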
Users will need to take care in choosing unit-specific versus population-average results for their
research. The choice will depend on the specific research questions that are of interest. In the
previous example, if one were primarily interested in how a change in W can be expected to affect a
particular individual school's mean, one would use the unit-specific model. If one were interested in
how a change in W can be expected to affect the overall population mean, one would use the
population-average model.
As mentioned earlier, if the data follow the assumed level-1 sampling model, the level-1 variance of
the Yi j will be wi j where
\[
w_{ij} =
\begin{cases}
m_{ij}\phi_{ij}(1 - \phi_{ij}), & \text{Binomial case,} \\
m_{ij}\lambda_{ij}, & \text{Poisson case.}
\end{cases}
\tag{7.46}
\]
However, if the level-1 data do not follow this model, the actual level-1 variance may be larger than
that assumed (over-dispersion) or smaller than that assumed (under-dispersion). For example, if
undetected clustering exists within level-1 units or if the level-1 model is under-specified, extra-
binomial or extra-Poisson dispersion may arise. This problem can be handled in a variety of ways;
HGLM allows estimation of a scalar variance so that the level-1 variance will be $\sigma^{2} w_{ij}$.
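As a small illustration of how the scalar variance enters, the following sketch computes the level-1 variance from (7.46) and multiplies it by a dispersion value. The function and the numbers are hypothetical; how HGLM estimates the scalar is not shown here.

```python
def level1_variance(m, mean_param, family, sigma2=1.0):
    """Level-1 variance sigma^2 * w_ij.

    m          -- number of trials (binomial) or exposure (Poisson)
    mean_param -- phi_ij for the binomial case, lambda_ij for the Poisson case
    sigma2     -- scalar dispersion; 1.0 reproduces the assumed sampling model
    """
    if family == "binomial":
        w = m * mean_param * (1.0 - mean_param)
    elif family == "poisson":
        w = m * mean_param
    else:
        raise ValueError("family must be 'binomial' or 'poisson'")
    return sigma2 * w

nominal = level1_variance(10, 0.5, "binomial")                   # no over-dispersion
over_dispersed = level1_variance(10, 0.5, "binomial", sigma2=1.2)
```

A value of `sigma2` above 1 corresponds to over-dispersion and a value below 1 to under-dispersion.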
7.9 Restricted versus full PQL versus full ML
The default method of estimation for HGLM is restricted PQL, while full PQL is an option. For the
three- and four-level HGLM, PQL estimation is by means of full PQL only. All estimates based on
Laplace and adaptive Gauss-Hermite quadrature are based on full ML.
7.10 Hypothesis testing
The logic of hypothesis testing with HGLM is quite similar to that used in the case of HLM. Thus, for
the fixed effects (the γ s), a table of approximate t-values is routinely printed for univariate tests;
multivariate tests for the fixed effects are available using the approach described earlier in
Chapter 2. Similarly, univariate tests for variance components (approximate chi-squares) are also
routinely printed out. The one exception is that multivariate tests based on comparing model
deviances ( −2 log likelihood at convergence ) are not available using PQL, because PQL is based on
quasi-likelihood rather than maximum-likelihood estimation. These are available using Laplace or
adaptive Gauss-Hermite quadrature.
8 Fitting HGLMs (Nonlinear Models)
There is no difference between HGLM ("nonlinear analysis") and HLM ("linear analysis") in the
construction of the MDM file. Thus, the same MDM file can be used for nonlinear and linear analysis.
Model specification for nonlinear analyses, as in the case of linear analyses, can be achieved via
Windows (PC implementation only), interactive execution, or batch execution. The mechanics of
model specification are generally the same as in linear analyses with the following differences:
• Six types of nonlinear analysis are available. With Windows execution, these options are
displayed in the Basic Model Specifications – HLM2 dialog box (See Figure 8.1). This
dialog box is accessed by clicking the Outcome button at the top of the variable listbox to
the left of the main HLM window. There are two choices for dichotomous outcomes, two for
count outcomes, one for multinomial outcomes, and one for ordinal outcomes.
• Highly accurate approximations to maximum likelihood based on either the Laplace
approximation or adaptive Gauss-Hermite Quadrature are available for 2- and 3-level
Bernoulli models and for 2-level Poisson models through the Estimation Settings – HLM2
dialog box shown in Figure 8.3.
• If desired, an over-dispersion option is available for binomial and Poisson models. This
option is not available with Laplace (see Figure 8.3). To specify over-dispersion, set the σ²
value to computed in the Estimation Settings – HLM2 dialog box (see Figure 8.3).
• As mentioned, the nonlinear analysis is doubly iterative so the maximum number of macro
iterations can be specified as well as the maximum number of micro iterations. Similarly,
convergence criteria can be reset for macro iterations as well as micro iterations (see footnote 2). The
number of iterations and method of estimation are set through the Iteration Control – HLM2
dialog box shown in Figure 8.2.
Footnote 2: The overall accuracy of the parameter estimates is determined by the convergence criterion for
macro iterations. The convergence criterion for micro iterations will influence the number of micro
iterations per macro iteration. The default specifications stop macro iterations when the largest
parameter estimate change is less than 10^-4; micro iterations within macro iterations stop when the
conditional log likelihood (conditional on the current weights and values of the linearized dependent
variable) changes by less than 10^-6.
Figure 8.1 Basic Model Specifications – HLM2 dialog box
Figure 8.3 Estimation Settings – HLM2 dialog box
Below we provide two detailed examples of nonlinear analyses: the first uses the Bernoulli model,
that is, a binomial model with the number of trials, mi j , equal to one. The second example uses a
binomial model with mi j > 1 . The analogs of these two analyses for count data are, respectively, the
Poisson model with equal exposure and the Poisson case with variable exposure (some brief notes
about these two applications are also included). Finally, we furnish two examples for multi-category
outcomes, one for multinomial data and one for ordinal data. Windows mode specification is
illustrated. See Appendix D for interactive and batch specification.
8.2 Case 1: a Bernoulli model
Data are from a national survey of primary education in Thailand (see Raudenbush & Bhumirat,
1992, for details), conducted in 1988, and yielding, for our analysis, complete data on 7516 sixth
graders nested within 356 primary schools. Of interest is the probability that a child will repeat a
grade during the primary years (REP1 = 1 if yes, 0 if no). It is hypothesized that the sex of the child
(MALE = 1 if male, 0 if female), the child's pre-primary experience (PPED = 1 if yes, 0 if no), and
the school mean SES (MSESC) will be associated with the probability of repetition. Every level-1
record corresponds to a student, with a single binary outcome per student, so the model type is
Bernoulli. The level-1 and level-2 data files are UTHAIL1.SAV and THAI2.SAV.
Below are the Windows commands for specifying a Bernoulli model.
1. After specifying the outcome in the model specification window (REP1 in our example), click
the Outcome button at the top of the variable listbox to the left of the main HLM window to open
the Basic Model Specifications – HLM2 dialog box (See Figure 8.1).
2. Select Bernoulli (0 or 1) as there is one binary outcome per level-1 unit.
3. (Optional) Specify the maximum number of macro and micro iterations by selecting the
Iteration Settings option from the Other Settings menu.
4. (Optional) Select Laplace approximation or Adaptive Gaussian iteration control from the
options on the Estimation Settings – HLM2 dialog box, which is accessed by selecting the
Estimation Settings options from the Other Settings menu (See sections 8.8 and 7.6.3).
The model described above is displayed in Figure 8.4 in both standard and mixed model notation.
The command file for the model is THAIU1.HLM.
Below we provide a transcript of the messages that HLM2 sent to the iteration window during
computation of the results.
MACRO ITERATION 1
Macro iteration number 1 has converged after seven micro iterations. This macro iteration actually
computes the linear-model estimates (using the identity link function as if the level-1 errors were
assumed normal). These results are then transformed and input to start macro iteration 2, which is, in
fact, the first nonlinear iteration.
MACRO ITERATION 2
Macro iteration 2, the first nonlinear macro iteration, converged after twelve micro iterations.
.
.
MACRO ITERATION 8
MACRO ITERATION 9
Should you wish to terminate the iterations prior to convergence, enter cntl-c
The value of the likelihood function at iteration 1 = -1.011638E+004
The value of the likelihood function at iteration 2 = -1.010710E+004
The value of the likelihood function at iteration 3 = -1.010710E+004
Level-1 Model
Prob(REP1ij=1|βj) = φij
log[ φij /(1 - φij )] = ηij
ηij = β0j + β1j*(MALEij) + β2j*(PPEDij)
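\[
\eta_{ij} = \log\left(\frac{\phi_{ij}}{1 - \phi_{ij}}\right) = \beta_{0j} + \beta_{1j}(\text{MALE})_{ij} + \beta_{2j}(\text{PPED})_{ij}
\]
Level-2 Model
\[
\begin{aligned}
\beta_{0j} &= \gamma_{00} + \gamma_{01}(\text{MSESC})_{j} + u_{0j} \\
\beta_{1j} &= \gamma_{10} \\
\beta_{2j} &= \gamma_{20} .
\end{aligned}
\]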
In the metric of the linearized dependent variable, the level-1 variance is the reciprocal of the
Bernoulli variance, φ i j (1 − φ i j ) .
Mixed Model
Three sets of output results appear below: those for the normal linear model with identity link
function, those for the unit-specific model with logit link function, and those for the population-
average model with logit link. Typically, only the latter 2 sets of results will be relevant for drawing
conclusions. The linear model with identity link is estimated simply to obtain starting values for the
estimation of the models with logit link.
Final Results for Linear Model with the Identity Link Function
σ² = 0.12181
τ
INTRCPT1,β0    0.01897

Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
  INTRCPT2, γ00          0.153756         0.010812    14.221            354    <0.001
  MSESC, γ01            -0.033414         0.022465    -1.487            354     0.138
For MALE slope, β1
  INTRCPT2, γ10          0.054131         0.008330     6.498           7158    <0.001
For PPED slope, β2
  INTRCPT2, γ20         -0.064613         0.010926    -5.914           7158    <0.001
τ
INTRCPT1,β0 1.29571
Final estimation of fixed effects: (Unit-specific model)

Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
  INTRCPT2, γ00         -2.046961         0.093985   -21.780            354    <0.001
  MSESC, γ01            -0.254412         0.193319    -1.316            354     0.189
For MALE slope, β1
  INTRCPT2, γ10          0.508561         0.073935     6.879           7158    <0.001
For PPED slope, β2
  INTRCPT2, γ20         -0.594375         0.095962    -6.194           7158    <0.001

Fixed Effect          Coefficient   Odds Ratio   Confidence Interval
For INTRCPT1, β0
  INTRCPT2, γ00         -2.046961     0.129127   (0.107,0.155)
  MSESC, γ01            -0.254412     0.775372   (0.530,1.134)
For MALE slope, β1
  INTRCPT2, γ10          0.508561     1.662897   (1.439,1.922)
For PPED slope, β2
  INTRCPT2, γ20         -0.594375     0.551908   (0.457,0.666)
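The odds ratio column is the exponentiated coefficient, and the interval can be approximated by exponentiating the coefficient plus or minus a critical value times its standard error. The sketch below uses the MALE row of the unit-specific table; the critical value 1.96 is an assumption of this example (a normal approximation), since the output does not state which critical value HLM uses, and the function name is hypothetical.

```python
import math

def odds_ratio_interval(coefficient, std_error, critical=1.96):
    """Return (odds ratio, lower limit, upper limit) on the odds scale."""
    odds_ratio = math.exp(coefficient)
    lower = math.exp(coefficient - critical * std_error)
    upper = math.exp(coefficient + critical * std_error)
    return odds_ratio, lower, upper

# MALE slope, unit-specific model: coefficient 0.508561, standard error 0.073935.
male_or, male_lower, male_upper = odds_ratio_interval(0.508561, 0.073935)
```

For rows with far fewer degrees of freedom, a t-based critical value would give a slightly wider interval than this normal approximation.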
Final estimation of fixed effects
(Unit-specific model with robust standard errors)

Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
  INTRCPT2, γ00         -2.046961         0.094872   -21.576            354    <0.001
  MSESC, γ01            -0.254412         0.204048    -1.247            354     0.213
For MALE slope, β1
  INTRCPT2, γ10          0.508561         0.075994     6.692           7158    <0.001
For PPED slope, β2
  INTRCPT2, γ20         -0.594375         0.094840    -6.267           7158    <0.001

Fixed Effect          Coefficient   Odds Ratio   Confidence Interval
For INTRCPT1, β0
  INTRCPT2, γ00         -2.046961     0.129127   (0.107,0.156)
  MSESC, γ01            -0.254412     0.775372   (0.519,1.158)
For MALE slope, β1
  INTRCPT2, γ10          0.508561     1.662897   (1.433,1.930)
For PPED slope, β2
  INTRCPT2, γ20         -0.594375     0.551908   (0.458,0.665)
Standard Variance
Results for Population-Average Model

Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
  INTRCPT2, γ00         -1.748402         0.087969   -19.875            354    <0.001
  MSESC, γ01            -0.283620         0.185179    -1.532            354     0.127
For MALE slope, β1
  INTRCPT2, γ10          0.446546         0.066993     6.666           7158    <0.001
For PPED slope, β2
  INTRCPT2, γ20         -0.536378         0.088479    -6.062           7158    <0.001

Fixed Effect          Coefficient   Odds Ratio   Confidence Interval
For INTRCPT1, β0
  INTRCPT2, γ00         -1.748402     0.174052   (0.146,0.207)
  MSESC, γ01            -0.283620     0.753053   (0.523,1.084)
For MALE slope, β1
  INTRCPT2, γ10          0.446546     1.562905   (1.371,1.782)
For PPED slope, β2
  INTRCPT2, γ20         -0.536378     0.584863   (0.492,0.696)
Notice that the results for the population-average model are quite similar to the results for the unit-
specific model except in the case of the intercept. The intercept in the population-average model in
this case is the expected log-odds of repetition for a person with values of zero on the predictors
(and therefore, for a female without pre-primary experience attending a school of average SES). In
this case, this expected log-odds corresponds to a probability of 1/(1 + exp{1.748402}) = .148,
which is the "population-average" repetition rate for this group. In contrast, the unit-specific
intercept is the expected log-odds of repetition for the same kind of student, but one who attends
a school that not only has a mean SES of 0, but also has a random effect of zero (that is, a school
with a "typical" repetition rate for the school of its type). This conditional expected log-odds is
-2.046961, corresponding to a probability of 1/(1 + exp{2.046961}) = .114. Thus the probability of
repetition is lower in a school with a random effect of zero than the average in the population of
schools having mean SES of zero taken as a whole. This is a typical result. Population-average
probabilities will be closer to .50 (than will the corresponding unit-specific probabilities).
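The two probabilities quoted above follow directly from the inverse logit of the two intercepts. A minimal check, using the intercept estimates reported in the output (the function name is hypothetical):

```python
import math

def inverse_logit(log_odds):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

population_average = inverse_logit(-1.748402)  # population-average intercept
unit_specific = inverse_logit(-2.046961)       # unit-specific intercept
```

Rounded to three decimals these give .148 and .114, with the population-average value the closer of the two to .50.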
One final set of results is printed out: population-average results with robust standard errors (below).
Note that the robust standard errors in this case are very similar to the model-based standard errors,
with a slight increase for the level-2 predictor and slight decreases for level-1 predictors. Results for
other data may not follow this pattern.
Final estimation of fixed effects
(Population-average model with robust standard errors)

Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
  INTRCPT2, γ00         -1.748402         0.082158   -21.281            354    <0.001
  MSESC, γ01            -0.283620         0.196005    -1.447            354     0.149
For MALE slope, β1
  INTRCPT2, γ10          0.446546         0.062788     7.112           7158    <0.001
For PPED slope, β2
  INTRCPT2, γ20         -0.536378         0.082221    -6.524           7158    <0.001

Fixed Effect          Coefficient   Odds Ratio   Confidence Interval
For INTRCPT1, β0
  INTRCPT2, γ00         -1.748402     0.174052   (0.148,0.205)
  MSESC, γ01            -0.283620     0.753053   (0.512,1.107)
For MALE slope, β1
  INTRCPT2, γ10          0.446546     1.562905   (1.382,1.768)
For PPED slope, β2
  INTRCPT2, γ20         -0.536378     0.584863   (0.498,0.687)
8.3 Case 2: a binomial model (number of trials, mi j ≥ 1)
A familiar example of two-level binomial data is the number of hits, Yi j , in game i for baseball
player j based on mi j at bats. In an experimental setting, a subject j under condition i might produce
Yi j successes in mi j trials.
A common use of a binomial model is when analysts do not have access to the raw data at level 1.
For example, one might know the proportion of children passing a criterion-referenced test within
each of many schools. This proportion might be broken down within schools by sex and grade. A
binomial model could be used to analyze such data. The cases would be sex-by-grade "cells" within
each school where Yi j is the number passing within cell i of school j and mi j is the number of
"trials," that is, the number of children in that cell. Sex and grade would be level-1 predictors.
Indeed, in the previous example, although raw level-1 data were available, the two level-1
predictors, MALE and pre-primary experience, were categorical. For illustration, we reorganized
these data so that each school had, potentially, four cells defined by the cross-classification of sex
Level-1 predictors were the same as before, with MALE = 1 if male, 0 if female; PPED = 1 if pre-
primary experience, 0 if not. The outcome is the number of children in a particular cell who repeated
a grade, and we created a variable TRIAL, which is the number of children in each cell. In some
schools there were no children of a certain type (e.g., no females with pre-primary experience). Such
schools would have fewer than four cells. The necessary steps for executing the analysis via the
Windows interface are given below.
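The reorganization of student records into cells can be sketched as follows. This is an illustration of the data layout only, not part of HLM: the record format, the function name, and the sample records are hypothetical. Each cell keeps the number of repeaters (the outcome) and the number of children (TRIAL), and cells with no children simply do not appear.

```python
from collections import defaultdict

def build_cells(records):
    """Aggregate (school, male, pped, rep1) student records into cells.

    Returns {(school, male, pped): (repeaters, trial)}.
    """
    cells = defaultdict(lambda: [0, 0])  # [number who repeated, TRIAL]
    for school, male, pped, rep1 in records:
        cell = cells[(school, male, pped)]
        cell[0] += rep1
        cell[1] += 1
    return {key: tuple(counts) for key, counts in sorted(cells.items())}

# Hypothetical student-level records: (school id, MALE, PPED, REP1).
records = [
    ("S1", 1, 0, 1),
    ("S1", 1, 0, 0),
    ("S1", 0, 1, 0),
    ("S2", 0, 0, 1),
]
cells = build_cells(records)
```

Here school S1 contributes two cells and school S2 one, which mirrors the remark that some schools have fewer than four cells.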
Summary of the model specified
Level-1 Model
\[
\begin{aligned}
E(Y_{ij} \mid \beta_j) &= m_{ij}\phi_{ij} \\
\text{Var}(Y_{ij} \mid \beta_j) &= m_{ij}\phi_{ij}(1 - \phi_{ij}),
\end{aligned}
\]
where $m_{ij}$ = TRIAL.
Level-2 Model
In the metric of the linearized dependent variable, the level-1 variance is the reciprocal of the
binomial variance,
\[
m_{ij}\phi_{ij}(1 - \phi_{ij}) .
\]
Results for the unit-specific model, population-average model, and population-average model with
robust standard errors, are not printed below. They are essentially identical to the results using the
Bernoulli model.
Suppose that the outcome variable in Case 1 had been the number of days absent during the previous
year rather than grade repetition. This outcome would be a non-negative integer, that is, a count
rather than a dichotomy. Thus, the Poisson model with a log link would be a reasonable choice for
the model. Notice that the time interval during which the absences could accumulate, that is, one
year, would be the same for each student. We call this a case of "equal exposure," meaning that each
level-1 case had an "equal opportunity" to accumulate absences. (Case 4 describes an example
where exposure varies across level-1 cases.)
This model has exactly the same logic as in Case 1 except that the type of model and therefore the
corresponding link function will be different.
1. After specifying the outcome in the model specification window (REP1 in our example), click
the Outcome button at the top of the variable listbox to the left of the main HLM window to open
the Basic Model Specifications – HLM2 dialog box (See Figure 8.1).
2. Select Poisson (constant exposure) to tell HLM that the level-1 sampling model is Poisson
with equal exposure per level-1 case.
3. (Optional) Specify the maximum number of macro and micro iterations by selecting the
Iteration Settings option from the Other Settings menu.
4. (Optional) Select the Over-dispersion option if appropriate (See section on Additional
Features at the end of the chapter).
E(REP1ij|βj) = λij
log[λij] = ηij
The above equation, written with subscripts and Greek letters, is
\[
\begin{aligned}
E(Y_{ij} \mid \beta_j) &= \lambda_{ij} \\
\text{Var}(Y_{ij} \mid \beta_j) &= \lambda_{ij}
\end{aligned}
\]
where $\lambda_{ij}$ is the "true" rate of absence for child ij.
Level-2 Model
Notice that the log link replaces the logit link when we have count data. In the example above, β 2 is
the expected difference in log-absenteeism between two children of the same sex attending the same
school who differ in pre-primary experience. To translate back to the rate of absenteeism, we would
expect a child with pre-primary experience to have exp { β 2 } times the absenteeism rate of a child
attending the same school who did not have pre-primary experience (holding sex constant). In this
particular case, the estimated effect for β 2 is most plausibly negative; exp { β 2 } is then less than 1.0,
so that pre-primary experience would reduce the rate of absenteeism. Notice that the level-2 structural
models are identical to those in Case 1.
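The translation from the log scale back to rates can be illustrated briefly. The coefficient value below is hypothetical (no Poisson estimates are reported here), as are the function names; the exposure argument anticipates the variable-exposure case, where the expected count is the exposure times the rate.

```python
import math

def rate_ratio(beta):
    """Multiplicative change in the event rate for a one-unit change in a predictor."""
    return math.exp(beta)

def expected_count(eta, exposure=1.0):
    """Expected count under a log link: exposure times exp(eta)."""
    return exposure * math.exp(eta)

beta2 = -0.2                      # hypothetical negative effect of PPED
ratio = rate_ratio(beta2)         # below 1.0, i.e. a lower absenteeism rate
baseline = expected_count(1.0)    # child without pre-primary experience
with_pped = expected_count(1.0 + beta2)
```

The ratio of the two expected counts equals `exp(beta2)` regardless of the baseline value of the linear predictor, which is why the coefficient can be read as a rate ratio.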
Notice that the level-1 and level-2 structural models are identical to those in Case 1.
1. After specifying the outcome in the model specification window (REP1 in our example), click
the Outcome button at the top of the variable listbox to the left of the main HLM window to open
the Basic Model Specifications – HLM2 dialog box (See Figure 8.1).
2. Select Poisson (variable exposure) to tell HLM that the level-1 sampling model is Poisson with
variable exposure per level-1 case.
3. Select the variable that indicates variable exposure from the drop-down listbox (See Figure 8.1).
(In the illustration below, we use TRIAL as the variable to indicate variable exposure).
4. (Optional) Specify the maximum number of macro and micro iterations by selecting the
Iteration Settings option from the Other Settings menu.
5. (Optional) Select the Over-dispersion option if appropriate (See section on Additional
Features at the end of the chapter).
Level-1 Model
This is the program's way of saying that the level-1 sampling model is Poisson with variable
exposure per level-1 case, so that the above equation, written with subscripts and Greek letters, is
\[
\begin{aligned}
E(Y_{ij} \mid \beta_j) &= m_{ij}\lambda_{ij} \\
\text{Var}(Y_{ij} \mid \beta_j) &= m_{ij}\lambda_{ij} .
\end{aligned}
\]
Notice that the log link replaces the logit link when we have count data.
Level-2 Model
Notice that the level-1 and level-2 structural models are identical to those in Case 1.
An outcome with three response categories tapping teachers' commitment to their career choice is
derived from teachers' responses to the hypothetical question of whether they would become a
teacher if they could go back to college and start over again. The possible responses are:
At the teacher level, it is hypothesized that teachers' perception of task variety is positively
associated with greater odds of a teacher choosing the first category relative to the third category,
and with greater odds of a teacher choosing the second category relative to the third category. The
perception is measured by a task variety scale that assessed the extent to which teachers followed the
same teaching routines each day, performed the same tasks each day, had something new happening
in their job each day, and liked the variety present in their work (Rowan, Raudenbush & Cheong,
1993).
At the school level, it is postulated that the extent of teacher control has the same relationship to the
two log odds as perception of task variety does. The teacher control scale is constructed by
aggregating nine-item scale scores of teachers within a school. This scale indicates teacher control
over school policy issues such as student behavior codes, content of in-service programs, student
grouping, school curriculum, and text selection; and control over classroom issues such as teaching
content and techniques, and amount of homework assigned (Rowan, Raudenbush & Kang, 1991).
As a previous analysis showed that there is little between-teacher variability in their log-odds of
choosing the second category relative to the third category, the level-1 coefficient associated with it
is fixed. Furthermore, the effects associated with perception of task variety are constrained to be the
same across teachers for the sake of parsimony.
The general procedure to specify a multinomial logit model is given below. Note that the
multinomial and ordinal analyses provide unit-specific estimates only. They do not currently
produce population-average estimates.
1. After specifying the outcome in the model specification window, click the Outcome button at
   the top of the variable listbox to the left of the main HLM window to open the Basic Model
   Specifications – HLM2 dialog box (see Figure 8.1).
2. Select Multinomial to tell HLM that the level-1 sampling model is multinomial.
3. Enter the number of categories into the Number of Categories box.
4. (Optional) Specify the maximum number of macro and micro iterations by selecting the
   Iteration Settings option from the Other Settings menu.
Level-1 Model
Prob[TCOMMIT(1) = 1|βj] = φ1ij
log[ φ1ij / φ3ij ] = β0j(1) + β1j(1)*(TASKVARij)
$$\eta_{ij(1)} = \log\left(\frac{\phi_{ij(1)}}{\phi_{ij(3)}}\right) = \beta_{0j(1)} + \beta_{1j(1)}(\mathrm{TASKVAR})_{ij}$$
$$\eta_{ij(2)} = \log\left(\frac{\phi_{ij(2)}}{\phi_{ij(3)}}\right) = \beta_{0j(2)} + \beta_{1j(2)}(\mathrm{TASKVAR})_{ij}$$
Level-2 Model
Fixed Effect                  Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For Category 1
  For INTRCPT1, β0(1)
    INTRCPT2, γ00(1)            1.079269       0.123439        8.743        14         <0.001
    TCONTROL, γ01(1)            2.090207       0.508369        4.112        14          0.001
  For TASKVAR slope, β1(1)
    INTRCPT2, γ10(1)            0.398355       0.113650        3.505       630         <0.001
For Category 2
  For INTRCPT1, β0(2)
    INTRCPT2, γ00(2)            0.091930       0.141643        0.649       630          0.517
    TCONTROL, γ01(2)            1.057285       0.577673        1.830       630          0.068
  For TASKVAR slope, β1(2)
    INTRCPT2, γ10(2)            0.030693       0.130029        0.236       630          0.813
γ00(1), the unit-specific intercept, is the expected log-odds of an affirmative response relative to a
negative response for a teacher with mean perception of task variety, working in a school with
average teacher control and a random effect of zero. It is adjusted for the between-school heterogeneity
in the likelihood of an affirmative response relative to a negative response, which is independent of the
effect of task variety and teacher control. The estimated conditional expected log-odds is 1.079269.
The predicted probability that the same teacher responds affirmatively (Category 1) is exp{1.079269}/
(1 + exp{1.079269} + exp{0.091930}) = .584. The predicted probability of responding "not sure"
(Category 2) is exp{0.091930}/(1 + exp{1.079269} + exp{0.091930}) = .218. The predicted probability
of responding negatively (Category 3) is therefore 1 - .584 - .218 = .198.
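The three predicted probabilities can be checked with a short Python sketch. It uses only the two intercept estimates reported in the fixed-effects table (1.079269 and 0.091930) and treats the third category as the reference; the function name is ours, not part of HLM.

```python
import math


def multinomial_probs(etas):
    """Category probabilities from log-odds relative to the last (reference) category."""
    denom = 1.0 + sum(math.exp(eta) for eta in etas)
    probs = [math.exp(eta) / denom for eta in etas]
    probs.append(1.0 / denom)  # reference category
    return probs


# Intercepts for Category 1 and Category 2, each relative to Category 3.
p1, p2, p3 = multinomial_probs([1.079269, 0.091930])
print(round(p1, 3), round(p2, 3), round(p3, 3))
```

Rounded to three decimals, the output corresponds to the probabilities for the affirmative, "not sure", and negative responses, in that order.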
Fixed Effect                  Coefficient   Odds Ratio   Confidence Interval
For Category 1
  For INTRCPT1, β0(1)
    INTRCPT2, γ00(1)            1.079269     2.942528      (2.258,3.835)
    TCONTROL, γ01(1)            2.090207     8.086586      (2.718,24.063)
  For TASKVAR slope, β1(1)
    INTRCPT2, γ10(1)            0.398355     1.489373      (1.191,1.862)
For Category 2
  For INTRCPT1, β0(2)
    INTRCPT2, γ00(2)            0.091930     1.096288      (0.830,1.448)
    TCONTROL, γ01(2)            1.057285     2.878545      (0.926,8.952)
  For TASKVAR slope, β1(2)
    INTRCPT2, γ10(2)            0.030693     1.031169      (0.799,1.331)
The sets of γ01 and γ10 give the estimates of the change in the respective log-odds given a one-unit
change in the predictors, holding all other variables constant. For instance, all else being equal, a
one standard deviation increase in TCONTROL (.32) will nearly double the odds of an affirmative
response relative to a negative response (exp{2.090207 * .32} = 1.952). Note that the partial effect
associated with perception of task variety is statistically significant for the logit of affirmative versus
negative responses but not for the logit of undecided versus negative responses.
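The arithmetic behind the odds ratio and the "nearly double" statement can be reproduced as follows. The coefficient (2.090207) and the standard deviation of TCONTROL (.32) are taken from the text; the helper name is an illustrative choice.

```python
import math


def odds_multiplier(coefficient, change=1.0):
    """Factor by which the odds are multiplied when a predictor increases by `change`."""
    return math.exp(coefficient * change)


gamma01_cat1 = 2.090207   # TCONTROL coefficient for Category 1
sd_tcontrol = 0.32        # standard deviation of TCONTROL, as quoted in the text

per_unit = odds_multiplier(gamma01_cat1)              # the odds ratio in the table
per_sd = odds_multiplier(gamma01_cat1, sd_tcontrol)   # effect of a one-SD increase
print(round(per_unit, 3), round(per_sd, 3))
```

The per-unit value corresponds to the Odds Ratio column, while the per-SD value is usually easier to interpret when a one-unit change in the predictor is large relative to its observed range.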
Below is a table for the results for the fixed effects with robust standard errors.
Final estimation of fixed effects (with robust standard errors)
Fixed Effect                  Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For Category 1
  For INTRCPT1, β0(1)
    INTRCPT2, γ00(1)            1.079269       0.128263        8.415        14         <0.001
    TCONTROL, γ01(1)            2.090207       0.409607        5.103        14         <0.001
  For TASKVAR slope, β1(1)
    INTRCPT2, γ10(1)            0.398355       0.127511        3.124       630          0.002
For Category 2
  For INTRCPT1, β0(2)
    INTRCPT2, γ00(2)            0.091930       0.139637        0.658       630          0.511
    TCONTROL, γ01(2)            1.057285       0.529606        1.996       630          0.046
Fixed Effect                  Coefficient   Odds Ratio   Confidence Interval
For Category 1
  For INTRCPT1, β0(1)
    INTRCPT2, γ00(1)            1.079269     2.942528      (2.235,3.874)
    TCONTROL, γ01(1)            2.090207     8.086586      (3.359,19.469)
  For TASKVAR slope, β1(1)
    INTRCPT2, γ10(1)            0.398355     1.489373      (1.159,1.913)
For Category 2
  For INTRCPT1, β0(2)
    INTRCPT2, γ00(2)            0.091930     1.096288      (0.833,1.442)
    TCONTROL, γ01(2)            1.057285     2.878545      (1.017,8.145)
  For TASKVAR slope, β1(2)
    INTRCPT2, γ10(2)            0.030693     1.031169      (0.804,1.322)
The robust standard errors are appropriate for datasets having a moderate to
large number of level 2 units. These data do not meet this criterion.
Random Effect           Standard Deviation   Variance Component   d.f.      χ2       p-value
INTRCPT1(1), u0(1)           0.09931              0.00986          14    16.16473     0.303
Note that the residual variance of β0(1) is not statistically different from zero. The model may be
re-run with the coefficient set to be non-random.
8.7 Case 6: Ordinal model
The same data set, multi-category outcome, and predictors as in Case 5 are used here. The
procedure for specifying an ordinal model is very similar to that for a multinomial model: select the
Ordinal instead of the Multinomial option in the Basic Model Specifications – HLM2 dialog box (see
Figure 8.1). Figure 8.7 displays the model specified for the example (TCHR2.HLM).
Note: The multinomial and ordinal analyses currently produce unit-specific results only. They do
not provide population-average results.
Figure 8.7 Model specification window for the ordinal model
The output obtained for this model follows.
Specifications for this ordinal HLM run
Level-1 Model
φ1ij = Prob[TCOMMIT(1) = 1|βj]
φ2ij = Prob[TCOMMIT(2) = 1|βj]
log[ φ1ij /(1 - φ1ij )] = β0j + β1j*(TASKVARij)
log[ φ2ij /(1 - φ2ij )] = β0j + β1j*(TASKVARij) + δ2
$$\eta'_{ij(2)} = \log\left(\frac{\phi'_{ij(2)}}{1 - \phi'_{ij(2)}}\right) = \beta_{0j} + \beta_{1j}(\mathrm{TASKVAR})_{ij} + \delta_{(2)}.$$
Level-2 Model
τ
INTRCPT1,β0 0.00010
Final estimation of fixed effects:
Fixed Effect                  Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1 slope, β0
    INTRCPT2, γ00               0.333918       0.089735        3.721        14          0.002
    TCONTROL, γ01               1.541051       0.365624        4.215        14         <0.001
For TASKVAR slope, β1
    INTRCPT2, γ10               0.348801       0.087280        3.996       633         <0.001
For THOLD2,
    δ2                          1.054888       0.080868       13.045       633         <0.001
Fixed Effect                  Coefficient   Odds Ratio   Confidence Interval
For INTRCPT1 slope, β0
    INTRCPT2, γ00               0.333918     1.396429      (1.152,1.693)
    TCONTROL, γ01               1.541051     4.669496      (2.131,10.230)
For TASKVAR slope, β1
    INTRCPT2, γ10               0.348801     1.417367      (1.194,1.682)
For THOLD2,
    δ2                          1.054888     2.871653      (2.450,3.366)
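To see what the intercept and threshold estimates imply, the sketch below converts them into cumulative and category probabilities for a teacher with all predictors at zero and a random effect of zero. It assumes the usual cumulative-logit reading of the ordinal model, in which the first logit refers to the probability of the first category and the second logit (the first plus δ2) refers to the probability of the first or second category; this interpretation and the function names are our assumptions, not statements from the HLM output.

```python
import math


def logistic(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def ordinal_probs(eta, thresholds):
    """Cumulative and per-category probabilities under a cumulative-logit model.

    eta        -- linear predictor for the first cumulative logit
    thresholds -- values added to eta for each subsequent cumulative logit
    """
    cumulative = [logistic(eta)] + [logistic(eta + d) for d in thresholds]
    bounds = cumulative + [1.0]
    category = [bounds[0]] + [bounds[k] - bounds[k - 1] for k in range(1, len(bounds))]
    return cumulative, category


# gamma00 and delta2 as reported in the fixed-effects table for the ordinal run.
cumulative, category = ordinal_probs(0.333918, [1.054888])
print([round(p, 4) for p in cumulative])
print([round(p, 4) for p in category])
```

Under these assumptions the category probabilities sum to one by construction, and a positive δ2 guarantees that the second cumulative probability exceeds the first.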
Fixed Effect                  Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1 slope, β0
Fixed Effect                  Coefficient   Odds Ratio   Confidence Interval
For INTRCPT1 slope, β0
    INTRCPT2, γ00               0.333918     1.396429      (1.145,1.704)
    TCONTROL, γ01               1.541051     4.669496      (2.247,9.702)
For TASKVAR slope, β1
    INTRCPT2, γ10               0.348801     1.417367      (1.182,1.699)
For THOLD2,
    δ2                          1.054888     2.871653      (2.452,3.363)
The robust standard errors are appropriate for datasets having a moderate to
large number of level 2 units. These data do not meet this criterion.
Random Effect           Standard Deviation   Variance Component   d.f.      χ2       p-value
INTRCPT1, u0                 0.01016              0.00010          14    14.57034     0.408
Note that the residual variance of β0 is not statistically different from zero. In fact, it is very close
to zero, which accounts for the large number of iterations required to achieve convergence. The
model may be re-run with the coefficient set to be non-random.
8.8 Additional features
8.8.1 Over-dispersion
For binomial models with m_ij > 1 and for all Poisson models, there is an option to estimate a level-1
dispersion parameter σ² (see Figure 8.1). If the assumption of no dispersion holds, σ² = 1.0. If the
data are over-dispersed, σ² > 1.0; if the data are under-dispersed, σ² < 1.0.
τ
INTRCPT1,β0 1.61733
Random level-1 coefficient Reliability estimate
INTRCPT1,β0 0.724
Fixed Effect                  Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
    INTRCPT2, γ00              -2.239223       0.100384      -22.307       354         <0.001
    MSESC, γ01                 -0.297322       0.200573       -1.482       354          0.139
For MALE slope, β1
    INTRCPT2, γ10               0.533635       0.072623        7.348      7158         <0.001
For PPED slope, β2
    INTRCPT2, γ20              -0.626218       0.099789       -6.275      7158         <0.001
Fixed Effect                  Coefficient   Odds Ratio   Confidence Interval
For INTRCPT1, β0
    INTRCPT2, γ00              -2.239223     0.106541      (0.087,0.130)
    MSESC, γ01                 -0.297322     0.742805      (0.501,1.102)
For MALE slope, β1
    INTRCPT2, γ10               0.533635     1.705119      (1.479,1.966)
For PPED slope, β2
    INTRCPT2, γ20              -0.626218     0.534610      (0.440,0.650)
τ
INTRCPT1,β0 1.68320
Standard error of τ
INTRCPT1,β0 0.20904
Fixed Effect                  Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
    INTRCPT2, γ00              -2.242961       0.106249      -21.110       354         <0.001
    MSESC, γ01                 -0.295119       0.215888       -1.367       354          0.172
For MALE slope, β1
Statistics for the current model
Deviance = 19255.057516
Number of estimated parameters = 5
For simplicity of exposition, all of the examples above have used the two-level HGLM. These
procedures generalize directly to three- and four-level applications. Again, the type of nonlinear
model desired at level 1 must be specified. There are now, however, structural models at both
levels 2 and 3, as in the case of HLM3. The same idea applies to HLM4.
9 Conceptual and Statistical Background for Hierarchical
Multivariate Linear Models (HMLM)
One of the most frequent applications of hierarchical models involves repeated observations (level
1) nested within persons (level 2). These are described in Chapter 6 of Hierarchical Linear Models.
In these models, the outcome Yi j for occasion i within person j is conceived as a univariate outcome,
observed under different conditions or at different times. An advantage of viewing the repeated
observations as nested within the person is that it allows each person to have a different repeated
measures design. For example, in a longitudinal study, the number of time points may vary across
persons, and the spacing between time points may be different for different persons. Such
unbalanced designs would pose problems for standard methods of analysis such as the analysis of
variance.
Suppose, however, that the aim of the study is to observe every participant according to a fixed
design with, say, T observations per person. The design might involve T observation times or T
different outcome variables or even T different experimental conditions. Given the fixed design, the
analysis can be reconceived as a multivariate repeated measures analysis. The multivariate model is
flexible in allowing a wide variety of assumptions about the variation and covariation of the T
repeated measures (Bock, 1985). In the standard application of multivariate repeated measures, there
can be no missing outcomes: every participant must have a full complement of T repeated
observations.
Advances in statistical computation, beginning with the EM algorithm (Dempster, Laird, & Rubin,
1977; see also Jennrich & Schluchter, 1986), allow the estimation of multivariate normal models
from incomplete data. In this case, the aim of the study was to collect T observations per person, but
only n_j observations were collected (n_j ≤ T). These n_j observations are indeed collected according
to a fixed design, but T − n_j data points are missing at random.
HMLM allows estimation of multivariate normal models from incomplete data; HMLM2 allows for
study of multivariate outcomes for persons who are, in turn, nested within higher-level units. Within
the framework of HMLM, it is possible to estimate models having
5. A model with first-order auto-regressive level-1 random errors and random intercepts and/or
slopes at level 2.
We note that applications 2 - 4 are available within the standard HLM2. However, within HMLM,
models 2 - 4 can be compared to the unrestricted model (model 1), using a likelihood ratio test. No
"unrestricted model" can be meaningfully defined within the standard HLM2; such a model is
definable only within the confines of a fixed design with T measurements.
HMLM2 allows the five models listed above to be embedded within a nested structure, e.g., the
persons who are repeatedly observed may be nested within schools.
$$Y_{hi} = \sum_{t=1}^{T} m_{thi} Y^{*}_{ti} \tag{9.1}$$
where $Y_{hi}$ is the h-th outcome observed for person i. Here $Y^{*}_{ti}$ is the value that person i
would have displayed if that person had been observed at time t, and $m_{thi}$ is an indicator variable
taking on a value of 1 if the h-th measurement for person i did occur at time t, 0 if not. Thus, $Y^{*}_{ti}$,
t = 1, ..., T, represent the complete data for person i while $Y_{hi}$, h = 1, ..., $T_i$, are the observed
data, and the indicators $m_{thi}$ tell us the pattern of missing data for person i.
To make this clear, consider T = 5 and a person who has data at occasions 1,2, and 4, but not at
occasions 3 and 5. Then Equation 9.1 expands to
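$$\begin{pmatrix} Y_{1i} \\ Y_{2i} \\ Y_{3i} \end{pmatrix} =
\begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} Y^{*}_{1i} \\ Y^{*}_{2i} \\ Y^{*}_{3i} \\ Y^{*}_{4i} \\ Y^{*}_{5i} \end{pmatrix} \tag{9.2}$$
$$Y_i = M_i Y_i^{*} \tag{9.3}$$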
This model says simply that the three observed data points for person i were observed at times 1, 2,
and 4, so that data were missing at times 3 and 5. Although these data were missing, they do exist, in
principle. Thus, every participant has a full 5 × 1 vector of "complete data" even though the $T_i$ × 1
vector of observed data will vary in length across persons.
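The role of the indicator matrix can be illustrated with a small Python sketch that builds M_i for the example above (T = 5, data at occasions 1, 2, and 4) and applies it to a complete-data vector. The numeric values in `y_star` are hypothetical placeholders, and the function names are ours.

```python
def indicator_matrix(observed_occasions, T):
    """Rows of M_i: one row per observed record, with a 1 in the column of its occasion."""
    return [[1 if t == occasion else 0 for t in range(1, T + 1)]
            for occasion in observed_occasions]


def observed_from_complete(M, y_complete):
    """Compute Y_i = M_i Y_i* with plain lists."""
    return [sum(m * y for m, y in zip(row, y_complete)) for row in M]


# Person observed at occasions 1, 2, and 4 out of T = 5.
M = indicator_matrix([1, 2, 4], T=5)

# Hypothetical complete-data vector, for illustration only.
y_star = [0.11, 0.20, 0.29, 0.36, 0.45]
y_obs = observed_from_complete(M, y_star)

for row in M:
    print(row)
print(y_obs)
```

Each row of M_i simply picks out one element of the complete-data vector, so the length of the observed vector equals the number of rows.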
We now pose a structural model for the within-person variation in Y*:
$$Y^{*}_{ti} = \pi_{0i} + \sum_{p=1}^{P} \pi_{pi} a_{pt} + \varepsilon_{ti} \tag{9.4}$$
where we assume that εi is multivariate normal in distribution with a mean vector of 0 and an
arbitrary T × T covariance matrix Δ . In fact, Δ is not a "within-person" covariance. Rather, it
captures all variation and covariation among the T repeated observations.
9.1.2 Level-2 model
The level-2 model includes covariates, X i , that vary between persons:
$$\pi_{pi} = \beta_{p0} + \sum_{q=1}^{Q} \beta_{pq} X_{qi} \tag{9.6}$$
or in matrix notation
$$\pi_i = X_i \beta \tag{9.7}$$
Note there is no random variation between persons in the regression coefficients π pi because all
random variation has been absorbed into Δ (see the text below Equation 9.5).
9.1.3 Combined model
Substituting the level-2 model into the level-1 model gives the combined model for the complete
data, in matrix form:
$$Y_i^{*} = A X_i \beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \Delta) \tag{9.8}$$
Here the design matrix captures main effects of within-person covariates (the a's), main effects of
person-level covariates (the X's), and two-way interaction effects between them (a × X terms).
In sum, our reformulation poses a "multiple measures" model (Equation 9.3) that relates the
observed data $Y_i$ to the "complete data" $Y_i^{*}$, that is, the data that would have been observed if the
researcher had been successful in obtaining outcome data at every time point. Our combined model
is a standard multivariate normal regression model for the complete data.
Algebraically substituting the combined model expression for $Y_i^{*}$ into the model for the observed
data (Equation 9.3) yields the combined model
$$Y_i = M_i A X_i \beta + M_i \varepsilon_i. \tag{9.9}$$
Under the unrestricted model, the number of parameters estimated is f + T(T + 1)/2, where f is the
number of fixed effects and T is the number of observations intended for each person. The models
below impose constraints on the unrestricted model, and therefore include fewer parameters. The fit
of these simpler models to the data can be compared to the fit of the unrestricted model using a
likelihood ratio test.
Under the special case in which the within-person design is fixed, with T observations per person
and randomly missing time points, the two-level HLM can be derived from the unrestricted model by
imposing restrictions on the covariance matrix, Δ. (Note: regressors $A_i$ having varying designs may
be included in the level-1 model, but coefficients associated with such $A_i$ values must not have
random effects at level 2.) The most frequently used assumption in the standard HLM is that the
within-person residuals are independent with a constant variance, σ².
with $\Sigma = \sigma^2 I_T$.
$$\pi_{pi} = \beta_{p0} + \sum_{q=1}^{Q_p} \beta_{pq} X_{qi} + r_{pi} \tag{9.11}$$
or in matrix notation
$$\pi_i = X_i \beta + r_i \tag{9.12}$$
All of the usual forms are now available for the intercepts and slopes (fixed, randomly varying,
non-randomly varying), provided T is large enough.
$$Y_i^{*} = A X_i \beta + A r_i + e_i = A X_i \beta + \varepsilon_i, \tag{9.13}$$
where $\varepsilon_i = A r_i + e_i$ has variance-covariance matrix
$$\mathrm{Var}(\varepsilon_i) = \mathrm{Var}(A r_i + e_i) = A \tau A' + \sigma^2 I_T = \Delta. \tag{9.14}$$
Under the HLM with homogeneous level-1 variance, the number of parameters estimated is
f + r(r + 1)/2 + 1, where r is the dimension of τ. Thus, r must be less than T.
that is, Σ is now diagonal with elements $\sigma_t^2$, the variance associated with occasion t, t = 1, …, T.
$$\log(\sigma_t^2) = \alpha_0 + \sum_{l=1}^{L} \alpha_l c_{lt}. \tag{9.16}$$
Thus, the natural log of the level-1 variance may be a linear or quadratic function of age. If the
explanatory variables $c_l$ are T − 1 dummy variables, each indicating the occasion of measurement,
the results will duplicate those of the previous section.
The number of parameters estimated is now f + r(r + 1)/2 + L + 1. Again, r must be no larger than
T − 1 and L must be no larger than T − 1.
$$\mathrm{Cov}(e_{ti}, e_{t'i}) = \sigma^2 \rho^{|t - t'|}. \tag{9.17}$$
Thus, the variance at each time point is σ² and each correlation diminishes with the distance
between time points, so that the correlations are ρ, ρ², ρ³, ... as the distance between occasions is 1,
2, 3, .... The number of parameters estimated is now f + r(r + 1)/2 + 2. Again, r must be no larger
than T − 1.
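The first-order auto-regressive structure in Equation 9.17 can be made concrete with a short sketch that builds the implied T × T level-1 covariance matrix. The values of σ² and ρ below are hypothetical and serve only to show the pattern; the function name is ours.

```python
def ar1_covariance(sigma2, rho, T):
    """T x T matrix with entries sigma2 * rho ** |t - t'| (first-order auto-regressive)."""
    return [[sigma2 * rho ** abs(t - s) for s in range(T)] for t in range(T)]


# Hypothetical parameter values, for illustration only.
sigma2, rho, T = 0.04, 0.5, 5
cov = ar1_covariance(sigma2, rho, T)

for row in cov:
    print([round(value, 5) for value in row])
```

Only two parameters (σ² and ρ) generate all T(T + 1)/2 distinct entries, which is where the "+ 2" in the parameter count comes from.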
Note that level-1 predictors are assumed to have the same values for all level-2 units of the complete
data. This assumption can be relaxed. However, if the design for $a_{pti}$ varies over i, its coefficient
cannot vary randomly at level 2. In this regard, the standard 2-level model (see Chapters 2, 3) is
more flexible than HMLM.
$$Y_{hij} = \sum_{t=1}^{T} m_{thij} Y^{*}_{tij}. \tag{9.18}$$
Here individual i is nested within group j (j = 1, …, J) and we have $Y_{hij}$, the h-th outcome observed
for person i in group j. Here $Y^{*}_{tij}$ is the value that person i would have displayed if that person had
been observed at time t, and $m_{thij}$ is an indicator variable taking on a value of 1 if the h-th
measurement for that person did occur at time t, 0 if not. Thus $Y^{*}_{tij}$, t = 1, …, T, represent the
complete data for person i in group j while $Y_{hij}$, h = 1, …, $T_i$, are the observed data, and the
indicators $m_{thij}$ tell us the pattern of the missing data. Again, we pose a structural model for the
within-person variation in Y*:
$$Y^{*}_{tij} = \pi_{0ij} + \sum_{p=1}^{P} \pi_{pij} a_{pt} + e_{tij}, \tag{9.19}$$
$$\pi_{ij} = X_{ij} \beta_j \tag{9.22}$$
9.5.3 Level-3 model
Now the coefficients defined on persons (in the level-2 model) are specified as possibly varying at
level-3 over groups:
$$\beta_{pqj} = \gamma_{pq0} + \sum_{s=1}^{S_{pq}} \gamma_{pqs} W_{sj} + u_{pqj}. \tag{9.23}$$
Here the vector u j , composed of elements u pqj is multivariate normal in distribution with a zero
mean vector and covariance matrix τ β .
where
Note that level-1 predictors $a_{pt}$ are assumed to have the same values for all level-2 units of the
complete data. This assumption can be relaxed. However, if the design for $a_{ptij}$ varies over i and j,
the coefficient for $a_{ptij}$, that is $\pi_{pij}$, must have no random effect at level 2. In this regard, the
standard three-level model (see Chapters 3 and 4) is more flexible than is HMLM2.
10 Working with HMLM/HMLM2
Like the other programs, HMLM and HMLM2 execute analyses using MDM (multivariate data matrix)
files, which consist of the combined level-1 and level-2 data files.
The procedures for constructing the MDM file are similar to the ones for HLM2 and HLM3 with one
major difference: the user has to create and input indicator variables for the outcome(s) while
constructing the MDM file. Model specification for HMLM and HMLM2 involves the same mechanics
as in HLM2 and HLM3 with an extra step of model covariance structure selection.
Below we provide two examples using data sets from the first cohort of the National Youth Survey
(Elliot, Huizinga, & Menard, 1989; Raudenbush, 1999) and the time-series observations on 1,721
students nested within 60 public primary schools as described in Chapter 8. Windows mode
execution is illustrated. See Appendix E for interactive and batch mode execution.
The range of options for data input are the same as for HLM2 and HLM3. We will use SPSS file input
in our example.
10.1.1.1 Level-1 file
The level-1 file, NYS1.SAV, has 1,079 observations collected by annually interviewing 239
eleven-year-old youths for five consecutive years, beginning in 1976. Therefore, T = 5. The variables
and the T indicator variables are:
ATTIT     a 9-item scale assessing attitudes favorable to deviant behavior.
          Subjects were asked how wrong (very wrong, wrong, a little bit wrong, not wrong
          at all) they believe it is for someone their age to, for example, damage and destroy
          property, use marijuana, use alcohol, sell hard drugs, or steal.
Subjects were asked how wrong their best friends thought the nine deviant
behaviors surveyed in the ATTIT scale were.
AGE13     age of participant at a specific time minus 13
AGE11s    AGE11 * AGE11
AGE13s    AGE13 * AGE13
IND1      indicator for measure at time 1
IND2      indicator for measure at time 2
IND3      indicator for measure at time 3
IND4      indicator for measure at time 4
IND5      indicator for measure at time 5
The five indicators were created to facilitate use of HMLM. Data for the first two children are shown
in Fig. 10.1.
Child 15 had data at all five years. Child 33, however, did not have data for the fourth year.
The level-2 data file, NYSB.SAV, consists of three variables on 239 youths. The file has the same
structure as that for HLM2. The variables are:
The steps are very similar to the ones described in Section 2.5.1. Select HMLM as the MDM type in
the Select MDM type dialog box (see Figure 2.4) and inform WHLM of the type of data input.
While the structure of HMLM input files is almost the same as in HLM2, there is one important
difference: the indicator variables. In order to create these, one first needs to know the maximum
number of level-1 records per level-2 group; this determines the number of indicators. We shall call
this the number of "occasions." (This is the number of time points in a repeated measures study or
the number of outcome variables in a cross-sectional multivariate study. Also note that each person
does not need to have this number of occasions.) Then create the indicator variables so that a given
variable takes on the value of 1.0 if the given occasion is at this time point, 0.0 otherwise. Looking
at Figure 10.1, we see that IND1 is 1 if AGE11 is 0, IND2 is 1 if AGE11 is 1, IND3 is 1 if AGE11 is 2,
and so on. Fig. 10.2 shows the Choose variables – HMLM dialog box where the indicator variables
are checked before the MDM file is created. This dialog box can be opened from the Level-1
specification section in the Make MDM – HMLM dialog box.
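The indicator variables are normally created in the statistical package used to prepare the level-1 file. As a language-neutral illustration of the rule just described (IND1 is 1 if AGE11 is 0, IND2 is 1 if AGE11 is 1, and so on), here is a minimal Python sketch. The record layout as a list of dictionaries is a hypothetical stand-in for the data file, and the AGE11 values shown are an assumption consistent with a child who has no record for the fourth year.

```python
def occasion_indicators(occasion_index, T):
    """Return [IND1, ..., INDT] for a record measured at the given zero-based occasion."""
    if not 0 <= occasion_index < T:
        raise ValueError("occasion index must lie between 0 and T - 1")
    return [1 if k == occasion_index else 0 for k in range(T)]


T = 5
# Hypothetical level-1 records for one child with no record at AGE11 = 3.
records = [{"ID": 33, "AGE11": age} for age in (0, 1, 2, 4)]

for record in records:
    flags = occasion_indicators(record["AGE11"], T)
    for k, flag in enumerate(flags, start=1):
        record[f"IND{k}"] = flag

for record in records:
    print(record)
```

Every record has exactly one indicator equal to 1, and an occasion with no record (here the fourth) simply never has its indicator switched on for that child.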
The steps involved are similar to the ones for HLM2 as described in Section 2.5.2. It is necessary to
specify
Under HMLM, level-1 predictors having random effects must have the same value for all participants
at a given occasion. If the user specifies a predictor not fulfilling this condition to have a random
effect, such coefficients will be automatically set as non-random by the program. Furthermore, an
extra step for selecting the covariance structure for the models to be estimated is needed. Figure 10.3
displays the model specified for our example. Figure 10.4 shows the dialog box where the
covariance structure is selected.
• an unrestricted model,
• the homogeneous model, σ_t² = σ² for all t, and
• the heterogeneous model, which allows σ_t² to vary over time.
These three models are requested simply by checking the Heterogeneous option in the Basic
Figure 10.4 Basic Model Specifications - HMLM dialog box
Similarly, checking the Log-linear button will produce output on:
In this case a modified model will be displayed, as shown in Fig. 10.5. To obtain this model, the
Predictors of level-1 variance dialog box was used to select the variable EXPO.
Figure 10.5 Model specification window for the NYS example: loglinear model selection
And, again similarly, choosing the 1st order auto-regressive option will produce unrestricted and
homogeneous results in addition to first-order auto-regressive results.
Level-1 Coefficients        Level-2 Predictors
INTRCPT1, π0                INTRCPT2, β00
# AGE13 slope, π1           INTRCPT2, β10
# AGE13S slope, π2          INTRCPT2, β20
'#' - The residual parameter variance for this level-1 coefficient has been set to zero.
Output for the Unrestricted Model
Level-1 Model
π0i = β00
π1i = β10
π2i = β20
For the unrestricted model, there is no random variation between persons in the regression
coefficients π0, π1, and π2 because all random variation has been absorbed into Δ.
Var(εi) = Δ
Δ(0)
IND1 0.03507 0.01671 0.01889 0.02149 0.02486
IND2 0.01671 0.04458 0.02779 0.02468 0.02714
IND3 0.01889 0.02779 0.07272 0.05303 0.04801
IND4 0.02149 0.02468 0.05303 0.08574 0.06636
IND5 0.02486 0.02714 0.04801 0.06636 0.08985
The 5 × 5 matrix Δ contains the maximum likelihood estimates of the five variances (one for each
time point) and ten covariances (one for each pair of time points). The associated correlation matrix
is printed below.
Standard errors of Δ
IND1 0.00347 0.00304 0.00375 0.00413 0.00429
IND2 0.00304 0.00434 0.00430 0.00457 0.00473
Δ (as correlations)
IND1 1.000 0.423 0.374 0.392 0.443
IND2 0.423 1.000 0.488 0.399 0.429
IND3 0.374 0.488 1.000 0.672 0.594
IND4 0.392 0.399 0.672 1.000 0.756
IND5 0.443 0.429 0.594 0.756 1.000
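The correlation form of Δ follows from the covariance form by dividing each covariance by the product of the two corresponding standard deviations. The sketch below applies this to the unrestricted estimate of Δ printed earlier; the function name is ours, and small differences in the third decimal place can arise because the printed covariances are rounded.

```python
import math

# Unrestricted estimate of Delta, as printed in the output (rows IND1 to IND5).
delta = [
    [0.03507, 0.01671, 0.01889, 0.02149, 0.02486],
    [0.01671, 0.04458, 0.02779, 0.02468, 0.02714],
    [0.01889, 0.02779, 0.07272, 0.05303, 0.04801],
    [0.02149, 0.02468, 0.05303, 0.08574, 0.06636],
    [0.02486, 0.02714, 0.04801, 0.06636, 0.08985],
]


def to_correlations(cov):
    """Convert a covariance matrix to a correlation matrix."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
            for i in range(len(cov))]


corr = to_correlations(delta)
for row in corr:
    print([round(value, 3) for value in row])
```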
The 5 × 5 matrix above contains estimated standard errors for each element of Δ.
The value of the log-likelihood function at iteration 8 = 1.891335E+002
Fixed Effect                  Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
    INTRCPT2, β00               0.320244       0.014981       21.377       238         <0.001
For AGE13 slope, π1
    INTRCPT2, β10               0.059335       0.004710       12.598       238         <0.001
For AGE13S slope, π2
    INTRCPT2, β20               0.000330       0.003146        0.105       238          0.917
The expected log attitude at age 13 is 0.320244. The mean linear growth rate is estimated
to be 0.059335, t = 12.598, indicating a highly significant positive average rate of increase in
deviant attitude at age 13. The quadratic rate is not statistically significant.
Deviance = -378.266936
Number of estimated parameters = 18
There are 3 fixed effects (f = 3) and five observations in the "complete data" for each person (T =
5). Thus, there are a total of f + T(T + 1)/2 = 3 + 5(5 + 1)/2 = 18 parameters. This is the end of the
unrestricted model output.
Next follow the results for the model with homogeneous level-1 variance.
Level-1 Model
Level-2 Model
A
IND1 1.00000 -2.00000 4.00000
IND2 1.00000 -1.00000 1.00000
IND3 1.00000 0.00000 0.00000
IND4 1.00000 1.00000 1.00000
IND5 1.00000 2.00000 4.00000
The above matrix describes the design matrix on occasions one through five.
Iterations stopped due to small change in likelihood function
Note: The results below duplicate exactly the results produced by a standard HLM2 run using
homogeneous level-1 variance.
Final Results - Iteration 5
τ
INTRCPT1,r0 0.04200 0.00808 -0.00242
AGE13,r1 0.00808 0.00277 -0.00012
AGE13S,r2 -0.00242 -0.00012 0.00049
Standard errors of τ
INTRCPT1,r0 0.00513 0.00127 0.00089
AGE13,r1 0.00127 0.00054 0.00024
τ (as correlations)
INTRCPT1,r0 1.000 0.749 -0.532
AGE13,r1 0.749 1.000 -0.101
AGE13S,r2 -0.532 -0.101 1.000
Δ
IND1 0.03536 0.01388 0.01616 0.01801 0.01943
IND2 0.01388 0.04870 0.03150 0.03488 0.03464
IND3 0.01616 0.03150 0.06620 0.04766 0.04849
IND4 0.01801 0.03488 0.04766 0.08056 0.06095
IND5 0.01943 0.03464 0.04849 0.06095 0.09625
The 5 × 5 matrix above contains the five variance and ten covariance estimates implied by the
"homogeneous level-1 variance" model.
Δ (as correlations)
IND1 1.000 0.334 0.334 0.338 0.333
IND2 0.334 1.000 0.555 0.557 0.506
IND3 0.334 0.555 1.000 0.653 0.607
IND4 0.338 0.557 0.653 1.000 0.692
IND5 0.333 0.506 0.607 0.692 1.000
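The Δ matrix implied by the homogeneous model can be approximately reproduced from the design matrix A and the estimate of τ printed above, using Δ = AτA' + σ²I. The level-1 variance σ² is not shown in this excerpt, so the sketch infers it from the third diagonal element of Δ (where the design row is 1, 0, 0, so that Δ33 = τ00 + σ²); that inference is our assumption, not program output. Because τ is printed to five decimals, the reconstructed entries agree with the printed Δ only up to rounding.

```python
# Design matrix A (rows IND1 to IND5) and tau, as printed in the output.
A = [
    [1.0, -2.0, 4.0],
    [1.0, -1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 4.0],
]
tau = [
    [0.04200, 0.00808, -0.00242],
    [0.00808, 0.00277, -0.00012],
    [-0.00242, -0.00012, 0.00049],
]

# Assumption: sigma^2 inferred from Delta[IND3][IND3] - tau[0][0].
sigma2 = 0.06620 - 0.04200


def implied_delta(A, tau, sigma2):
    """Compute A tau A' + sigma2 * I with plain lists."""
    T, r = len(A), len(tau)
    delta = []
    for s in range(T):
        row = []
        for t in range(T):
            value = sum(A[s][p] * tau[p][q] * A[t][q]
                        for p in range(r) for q in range(r))
            row.append(value + (sigma2 if s == t else 0.0))
        delta.append(row)
    return delta


delta_hat = implied_delta(A, tau, sigma2)
for row in delta_hat:
    print([round(value, 5) for value in row])
```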
The value of the log-likelihood function at iteration 5 = 1.741132E+002
Fixed Effect                  Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
    INTRCPT2, β00               0.327231       0.015306       21.379       238         <0.001
For AGE13 slope, π1
    INTRCPT2, β10               0.064704       0.004926       13.135       238         <0.001
For AGE13S slope, π2
    INTRCPT2, β20               0.000171       0.003218        0.053       238          0.958
Statistics for the current model
Deviance = -348.226421
Number of estimated parameters = 10
There are 3 fixed effects (f = 3); the dimension of τ is 3, and a common σ² is estimated at level 1.
Thus, there are a total of f + r(r + 1)/2 + 1 = 3 + 3(3 + 1)/2 + 1 = 10 parameters.
This is the end of the output for the "homogeneous level-1 variance" model. Finally, the
heterogeneous level-1 variance solution is listed.
Output for Random Effects Model with Heterogeneous Level-1 Variance
Level-1 Model
Level-2 Model
$$\mathrm{Var}(Y^{*}) = A \tau A' + \Sigma$$
where $\Sigma = \mathrm{diag}\{\sigma_t^2\}$, that is, Σ is now a diagonal matrix with diagonal elements $\sigma_t^2$, the
variance associated with occasion t, t = 1, 2, …, T.
A
IND1 1.00000 -2.00000 4.00000
IND2 1.00000 -1.00000 1.00000
IND3 1.00000 0.00000 0.00000
IND4 1.00000 1.00000 1.00000
IND5 1.00000 2.00000 4.00000
           σ2       Standard Error
IND1    0.01373       0.005672
IND2    0.02600       0.003296
IND3    0.02685       0.003658
IND4    0.02602       0.003633
IND5    0.00275       0.007377
The five estimates above are the estimates of the level-1 variance for each time point.
τ
INTRCPT1,r0 0.04079 0.00736 -0.00241
AGE13,r1 0.00736 0.00382 0.00025
AGE13S,r2 -0.00241 0.00025 0.00106
Standard errors of τ
τ (as correlations)
INTRCPT1,r0 1.000 0.590 -0.366
Δ
IND1 0.03410 0.01707 0.01646 0.01851 0.02325
IND2 0.01707 0.05165 0.03103 0.03322 0.03223
IND3 0.01646 0.03103 0.06764 0.04574 0.04588
IND4 0.01851 0.03322 0.04574 0.08208 0.06421
IND5 0.02325 0.03223 0.04588 0.06421 0.08996
The 5 × 5 matrix above contains the estimates of five variances and ten covariances implied by the
"heterogeneous level-1 variance" model.
Δ (as correlations)
IND1 1.000 0.407 0.343 0.350 0.420
IND2 0.407 1.000 0.525 0.510 0.473
IND3 0.343 0.525 1.000 0.614 0.588
IND4 0.350 0.510 0.614 1.000 0.747
IND5 0.420 0.473 0.588 0.747 1.000
The value of the log-likelihood function at iteration 8 = 1.816074E+002
Fixed Effect   Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
INTRCPT2, β00 0.327646 0.015252 21.482 238 <0.001
For AGE13 slope, π1
INTRCPT2, β10 0.060864 0.004737 12.849 238 <0.001
For AGE13S slope, π2
INTRCPT2, β20 -0.000541 0.003178 -0.170 238 0.865
Statistics for the current model
Deviance = -363.214879
Number of estimated parameters = 14
There are 3 fixed effects (f = 3), the dimension of τ is 3, and there are five observations intended
for each person, each associated with a unique level-1 variance. Thus, there are a total of
f + r (r + 1) / 2 + T = 3 + 3(4) / 2 + 5 = 14 parameters.
The model deviances are employed to evaluate the fits of the three models (unrestricted,
homogeneous σ², and heterogeneous σ²). Differences between deviances are distributed
asymptotically as chi-square variates under the null hypothesis that the simpler model fits the data
as well as the more complex model does. The results show that Model 1 fits better than does the
homogeneous σ² model (χ² = 30.04052, df = 8); it also fits better than does the
heterogeneous σ² model (χ² = 15.05206, df = 4).
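To make the comparison concrete, the following sketch converts the two reported chi-square statistics
into upper-tail probabilities. It is not HLM's own routine: it uses a closed-form expression for the
chi-square survival function that holds only for even degrees of freedom (which covers df = 8 and
df = 4), and the function name is made up for illustration.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with even df.

    For df = 2k the survival function is exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Statistics quoted in the text above.
p_vs_homogeneous = chi2_sf_even_df(30.04052, 8)
p_vs_heterogeneous = chi2_sf_even_df(15.05206, 4)
print(f"homogeneous model:   p = {p_vs_homogeneous:.6f}")
print(f"heterogeneous model: p = {p_vs_heterogeneous:.6f}")
```

Both probabilities fall below 0.01, which is consistent with the statement that the more complex model
fits better than either restricted alternative.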
In addition to the evaluation of models based on their fit to the data, the above results can be used to
check the sensitivity of key inferences to alternative specifications of the variance-covariance
structure. For instance, one could compare the mean and variance in the rate of change at age 13
obtained in Model 2 and Model 3 to assess how robust the results are to alternative plausible
covariance specifications. The mean rate, β10, for Model 2 is 0.064704 (s.e. = 0.004926), and the
variance, τ22, is 0.00277 (s.e. = 0.00054). The mean rate, β10, for Model 3 is 0.060864 (s.e. =
0.004737), and the variance, τ22, is 0.00382 (s.e. = 0.00066). The results are basically similar. See
Raudenbush (2001) for a more detailed analysis of alternative covariance structures for polynomial
models of individual growth and change using the same NYS data sets employed here for the
illustrations.
Output for Random Effects Model for Log-linear model for Level-1 Variance
Summary of the model specified
Level-1 Model
Level-2 Model
Var(Y*) = ATA' + Σ
where Σ = diag(σ_t²), and
log(σ_t²) = α0 + α1(EXPO)_t.
A
IND1 1.00000 -2.00000 4.00000
IND2 1.00000 -1.00000 1.00000
IND3 1.00000 0.00000 0.00000
IND4 1.00000 1.00000 1.00000
IND5 1.00000 2.00000 4.00000
τ
INTRCPT1,r0 0.04255 0.00831 -0.00257
AGE13,r1 0.00831 0.00277 -0.00005
AGE13S,r2 -0.00257 -0.00005 0.00051
Standard errors of τ
INTRCPT1,r0 0.00517 0.00128 0.00089
AGE13,r1 0.00128 0.00054 0.00025
AGE13S,r2 0.00089 0.00025 0.00025
τ (as correlations)
INTRCPT1,r0 1.000 0.766 -0.549
AGE13,r1 0.766 1.000 -0.042
AGE13S,r2 -0.549 -0.042 1.000
Δ
IND1 0.03576 0.01267 0.01566 0.01782 0.01917
The 5 × 5 matrix above contains the variance and covariance estimates implied by the "log-linear"
model.
Δ (as correlations)
IND1 1.000 0.297 0.320 0.335 0.329
IND2 0.297 1.000 0.543 0.554 0.498
IND3 0.320 0.543 1.000 0.665 0.614
IND4 0.335 0.554 0.665 1.000 0.714
IND5 0.329 0.498 0.614 0.714 1.000
Fixed Effect   Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
INTRCPT2, β00 0.328946 0.015379 21.390 238 <0.001
For AGE13 slope, π1
INTRCPT2, β10 0.064661 0.004923 13.135 238 <0.001
For AGE13S slope, π2
INTRCPT2, β20 -0.000535 0.003222 -0.166 238 0.869
Deviance = -349.916489
Number of estimated parameters = 11
There are 3 fixed effects (f = 3), the dimension of τ is 3 (r = 3), and the log-linear model for the
level-1 variance has 1 intercept and 1 explanatory variable (H = 1). Thus, there are a total of
f + r(r + 1)/2 + 1 + H = 3 + 3(3 + 1)/2 + 1 + 1 = 11 parameters.
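The parameter counts reported for the homogeneous, heterogeneous, and log-linear models all follow the
same pattern: fixed effects, plus the unique elements of τ, plus however many parameters describe the
level-1 variance. A minimal sketch of that bookkeeping, with a hypothetical helper name, is:

```python
def count_parameters(f, r, level1_params):
    """Fixed effects + unique elements of an r x r tau + level-1 variance parameters."""
    return f + r * (r + 1) // 2 + level1_params


# Homogeneous model: one common sigma^2.
homogeneous = count_parameters(f=3, r=3, level1_params=1)
# Heterogeneous model: one variance per occasion, T = 5.
heterogeneous = count_parameters(f=3, r=3, level1_params=5)
# Log-linear model: one intercept plus H = 1 explanatory variable.
log_linear = count_parameters(f=3, r=3, level1_params=1 + 1)

print(homogeneous, heterogeneous, log_linear)
```

The three results can be compared with the "Number of estimated parameters" lines in the corresponding
output listings.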
Next are the results for the first-order auto-regressive model (example NYS4.MLM).
Output for Random Effects Model First-order Autoregressive Model for Level-1 Variance
Level-1 Model
ATTITmi = (IND1mi)*ATTIT1i* + (IND2mi)*ATTIT2i* + (IND3mi)*ATTIT3i* + (IND4mi)*ATTIT4i* +
(IND5mi)*ATTIT5i*
Level-2 Model
Note that π1 and π2 are specified as non-random because the time series is relatively
short and therefore the data do not allow the estimation of both random slopes and an
autocorrelation parameter.
The above equation, written with subscripts and Greek letters, is
Var(Y*) = ATA' + Σ
where
Σ = σ² ρ^|t − t′|.
A
IND1 1.00000
IND2 1.00000
IND3 1.00000
IND4 1.00000
IND5 1.00000
Iterations stopped due to small change in likelihood function
Note that the maximum-likelihood estimate of ρ, ρ̂ = 0.397, is much larger than its standard error
(0.054), suggesting a significantly positive autocorrelation.
τ
INTRCPT1,r0 0.02427
Standard error of τ
INTRCPT1,r0 0.00450
Δ
IND1 0.06585 0.04077 0.03081 0.02686 0.02530
IND2 0.04077 0.06585 0.04077 0.03081 0.02686
IND3 0.03081 0.04077 0.06585 0.04077 0.03081
IND4 0.02686 0.03081 0.04077 0.06585 0.04077
IND5 0.02530 0.02686 0.03081 0.04077 0.06585
The 5 × 5 matrix above contains the variance and covariance estimates implied by the
"auto-regressive" model.
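The banded pattern in Δ can be reproduced, approximately, from three numbers. The sketch below assumes
that each entry equals the random-intercept variance plus σ² ρ^|t − t′|, which follows from the model
statement when A is a single column of ones. The value of σ² is not printed in the excerpt above, so it
is backed out here as the printed diagonal of Δ minus the printed τ; that derivation, and all names in
the code, are assumptions made for illustration.

```python
# Sketch: rebuild the AR(1) implied covariance matrix from printed estimates.
tau00 = 0.02427            # printed variance of the random intercept
rho = 0.397                # printed autocorrelation estimate
sigma2 = 0.06585 - tau00   # assumption: derived from the printed diagonal of Delta


def ar1_delta(intercept_variance, level1_variance, autocorrelation, n_occasions):
    """Return the matrix with entries tau00 + sigma^2 * rho^|s - t|."""
    return [
        [
            intercept_variance + level1_variance * autocorrelation ** abs(s - t)
            for t in range(n_occasions)
        ]
        for s in range(n_occasions)
    ]


delta_ar1 = ar1_delta(tau00, sigma2, rho, 5)
for row in delta_ar1:
    print("  ".join(f"{value:8.5f}" for value in row))
```

Because ρ is rounded to three decimals, the off-diagonal entries should be expected to match the
printed matrix only to about four decimal places.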
Δ (as correlations)
IND1 1.000 0.619 0.468 0.408 0.384
IND2 0.619 1.000 0.619 0.468 0.408
IND3 0.468 0.619 1.000 0.619 0.468
Final estimation of fixed effects:
Fixed Effect   Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
INTRCPT2, β00 0.327579 0.015265 21.459 238 <0.001
For AGE13 slope, π1
INTRCPT2, β10 0.061428 0.004836 12.703 1076 <0.001
For AGE13S slope, π2
INTRCPT2, β20 0.000211 0.003373 0.062 1076 0.951
Statistics for the current model
Deviance = -294.319916
Number of estimated parameters = 6
π0ij = β00j
π1ij = β10j
Level-3 Model
Var(εij) = Δ
Δ(0)
IND1 0.04268 0.01233 0.01919 0.01968 0.01506 0.00898
IND2 0.01233 0.60634 0.35457 0.42101 0.31132 0.24927
IND3 0.01919 0.35457 0.76957 0.62363 0.42394 0.35205
IND4 0.01968 0.42101 0.62363 1.15453 0.67302 0.52773
IND5 0.01506 0.31132 0.42394 0.67302 0.81870 0.55086
IND6 0.00898 0.24927 0.35205 0.52773 0.55086 0.65701
τβ(0)
INTRCPT1 YEAR
INTRCPT2 ,β00 INTRCPT2 ,β10
0.20128 0.01542
0.01542 0.01608
The value of the log-likelihood function at iteration 5 = -8.097070E+003
Δ
IND1 0.67340 0.31616 0.38755 0.52412 0.53030 0.38971
IND2 0.31616 0.77832 0.47127 0.56726 0.54171 0.50187
IND3 0.38755 0.47127 0.91072 0.76829 0.66199 0.64640
IND4 0.52412 0.56726 0.76829 1.24542 0.88364 0.81782
IND5 0.53030 0.54171 0.66199 0.88364 1.05646 0.84356
IND6 0.38971 0.50187 0.64640 0.81782 0.84356 0.98722
Standard errors of Δ
IND1 0.08003 0.05328 0.07256 0.02999 0.02811 0.04341
IND2 0.05328 0.05757 0.06998 0.02542 0.03289 0.03656
IND3 0.07256 0.06998 0.07252 0.02966 0.03284 0.03565
IND4 0.02999 0.02542 0.02966 0.02844 0.03044 0.03913
IND5 0.02811 0.03289 0.03284 0.03044 0.03030 0.03518
IND6 0.04341 0.03656 0.03565 0.03913 0.03518 0.03859
Δ (as correlations)
IND1 ,π0 1.000 0.437 0.495 0.572 0.629 0.478
IND2 ,π1 0.437 1.000 0.560 0.576 0.597 0.573
IND3 ,π2 0.495 0.560 1.000 0.721 0.675 0.682
IND4 ,π3 0.572 0.576 0.721 1.000 0.770 0.738
IND5 ,π4 0.629 0.597 0.675 0.770 1.000 0.826
IND6 ,π5 0.478 0.573 0.682 0.738 0.826 1.000
τβ
INTRCPT1 YEAR
INTRCPT2 ,β00 INTRCPT2 ,β10
0.14824 0.01268
0.01268 0.00935
Standard Errors of τβ
INTRCPT1 YEAR
INTRCPT2 ,β00 INTRCPT2 ,β10
0.03286 0.00626
0.00626 0.00218
τβ (as correlations)
INTRCPT1/INTRCPT2,β00 1.000 0.341
YEAR/INTRCPT2,β10 0.341 1.000
Final estimation of fixed effects:
Fixed Effect   Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
For INTRCPT2, β00
INTRCPT3 ,γ000 -0.824938 0.054960 -15.010 59 <0.001
For YEAR slope, π1
For INTRCPT2, β10
INTRCPT3 ,γ100 0.755026 0.014229 53.062 59 <0.001
Statistics for the current model
Deviance = 15960.507331
Number of estimated parameters = 26
Level-1 Model
Level-2 Model
Final Results - Iteration 5
τπ
INTRCPT1,r0 0.64046 0.04679
YEAR,r1 0.04679 0.01126
Standard errors of τπ
INTRCPT1,r0 0.02515 0.00499
YEAR,r1 0.00499 0.00197
τπ (as correlations)
INTRCPT1,r0 1.000 0.551
YEAR,r1 0.551 1.000
Δ
IND1 0.77832 0.49553 0.51417 0.53282 0.55146 0.57011
IND2 0.49553 0.82687 0.55533 0.58523 0.61513 0.64503
IND3 0.51417 0.55533 0.89793 0.63765 0.67880 0.71996
IND4 0.53282 0.58523 0.63765 0.99150 0.74247 0.79489
IND5 0.55146 0.61513 0.67880 0.74247 1.10758 0.86981
IND6 0.57011 0.64503 0.71996 0.79489 0.86981 1.24618
Δ (as correlations)
IND1 ,π0 1.000 0.618 0.615 0.607 0.594 0.579
IND2 ,π1 0.618 1.000 0.644 0.646 0.643 0.635
IND3 ,π2 0.615 0.644 1.000 0.676 0.681 0.681
IND4 ,π3 0.607 0.646 0.676 1.000 0.709 0.715
IND5 ,π4 0.594 0.643 0.681 0.709 1.000 0.740
IND6 ,π5 0.579 0.635 0.681 0.715 0.740 1.000
τβ
INTRCPT1 YEAR
INTRCPT2 ,β00 INTRCPT2 ,β10
0.16532 0.01705
0.01705 0.01102
Standard Errors of τβ
INTRCPT1 YEAR
INTRCPT2 ,β00 INTRCPT2 ,β10
0.03641 0.00720
0.00720 0.00252
τβ (as correlations)
Final estimation of fixed effects:
Fixed Effect   Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
For INTRCPT2, β00
INTRCPT3 ,γ000 -0.779305 0.057829 -13.476 59 <0.001
For YEAR slope, π1
For INTRCPT2, β10
INTRCPT3 ,γ100 0.763028 0.015262 49.996 59 <0.001
Statistics for the current model
Deviance = 16326.231108
Number of estimated parameters = 9
Level-1 Model
Level-2 Model
Final Results - Iteration 7
        σ²        Standard Error
IND1    0.34891   0.059597
IND2    0.38314   0.020556
IND3    0.31846   0.014915
IND4    0.37849   0.015840
IND5    0.20344   0.011466
IND6    0.15546   0.014216
τπ
INTRCPT1,r0 0.62722 0.04769
YEAR,r1 0.04769 0.01386
Standard errors of τπ
INTRCPT1,r0 0.02499 0.00495
YEAR,r1 0.00495 0.00205
τπ (as correlations)
INTRCPT1,r0 1.000 0.511
YEAR,r1 0.511 1.000
Δ
IND1 0.82432 0.48844 0.50148 0.51451 0.52755 0.54058
IND2 0.48844 0.89848 0.54224 0.56913 0.59603 0.62293
IND3 0.50148 0.54224 0.90146 0.62376 0.66452 0.70528
IND4 0.51451 0.56913 0.62376 1.05687 0.73300 0.78762
IND5 0.52755 0.59603 0.66452 0.73300 1.00493 0.86997
IND6 0.54058 0.62293 0.70528 0.78762 0.86997 1.10778
Δ (as correlations)
IND1 ,π0 1.000 0.568 0.582 0.551 0.580 0.566
IND2 ,π1 0.568 1.000 0.603 0.584 0.627 0.624
IND3 ,π2 0.582 0.603 1.000 0.639 0.698 0.706
IND4 ,π3 0.551 0.584 0.639 1.000 0.711 0.728
IND5 ,π4 0.580 0.627 0.698 0.711 1.000 0.825
IND6 ,π5 0.566 0.624 0.706 0.728 0.825 1.000
τβ
INTRCPT1 YEAR
INTRCPT2 ,β00 INTRCPT2 ,β10
0.16531 0.01552
0.01552 0.00971
Standard Errors of τβ
INTRCPT1 YEAR
INTRCPT2 ,β00 INTRCPT2 ,β10
0.03637 0.00677
0.00677 0.00225
τβ (as correlations)
INTRCPT1/INTRCPT2,β00 1.000 0.387
YEAR/INTRCPT2,β10 0.387 1.000
Fixed Effect   Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
For INTRCPT2, β00
INTRCPT3 ,γ000 -0.781960 0.057792 -13.531 59 <0.001
For YEAR slope, π1
For INTRCPT2, β10
INTRCPT3 ,γ100 0.751231 0.014452 51.983 59 <0.001
Statistics for the current model
Deviance = 16140.158919
Number of estimated parameters = 14
11 Special Features
Treating these coefficients as latent variables, the HLM2, HLM3, HMLM, and HMLM2 modules allow
researchers to study direct as well as indirect effects among them and to assess their impacts on
coefficients associated with observed covariates in the model. Furthermore, using HMLM with
unrestricted covariance structures, one may use latent variable analysis to run regressions with
missing data.
Below are two examples of latent variable analysis via Windows mode. See Appendix F for batch
and interactive modes.
We use π0, the level of tolerance at age 11, to predict π1, the linear growth rate, controlling for
gender. Note that FEMALE must be in the model for both π0 and π1 to control for gender fully. Note
also that π0 and π1 are latent variables, that is, they are free of measurement error, which is
contained in e. Furthermore, we assess whether the effect of gender on the linear growth rate
changes after controlling for the initial status at age 11. We select the homogeneous level-1 variance
option for this model. Thus, using HLM2 will yield identical results in this case.
3. Select the predictor(s) and outcome(s) by clicking the selection buttons in front of
them (for our example, select INTRCPT1, π0, as the predictor and AGE11, π1, as the
outcome).
Figure 11.2 Latent Variable Regression dialog box for the NYS example
Fixed Effect   Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
INTRCPT2, β00 0.221755 0.015961 13.894 237 <0.001
FEMALE, β01 -0.048274 0.022926 -2.106 237 0.035
For AGE11 slope, π1
INTRCPT2, β10 0.070432 0.006781 10.386 237 <0.001
FEMALE, β11 -0.012003 0.009826 -1.222 237 0.222
The results indicate that there is a significant linear growth rate in the attitude toward deviant
behaviors (coefficient = 0.070432, s.e. = 0.006781) for males. Also, there is no statistically
significant gender effect on the linear growth rate (coefficient = -0.012003, p = 0.222).
Outcome   Predictor   Estimated Coefficient   Standard Error   t-ratio   p-value
AGE11,r1,π1 INTRCPT2 ,β10* 0.024765 0.024807 0.998 0.319
FEMALE ,β11* -0.002062 0.013058 -0.158 0.875
π0,β12* 0.205934 0.105410 1.954 0.050
The results indicate that, controlling for gender, the initial status at age 11 has a marginally
significant effect on the linear growth rate (coefficient = 0.205934, s.e. = 0.105410). There is no
statistically significant partial gender effect, however. Indeed, the gender effect on π1 appears
somewhat reduced after controlling for π0.
Outcome   Predictor   Original Coefficient   Adjusted Coefficient   Difference   Standard Error of Difference
AGE11,r1,π1 INTRCPT2 0.07043 0.02477 0.045667 0.024311
FEMALE -0.01200 -0.00206 -0.009941 0.006941
This table lists the original coefficients, the adjusted coefficients, and the difference between the two
for the intercept and the gender effect. For the variable FEMALE, the "original coefficient" describes
the total association, the "adjusted coefficient" describes the direct association, and the "difference"
is the indirect association between gender and the linear growth rate.
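The printed numbers are consistent with a simple decomposition: the "difference" for each predictor
equals the coefficient of π0 in the latent variable regression multiplied by that predictor's
coefficient in the model for π0, and the adjusted coefficient is the original minus that product. The
sketch below checks this arithmetic against the tables above; it illustrates the pattern in the output
and is not a description of HLM's internal computations. The variable names are made up.

```python
# Coefficients from the fixed-effects table (models for pi0 and pi1).
beta00, beta01 = 0.221755, -0.048274   # INTRCPT2 and FEMALE in the model for pi0
beta10, beta11 = 0.070432, -0.012003   # INTRCPT2 and FEMALE in the model for pi1
# Coefficient of pi0 in the latent variable regression of pi1.
beta12_star = 0.205934


def decompose(original, effect_on_predictor, latent_coefficient):
    """Split an original coefficient into adjusted (direct) and difference (indirect)."""
    difference = latent_coefficient * effect_on_predictor
    adjusted = original - difference
    return adjusted, difference


adjusted_intercept, diff_intercept = decompose(beta10, beta00, beta12_star)
adjusted_female, diff_female = decompose(beta11, beta01, beta12_star)

print(f"INTRCPT2: adjusted = {adjusted_intercept:.6f}, difference = {diff_intercept:.6f}")
print(f"FEMALE:   adjusted = {adjusted_female:.6f}, difference = {diff_female:.6f}")
```

The results can be compared with the "Adjusted Coefficient" and "Difference" columns for INTRCPT2 and
FEMALE.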
Var(r*)
AGE11,r1 0.00196
An estimate of the variance of r1*, the residual variance in π1, controlling for both FEMALE and π0,
is also given.
As mentioned earlier, a latent variable analysis using HLM2 will reproduce identical results. The
same procedures generalize to three-level applications (HMLM2, HLM3, and HGLM) to model randomly
varying level-2 coefficients as outcome variables. See Raudenbush and Sampson (1999) for an
example that implemented a latent variable analysis with a three-level model. In that study, they
investigated the extent to which neighborhood social control mediated the association between
neighborhood social composition and violence in Chicago.
two measures. To use HMLM to run regression with missing data, we first re-organize the data and
re-conceive the three measures for each participant j as "occasions of measurement." If the data are
complete, each case has R = 3 occasions. If participant j is missing one value, there will be only 2
occasions for that participant, and if participant j is missing 2 values, there will be only 1 occasion
for that case. The measure is then re-conceived as MEASURE_ij, that is, the value of the datum
collected at occasion i for participant j, with i = 1, 2, ..., n_j, and with n_j ≤ R = 3. If the data are
complete for participant j, then:
MEASURE_1j = OUTCOME_j,
MEASURE_2j = PRED1_j,
MEASURE_3j = PRED2_j.
Three indicators, IND1_j, IND2_j, and IND3_j, indicating whether MEASURE_ij is OUTCOME_j, PRED1_j,
or PRED2_j, are added to the data set.
Data for the first three participants are shown in Fig. 11.3.
Figure 11.3 First three participants for Example 2
Note that Participant 1 has complete data, Participant 2 has data on PRED1 and PRED2 but not the
outcome, and Participant 3 has data only on OUTCOME.
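The restructuring described above can be sketched in a few lines. The code below is an illustration
under stated assumptions, not part of HLM: the function name is hypothetical, None stands for a missing
value, and the data values are invented to mimic the three patterns just described (complete data,
missing outcome, outcome only).

```python
def to_level1_rows(participant_id, outcome, pred1, pred2):
    """Return one level-1 row per observed measure, with indicators IND1-IND3."""
    rows = []
    for position, value in enumerate((outcome, pred1, pred2)):
        if value is None:
            continue  # a missing measure contributes no occasion
        row = {"ID": participant_id, "MEASURE": value}
        for k in range(3):
            # IND1 flags OUTCOME, IND2 flags PRED1, IND3 flags PRED2
            row[f"IND{k + 1}"] = 1 if k == position else 0
        rows.append(row)
    return rows


# Invented values; only the missing-data patterns follow the text.
participants = [
    (1, 48.9, 3.1, 7.4),     # complete data: three occasions
    (2, None, 2.6, 5.0),     # outcome missing: two occasions
    (3, 51.2, None, None),   # only the outcome observed: one occasion
]
level1 = [row for p in participants for row in to_level1_rows(*p)]
for row in level1:
    print(row)
```

Each participant contributes as many level-1 rows as observed values, which is why the number of
occasions n_j can be smaller than R = 3.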
Data on the measures and the three indicators constitute the level-1 data file, MISSING1.SAV, for the
example. The level-2 file, MISSING2.SAV, contains a dummy variable, DUMMY, which is not to be
used in the analysis. An MDM file, MISSING.MDM, is created. Figure 11.4 displays the model specified
with unrestricted covariance structure for the missing data example. The file that contains the file
specification information is MISSING1.MLM.
To regress OUTCOME (IND1) on PRED1 (IND2) and PRED2 (IND3), select IND1 as the outcome and
IND2 and IND3 as predictors in the Latent Variable Regression dialog box.
Figure 11.4 Model window for the missing data example
The following selected output (example MISSING1.MLM) gives the latent variable regression results.
Latent Variable Regression Results
Outcome   Predictor   Original Coefficient   Adjusted Coefficient   Difference   Standard Error of Difference
IND1 ,π1 INTRCPT2 52.25565 -23.96616 76.221809 14.285875
Var(r*)
IND1 33.51133
The results indicate that π2 (associated with IND2) and π3 (associated with IND3) have statistically
significant effects on IND1 (OUTCOME).⁴
4
Raudenbush and Bryk (Hierarchical Linear Models, 2002) have shown that using this approach
with complete data replicated the results of SPSS regression analysis for the regression coefficients. As
HMLM adopts the full maximum likelihood estimation approach and SPSS uses the restricted
maximum likelihood approach, the two sets of standard errors estimated differ by a factor of
11.2 Applying HLM to multiply-imputed data
A satisfactory solution to the missing data problem involves multiple, model-based imputation
(Rubin, 1987; Little & Rubin, 1987; Schafer, 1997). A multiple imputation procedure produces M
"complete" data sets. Users can apply HLM2 and HLM3 to these multiply-imputed data to produce
appropriate estimates that incorporate the uncertainty resulting from imputation.
There can be multiply-imputed values for the outcome or one covariate, or for the outcome and/or
covariates.
HLM has two methods to analyze multiply-imputed data. Both use the same equations to
compute the averages, so the choice of method depends on the data you are analyzing.
• "Plausible Values," as described in Sections 11.2.1 and 11.2.3. This method is usually preferable for
data sets that have only one variable (outcome or predictor) for which you have several plausible
values. In this case, you need to make one MDM file containing all of the plausible values, plus any
other variables of interest.
• "Multiple Imputation," as described in Section 11.2.4. This method is necessary if you have more
than one variable for which you have multiply-imputed data. This method also requires a different
way of setting up MDM files. Here, you have to create as many MDM files as you have plausible values.
When making these MDM files, you should use the same level-2 file (and level-3 file if using HLM3), but
several level-1 files are needed.
Those variables that are not multiply imputed should be the same in all these level-1 files. The
variables that are multiply imputed should be placed in the separate level-1 files, but they must
have the same variable names across these level-1 files, since the same model is run on each of these
MDM files.
11.2.1 Data with multiply-imputed values for the outcome or one covariate
HLM2 and HLM3 enable users to produce correct HLM estimates when using data sets that contain two
or more values or plausible values for the outcome variable or one covariate. One such data set is the
National Assessment of Educational Progress (NAEP), a U.S. Department of Education
achievement test given to a national sample of fourth, eighth, and twelfth graders.
Due to the use of balanced incomplete block (BIB) spiraling in the administration of the NAEP
assessment battery, special procedures and calculations are necessary when estimating any
population parameters and their standard errors with data sets such as NAEP. Not every student was
tested on the same items, so item response theory (IRT) was used to estimate proficiency scores for
each individual student. This procedure estimated a range or distribution of plausible values for each
student's proficiency rather than an individual observed score. NAEP drew five plausible values at
random from the conditional distribution of proficiency scores for each student. The measurement
error is due to the fact that these scores are estimated, rather than observed.
4 (continued)
J / (J − Q − 1), where in this case J = 15 and Q = 2.
In general, these plausible values are used to produce parameter estimates in the following way.
• Each parameter is estimated for each of the five plausible values, and the five estimates are
averaged.
• Then, the standard error for this average estimate is calculated using the approach
recommended by Little & Schenker (1995).
• This formula essentially combines the average of the sampling error from the five estimates
with the variance between the five estimates multiplied by a factor related to the number of
plausible values. This second term is the measurement error.
In an HLM analysis, with either two or three levels, the parameter estimates are the averages of the
parameter estimates from separate HLM analyses of the five plausible values. That is, a separate HLM
analysis is conducted on each of the five plausible values.
Without this facility in HLM, these procedures could be performed by producing HLM estimates for each
plausible value, and then averaging the estimates and calculating the standard errors using another
computer program. These procedures are tedious and time-consuming, especially when performed on many
models, grades, and dependent variables.
HLM takes the plausible values into account in generating the HLM estimates. For each HLM model,
the program runs each of the five (or the number specified) plausible values internally, and produces
their average value and the correct standard errors. There will seem to be one estimate, but the five
HLM estimates from the five plausible values are produced and their average and measurement error
calculated correctly, thus ensuring an accurate treatment of plausible value data. The output is
similar to the standard HLM program output, except that all the components are averaged over
estimates derived from the five plausible values. In addition, the output from the five plausible value
runs is available in a separate output file.
11.2.2 Calculations performed
The program conducts a separate HLM analysis for each plausible value. The output of the separate
HLM analyses is written to files with consecutive numbers, for example, OUT.1, OUT.2, OUT.3, etc.
Then, HLM calculates the average of the parameter estimates from the separate analyses and
computes the standard errors. The output of the average HLM parameter estimates and their standard
errors is found in the output file with the extension AVG.
• The reliabilities
• The parameter variances (tau) and their correlations
• The chi-square values to test whether the parameter variance is zero
• The standard errors for the variance-covariance components (full maximum likelihood estimates)
• Multivariate hypothesis testing for fixed effects
The standard error of the averaged fixed effects (gammas) is estimated as described below. The
Student's t-value is calculated by dividing the average gamma by its standard error, and the
probability of the t-value is estimated from a standard t-distribution table.
The standard error of the gammas consists of two components – sampling error and measurement
error. The following routine provided in the NAEP Data Files User Guide (Rogers, et al., 1992) is
used to approximate the component of error variance due to the error in imputations and to add it to
the sampling error.
Let θ_m (m = 1, ..., M) represent the m-th plausible value. Let t_m represent the parameter estimate
based on the m-th plausible value. Let U_m represent the estimated variance of t_m.
• Five HLM runs were conducted, one for each plausible value θ_m. The parameter estimates
from these runs were averaged:
\[ t^{*} = \frac{1}{M} \sum_{m=1}^{M} t_m \tag{11.1} \]
• The variances of the parameters from these runs were averaged:
\[ U^{*} = \frac{1}{M} \sum_{m=1}^{M} U_m \tag{11.2} \]
• The variance between the M parameter estimates was computed:
\[ B_m = \frac{1}{M - 1} \sum_{m=1}^{M} \left( t_m - t^{*} \right)^2 \tag{11.3} \]
• The final estimate of the variance of the parameter estimate is the sum of the two
components:
\[ V = U^{*} + \left( 1 + M^{-1} \right) B_m \tag{11.4} \]
where the degrees of freedom are computed as
\[ \text{d.f.} = (M - 1)(1 + r)^2, \]
where
\[ r = \frac{U^{*}}{\left( 1 + \frac{1}{M} \right) B_m}. \]
The square root of this variance is the standard error of the gamma, and it is used in a standard
Student's t formula to evaluate the statistical significance of each gamma.
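The combining rules in equations 11.1 through 11.4 can be sketched as a short function. This is a
minimal illustration of the formulas as stated above, not HLM's own code: the function name and the
returned dictionary keys are hypothetical, and the input values are invented and kept small so the
arithmetic can be checked by hand.

```python
import math


def combine_estimates(estimates, variances):
    """Combine M per-plausible-value estimates and their sampling variances."""
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two estimates and one variance per estimate")
    t_star = sum(estimates) / m                                 # equation 11.1
    u_star = sum(variances) / m                                 # equation 11.2
    b = sum((t - t_star) ** 2 for t in estimates) / (m - 1)     # equation 11.3
    v = u_star + (1 + 1 / m) * b                                # equation 11.4
    if b > 0:
        r = u_star / ((1 + 1 / m) * b)
        df = (m - 1) * (1 + r) ** 2
    else:
        df = math.inf  # no between-run variation: the formula's r is unbounded
    return {"estimate": t_star, "variance": v, "se": math.sqrt(v), "df": df}


# Invented inputs: three runs with estimates 1, 2, 3 and sampling variance 0.5 each.
result = combine_estimates([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
t_ratio = result["estimate"] / result["se"]
print(result, t_ratio)
```

With these inputs the between-run variance is 1, so the total variance is 0.5 + (4/3)(1), which makes
the contribution of the imputation uncertainty easy to see.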
11.2.4 Data with multiply-imputed values for the outcome and covariates
There may be multiply-imputed values for both the outcome and the covariates. To apply HLM to
such data, it is necessary to prepare as many MDM files as the number of imputed data sets. Thus, if
there are five imputed data sets, five MDM files with identical variable labels need to be prepared. To
run these models in batch mode, refer to Section F.3 in Appendix F.
Below are the commands for running an analysis with multiply-imputed data sets via Windows mode.
1. After specifying the model, select the Estimation Settings option from the Other Settings
menu.
2. Choose Multiple Imputation to open the Multiple Imputation MDM files dialog box (See
Figure 11.6 for an example).
3. Enter the names of the MDM files that contain the multiply-imputed data either by typing into
the File # edit boxes or clicking Browse to open them.
4. Click OK. Model specification follows the usual format.
The calculations involved are very similar to the ones mentioned in Section 11.2.2.
The V-known option in HLM2 is a general routine that can be used for applications where the level-1
variances (and covariances) are known. Included here are problems of meta-analysis (or research
synthesis) and a wide range of other possible uses as discussed in Chapter 7 of Hierarchical Linear
Models. The program input consists of Q random level-1 statistics for each group and their
associated error variances and covariances.
We illustrate the use of the program with the following data from the meta-analysis of teacher
expectancy effects described on pp. 210-216 of Hierarchical Linear Models. Here we show the
process of V-known analysis in its most generic form, which requires using the interactive mode.
See Section 11.3.4 for an easier alternative method for Q = 1 using the Windows interface.
of Table 7.1).
The Q statistics, their error variances and covariances, and the level-2 predictors must be ordered as
described above and have a numeric format.
We present below an example of an HLM2 session that creates a multivariate data matrix file using
the V-known routine on the teacher expectancy effects data.
C:\HLM>HLM2
The file EXPECT.DAT contains the input data displayed above, and the resulting multivariate data
matrix is saved in the EXPECT.MDM file. Note that the input format has been specified for the
character ID, the level-1 statistic (EFFSIZE), the associated variance, and the level-2 predictor
(WEEKS).
11.3.3 Estimating a V-known model
Once the MDM file has been created, it can be used to specify and estimate a variety of models as in
any other HLM2 application. The example below illustrates interactive use of the V-known program
(example EXPECT.HLM).
C:\HLM>hlm2 expect.MDM
ADDITIONAL PROGRAM FEATURES
OUTPUT SPECIFICATION
Do you want a residual file? n
How many iterations do you want to do? 10000
Do you want to see OLS estimates for all of the level-2 units? n
Enter a problem title: Teacher expectancy meta-analysis
Enter name of output file: expect.lis
-------------------------------------------------------
Level-1 Level-2
Effects Predictors
--------------------------------------------------------
EFFSIZE, B1 INTRCPT2, G10
WEEKS, G11
Tau dimensions
EFFSIZE slope
Level-1 Model
Y1 = B1 + E1
Level-2 Model
B1 = G10 + G11*(WEEKS) + U1
STARTING VALUES
Tau(0)
EFFSIZE,B(null) 0.02004
Tau
EFFSIZE,B 0.00001
---------------------------------------------------------------
Random level-1 coefficient Reliability estimate
---------------------------------------------------------------
EFFSIZE, B 0.000
---------------------------------------------------------------
For EFFSIZE, B1
INTRCPT2, G10 0.408572 0.087146 4.688 17 0.000
WEEKS, G11 -0.157963 0.035943 -4.395 17 0.000
----------------------------------------------------------------------------------------------
Final estimation of variance components:
----------------------------------------------------------------------------------------------------
Random Effect Standard Variance df Chi-square P-value
Deviation Component
----------------------------------------------------------------------------------------------------
EFFSIZE, U 0.00283 0.00001 17 16.53614 >.500
----------------------------------------------------------------------------------------------------
In general, the HLM2 results for this example closely approximate the more traditional results that
would be obtained from a graphical examination of the likelihood function. (For this particular
model, the likelihood mode is at zero.) Note that the value of the likelihood was still changing after
7850 iterations. Often, HLM2 converges after a relatively small number of iterations. When the
number of iterations required is large, as in this case, this indicates that the estimation is moving
toward a boundary condition (in this example, a variance estimate of zero for Tau). This can be
seen by comparing the starting value estimate for Tau, 0.02004, with the final estimate of 0.00001.
(For a further discussion see p. 202 of Hierarchical Linear Models.)
11.3.4 V-known analyses where Q = 1
There is an alternative and appealing method for analysis for V-known analyses when Q =1. This may
be accomplished as follows:
1. Select the Estimation Settings option from the Other Settings menu.
2. Use the pull down menus to select the variable that represents the known level-1
variance.
This may be accomplished in either the two-level or the three-level HLM programs.
This example uses data collected by the Project on Human Development in Chicago Neighborhoods
(Sampson, Raudenbush, & Earls, 1997) on 7,729 residents living in 342 neighborhoods. It is an
unconditional model with a ten-item collective efficacy scale, defined as the fusion of social
cohesion and informal social control, as the outcome.
For spatial HLM2 models, the level-1 and level-2 models have the same structure as those described
in Section 2.5. The two data files for the example, linked by level-2 neighborhood cluster IDs, are
RESIDENT.SAV and NEIGHBOR.SAV. In the level-1 data file, there is one variable, collective efficacy
(COLLEFF). In the level-2 data file, a dummy variable is included. The spatial dependence analysis
requires another data file with information on spatial contiguity. The information allows the program
to create a spatial weight matrix, W, which is a binary contiguity matrix indicating which sites are
contiguous to each other. ROOK.SAV contains such information for our illustrative example. The
variables followed by the neighborhood cluster IDs are:
The data for the first ten neighborhoods are displayed in Fig. 11.7. Note that neighborhood 1 (that is,
the neighborhood with ID = 1) shares a common boundary with two neighborhoods, specifically,
neighborhoods 2 and 3. In contrast, neighborhood 2 shares a boundary with four neighborhoods:
neighborhoods 8, 6, 3, and 1.
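To illustrate what a binary contiguity matrix looks like, the sketch below builds one from neighbor
lists. Only the two lists mentioned above (for neighborhoods 1 and 2) are used; the function name is
hypothetical, the layout of ROOK.SAV itself is not reproduced, and treating contiguity as symmetric is
an assumption of this sketch rather than a documented property of the program.

```python
def contiguity_matrix(neighbors):
    """Build a symmetric 0/1 matrix W from a dict of site ID -> adjacent site IDs."""
    ids = sorted(set(neighbors) | {n for adj in neighbors.values() for n in adj})
    index = {site: k for k, site in enumerate(ids)}
    w = [[0] * len(ids) for _ in ids]
    for site, adjacent in neighbors.items():
        for other in adjacent:
            w[index[site]][index[other]] = 1
            w[index[other]][index[site]] = 1  # assumption: shared borders are mutual
    return ids, w


# Neighbor lists taken from the text for neighborhoods 1 and 2 only.
neighbors = {1: [2, 3], 2: [8, 6, 3, 1]}
ids, W = contiguity_matrix(neighbors)
print("IDs:", ids)
for site, row in zip(ids, W):
    print(site, row)
```

Each row of W marks the sites that share a boundary with the row's neighborhood; the diagonal stays at
zero because a site is not treated as its own neighbor.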
Figure 11.7 First ten cases in ROOK.SAV
The file SPATIAL.MDMT stores the commands for creating the two-level multivariate data matrix file,
SPATIAL.MDM. The procedure is very similar to those described in Section 2.5.1. An extra step
needed is to instruct the program to include spatial dependence information with the following
procedure:
Figure 11.8 Make HLM – Dialog Box
The file SPATIAL.HLM contains the commands for setting up the unconditional model. The procedure
follows the steps outlined and illustrated in Section 2.5.2.5. An additional step is to instruct HLM2 to
run the model as a spatial dependence model by the following procedure:
1. Open the Other Settings menu and select the Estimation Settings.
2. Check the box for Run as spatial dependence model (see Figure 11.9).
Figure 11.9 Estimation Settings – HLM2
The model window for our illustrative example in Figure 11.10 gives the model specifications.
b0 = ρWb0 + u
A spatial dependence analysis using HLM2 provides two sets of results, one for the regular HLM and the
other for the HLM with spatial dependence. A comparison test of the fit of these models is performed and
the result is given. Below is a partial output of the results of the unconditional model.
Here are the partial results for the regular HLM:

σ² = 0.42136
Standard error of σ² = 0.00693

τ
INTRCPT1,β0    0.08904

Standard error of τ
INTRCPT1,β0    0.00850

Random level-1 coefficient    Reliability estimate
INTRCPT1,β0                   0.799

Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
    INTRCPT2, γ00     3.433243      0.018056         190.142   341            <0.001

Final estimation of fixed effects
(with robust standard errors)

Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
    INTRCPT2, γ00     3.433243      0.018056         190.144   341            <0.001

Random Effect     Standard Deviation   Variance Component   d.f.   χ²           p-value
INTRCPT1, u0      0.29839              0.08904              341    1870.37148   <0.001
level-1, r        0.64913              0.42136

Statistics for the current model
Deviance = 15823.710765
Number of estimated parameters = 3
Here are the partial results for the HLM with spatial dependence:

σ² = 0.42149

τ
INTRCPT1,β0    0.03477

ρ
INTRCPT1,β0    0.81701

Final estimation of fixed effects:

Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, β0
    INTRCPT2, γ00     3.404181      0.056443         60.312    341            <0.001

Statistics for the current model
Deviance = 15671.980461
Number of estimated parameters = 4
Regular HLM vs. HLM with spatial dependence model comparison test
χ² statistic = 151.73030
Degrees of freedom = 1
p-value = <0.001
The average level-2 variance is the average of the neighborhood-specific variances. These depend on
τ, but also on the magnitude of the spatial dependence correlation, ρ, and on the configuration of
neighborhoods near that neighborhood. The average level-2 covariance is the average covariance
between pairs of contiguous neighborhoods.
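Why the neighborhood-specific variances depend on ρ and on the contiguity pattern can be seen from a short derivation. It is a sketch under two assumptions that are not stated in this section: the elements of u are independent with common variance τ, and the matrix I − ρW is invertible.

```latex
\begin{aligned}
b_0 &= \rho W b_0 + u
  \;\Longrightarrow\; (I - \rho W)\, b_0 = u
  \;\Longrightarrow\; b_0 = (I - \rho W)^{-1} u, \\
\operatorname{Var}(b_0) &= \tau\, (I - \rho W)^{-1} \left[ (I - \rho W)^{-1} \right]^{T}
  = \tau\, (I - \rho W)^{-1} (I - \rho W^{T})^{-1}.
\end{aligned}
```

Under these assumptions, the diagonal elements of Var(b0) are the neighborhood-specific variances and the off-diagonal elements are the covariances between neighborhoods. With ρ = 0 every variance reduces to τ.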
• The result of the comparison test provides evidence that the HLM with spatial dependence
provides a better fit, as indicated by the χ² statistic of 151.73, df = 1, p < .001.
• A comparison of the standard errors for γ̂00 from the regular HLM and the HLM with spatial
dependence (.018 vs. .056) suggests that, given that ρ̂ is approximately .8, the standard errors are
underestimated when spatial dependence is ignored.
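The comparison test can be reproduced by hand from the two deviances reported in the output. The following is a minimal sketch, not the program's code; the function name `deviance_test_df1` is hypothetical. It relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable, so its upper-tail probability is erfc(sqrt(x/2)).

```python
import math


def deviance_test_df1(deviance_restricted, deviance_full):
    """Deviance difference and its chi-square p-value for 1 degree of freedom."""
    chi2 = deviance_restricted - deviance_full
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Deviances reported above: regular HLM (3 parameters) and spatial HLM (4 parameters).
chi2, p_value = deviance_test_df1(15823.710765, 15671.980461)
print(f"chi-square = {chi2:.5f}, p < 0.001: {p_value < 0.001}")
```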
Users can also obtain spatial empirical Bayes estimates of the neighborhood collective efficacy
measures by following the procedure specified in Section 2.5.4.2. Figure 11.11 gives ten
records of the residual file for the unconditional model.
Figure 11.11 Level-2 Residual File
U_INTRCP and B_INTRCP are the two empirical Bayes estimates, for the regular HLM and the HLM with
spatial dependence. For a discussion of the properties of the empirical Bayes estimator that exploits
spatial dependence, see Verbitsky-Savitz and Raudenbush (2009).
11.3.6 Other outcome variables
Spatial dependence models handle continuously distributed as well as discrete outcomes, including
binary outcomes, counted data, ordered categories, and multinomial outcomes.
12 Conceptual and Statistical Background for Cross-classified
Random Effect Models (HCM2)
All of the applications discussed thus far have involved a strictly hierarchical data structure. Such
nesting structures would occur, for example, in a study of neighborhood and school effects on child
development in which all children living in the same neighborhood attended the same school, with
multiple neighborhoods per school. In this case we would have children at level 1 nested within
neighborhoods at level 2 and neighborhoods nested within schools at level 3. Alternatively, we
might have a nested structure in which every child attending a given school lived in the same
neighborhood, with multiple schools per neighborhood. In this case, we would have children nested
within schools nested within neighborhoods. HLM3 can be used to accommodate such three-level
nested data structures. However, we typically find, in fact, that children who reside in a specific
neighborhood can enroll in one of several schools, and each school might draw students from
several neighborhoods. In this case, the data gathered will no longer have a purely nested structure.
Instead, a cross-classification of students by two higher-level factors, neighborhoods and schools,
arises. To handle this more complex data structure while modeling the developmental influences of
neighborhoods and schools requires the use of cross-classified random effects models (HCM2).
classified by neighborhoods and schools, a cell consists of a set of students who live in the same
neighborhood and attend the same school. The level-1 or within-cell model will represent the
relationships among the student-level variables for those students while the level-2 or between-cell
model will capture the influences of neighborhood- and school-level factors. Formally, there are
$i = 1, 2, \ldots, n_{jk}$ level-1 units (e.g., students) nested within each cell cross-classified by
$j = 1, \ldots, J$ units of the first higher-level factor (e.g., neighborhoods), designated as rows, and
$k = 1, \ldots, K$ units of the second higher-level factor (e.g., schools), designated as columns. For a
graphical representation of this data layout in Garner and Raudenbush (1991), see Table 12.1 in
Chapter 12 of Hierarchical Linear Models.
In HLM7, HCM2 handles continuously distributed as well as discrete outcomes, including binary
outcomes, counted data, ordered categories, and multinomial outcomes. We use the continuous
outcome models in the following discussion. The logic of HGLM, as described and illustrated in
Chapter 5, applies and extends to analyses with any of the four types of discrete outcomes with
HCM2.
$$Y_{ijk} = \pi_{0jk} + \pi_{1jk} a_{1ijk} + \pi_{2jk} a_{2ijk} + \cdots + \pi_{Pjk} a_{Pijk} + e_{ijk} \tag{12.1}$$
where
$\pi_{0jk}$ is the intercept, the expected value of $Y_{ijk}$ within cell $jk$ when all explanatory
variables are set to zero;
$\pi_{pjk}$ are the level-1 coefficients of predictors $a_{pijk}$, for $p = 1, \ldots, P$;
$e_{ijk}$ is the level-1 or within-cell random effect; and
$\sigma^2$ is the variance of $e_{ijk}$, that is, the level-1 or within-cell variance. Here we assume
that the random term $e_{ijk} \sim N(0, \sigma^2)$.
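$$
\begin{aligned}
\pi_{pjk} = \theta_{p0} &+ (\beta_{p1} + b_{p1j}) X_{1k} + (\beta_{p2} + b_{p2j}) X_{2k} + \cdots + (\beta_{pQ_p} + b_{pQ_p j}) X_{Q_p k} \\
&+ (\gamma_{p1} + c_{p1k}) W_{1j} + (\gamma_{p2} + c_{p2k}) W_{2j} + \cdots + (\gamma_{pR_p} + c_{pR_p k}) W_{R_p j} \\
&+ b_{p0j} + c_{p0k}
\end{aligned}
\tag{12.2}
$$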
where
$\theta_{p0}$ is the model intercept, the expected value of $\pi_{pjk}$ when all explanatory variables
are set to zero;
$\beta_{pq}$ are the fixed effects of column-specific predictors $X_{qk}$, $q = 1, \ldots, Q_p$;
$b_{pqj}$ are the random effects associated with column-specific predictors $X_{qk}$. They vary
randomly over rows $j = 1, \ldots, J$;
$\gamma_{pr}$ are the fixed effects of row-specific predictors $W_{rj}$, $r = 1, \ldots, R_p$;
$c_{prk}$ are the random effects associated with row-specific predictors $W_{rj}$. They vary randomly
over columns $k = 1, \ldots, K$; and
$b_{p0j}$ and $c_{p0k}$ are residual row and column random effects, respectively, on $\pi_{pjk}$, after
taking into account $X_{qk}$ and $W_{rj}$. We assume that $b_{p0j} \sim N(0, \tau_{pb00})$,
$c_{p0k} \sim N(0, \tau_{pc00})$, and that the effects are independent of each other.
The vector of random row effects $b_{pqj}$ ($p = 0, \ldots, P$; $q = 0, \ldots, Q_p$) is assumed
multivariate normal with mean vector zero and a full covariance matrix $\tau$. Similarly, the vector of
random column effects $c_{prk}$ ($p = 0, \ldots, P$; $r = 0, \ldots, R_p$) is assumed multivariate
normal with mean vector zero and a full covariance matrix $\Delta$.
12.2 Parameter estimation
For continuous outcomes, three kinds of parameter estimates are available in HCM2: empirical Bayes
estimates of random coefficients; maximum likelihood estimates of the fixed regression coefficients;
and maximum likelihood estimates of the variance-covariance components. The estimation procedure
uses a full maximum likelihood approach (Raudenbush, 1993).
For discrete outcomes, the parameter estimates of the fixed regression coefficients are based on the
method of penalized quasi-likelihood. Unlike HGLM, however, only unit-specific results, and not
population-averaged results, are available.
12.3 Hypothesis testing
As in the case of HLM2, HCM2 routinely prints standard errors and t-tests for each of the fixed level-2
coefficients as well as a chi-square test of homogeneity for each random effect. In addition, optional
"multivariate hypothesis tests" are available in HCM2. Multivariate tests in the case of continuous
outcomes parallel those described in Section 2.8.8. For discrete outcomes, hypothesis testing
parallels that described in Section 5.10.
13 Working with HCM2
Chapter 12 in Hierarchical Linear Models presents a series of analyses of data from a study of
neighborhood and school contributions to educational attainment in Scotland (Garner & Raudenbush,
1991). We use the data from the study, provided along with the HLM software, to illustrate the
operation of the HCM2 program.
Data input requires a level-1 file (student-level file), a level-2 row-factor (neighborhood-level) file,
and a level-2 column-factor (school-level) file.
Level-1 file. The level-1 or within-cell file, ATTAINW.SAV has 2,310 students and 8 variables. The
two IDs are NEIGHID for neighborhoods and SCHID for schools. The variables are:
otherwise)
• MALE, an indicator for student gender (1 if male, 0 if female)
Data for the first 15 observations are shown in Figure 13.1. Note that five students from
Neighborhood 26 and one from Neighborhood 27 attended School 20. These first six observations
provided information about two neighborhood-by-school combinations or cells. One of the next nine
students living in Neighborhood 29 attended School 18 and the other eight went to School 20. They
provided data for two cross-classified neighborhood-by-school cells (see Table 12.1 in Hierarchical
Linear Models, p. 374, for a display of the organization of the data by counts in each neighborhood-
by-school cell).
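How students map onto neighborhood-by-school cells can be illustrated with a small tally. This is a sketch only: the tuples below encode the neighborhood and school IDs described in the text for the first 15 students, but their order within the file is an assumption, and the variable names are not taken from the program.

```python
from collections import Counter

# (NEIGHID, SCHID) pairs for the 15 students described above; the ordering of
# the Neighborhood 29 students is assumed for illustration.
records = [(26, 20)] * 5 + [(27, 20)] + [(29, 18)] + [(29, 20)] * 8

# Each distinct (neighborhood, school) pair is one cell of the cross-classification.
cell_counts = Counter(records)

for (neighborhood, school), n_students in sorted(cell_counts.items()):
    print(f"neighborhood {neighborhood}, school {school}: {n_students} student(s)")
```

The first six records fall into two cells and the next nine into two more, so these 15 students occupy four cells in total.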
Figure 13.1 First 16 cases in the ATTAINW.SAV dataset
13.1.1.2 Level-2 row-factor file
For our neighborhood example, the level-2 row-factor (neighborhood) file, ATTAINR.SAV,
consists of data on 1 variable for 524 neighborhoods. The variable is DEPRIVE (a scale measuring
social deprivation, which incorporates information on the poverty concentration, health, and housing
stock of a local community).
The level-2 column-factor (school) file, ATTAINCO.SAV, has 17 schools and 1 variable. The
variable is DUMMY, a dummy variable. Figure 13.3 shows data for the first 4 schools.
Figure 13.3 First 4 cases in the ATTAINCO.SAV data set
The steps for the construction of the MDM for HCM2 are similar to the ones described earlier. Select
HCM2 in the Select MDM type dialog box (see Figure 2.5). Note that the program can handle
missing data at level 1 or within-cell only. The MDM template file, ATTAIN.MDMT, contains a log of
the input responses used to create the MDM file, ATTAIN.MDM, using ATTAINW.SAV, ATTAINR.SAV,
and ATTAINCO.SAV. Figure 13.4 displays the dialog box used to create the MDM file. Figures 13.5 to
13.7 show the dialog boxes for the within-cell file, ATTAINW.SAV, the row-factor file, ATTAINR.SAV,
and the column-factor file, ATTAINCO.SAV.
Figure 13.5 Choose variables – HCM2 dialog box for level-1 or within-cell file,
ATTAINW.SAV
Figure 13.6 Choose variables – HCM2 dialog box for level-2 row-factor file,
ATTAINR.SAV
Figure 13.7 Choose variables – HCM2 dialog box for level-2 column-factor file,
ATTAINCO.SAV
13.2 Executing analyses based on the MDM file
Once the MDM file is constructed, it can be used as input for the analysis. Model specification has
three steps:
1. Specification of the level-1 or within-cell model. In our example, we shall model
educational attainment (ATTAIN) as the outcome. We first formulate an unconditional
model that includes no predictor variables at any level. In the second or conditional
model, we use prior measures of cognitive skill, verbal reasoning quotient and reading
achievement, father's employment status and occupation and father's and mother's
education to predict attainment.
2. Specification of the row- or column-factor prediction model. In the second or conditional
model, we shall predict each student's intercept with social deprivation.
3. Specification of the residual row, column, and cell-specific effects as random or non-
random, the effects associated with row-specific predictors as varying randomly or fixed
over columns, and the effects associated with column-specific predictors as varying
randomly or fixed over rows. We shall test whether the association between social
deprivation (a row-specific predictor) and attainment varies over schools in the third
model.
Following the three steps above, we first specify a model with no student-, neighborhood-, or
school-level predictors. The purpose is to estimate the components of variation that lie between
neighborhoods, between schools, and within cells.
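As a hedged sketch, with no predictors at any level, equations 12.1 and 12.2 reduce to the form below. The outcome name ATTAIN and the labels θ0, b00j, and c00k follow the program output shown in this section; the exact display in the model window may differ.

```latex
\begin{aligned}
\text{Level 1:}\quad & \mathrm{ATTAIN}_{ijk} = \pi_{0jk} + e_{ijk},
  & e_{ijk} &\sim N(0, \sigma^2) \\
\text{Level 2:}\quad & \pi_{0jk} = \theta_{0} + b_{00j} + c_{00k},
  & b_{00j} &\sim N(0, \tau_{b00}), \quad c_{00k} \sim N(0, \tau_{c00})
\end{aligned}
```

The three variance components τ_b00, τ_c00, and σ² then correspond to variation between neighborhoods, between schools, and within cells.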
Level-1 Model
Level-2 Model
For starting values, data from 2310 level-1, 524 row-level and 17 column-level records were used
Final Results - iteration 21

σ² = 0.79909

τrows
INTRCPT1/ICPTROW, b00j    0.14105

τcolumns
INTRCPT1/ICPTCOL, c00k    0.07546
The intra-neighborhood correlation, the correlation between outcomes of two students who live in
the same neighborhood but attend different schools, is estimated to be:

$$
\widehat{\mathrm{Corr}}(Y_{ijk}, Y_{i'jk'})
= \frac{\hat{\tau}_{b00}}{\hat{\tau}_{b00} + \hat{\tau}_{c00} + \hat{\sigma}^2}
= \frac{0.141}{0.141 + 0.075 + 0.799}
= 0.139.
$$
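The intra-school correlation, the correlation between outcomes of two students who attend the same
school but live in different neighborhoods, is estimated to be:

$$
\widehat{\mathrm{Corr}}(Y_{ijk}, Y_{i'j'k})
= \frac{\hat{\tau}_{c00}}{\hat{\tau}_{b00} + \hat{\tau}_{c00} + \hat{\sigma}^2}
= \frac{0.075}{0.141 + 0.075 + 0.799}
= 0.074.
$$

That is, about 7.4% of the variation lies between schools.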
The intra-cell correlation is the correlation between outcomes of two students who live in the same
neighborhood and attend the same school:

$$
\widehat{\mathrm{Corr}}(Y_{ijk}, Y_{i'jk})
= \frac{\hat{\tau}_{b00} + \hat{\tau}_{c00}}{\hat{\tau}_{b00} + \hat{\tau}_{c00} + \hat{\sigma}^2}
= \frac{0.141 + 0.075}{0.141 + 0.075 + 0.799}
= 0.213.
$$

Thus, according to the fitted model, about 21% of the variance lies between cells.
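The three correlations are simple ratios of the estimated variance components, so they can be checked with a few lines of Python. This is a minimal sketch; the function name `intra_unit_correlations` and the dictionary keys are hypothetical, and the inputs are the unrounded estimates printed in the output above.

```python
def intra_unit_correlations(tau_row, tau_col, sigma2):
    """Shares of total variance attributable to rows, columns, and cells."""
    total = tau_row + tau_col + sigma2
    return {
        "same_neighborhood": tau_row / total,        # different schools
        "same_school": tau_col / total,              # different neighborhoods
        "same_cell": (tau_row + tau_col) / total,    # same neighborhood and school
    }


# Variance components from the unconditional model output.
correlations = intra_unit_correlations(tau_row=0.14105, tau_col=0.07546, sigma2=0.79909)

for label, value in correlations.items():
    print(f"{label}: {value:.3f}")
```

With the unrounded components the three ratios round to 0.139, 0.074, and 0.213.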
Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
    INTERCEPT, θ0     0.075357      0.072226         1.043     1769           0.297

Random Effect               Standard Deviation   Variance Component   d.f.   χ²          p-value
INTRCPT1/ICPTROW, b00j      0.37556              0.14105              523    904.83225   <0.001
level-1, e                  0.89392              0.79909

Random Effect               Standard Deviation   Variance Component   d.f.   χ²          p-value
INTRCPT1/ICPTCOL, c00k      0.27470              0.07546              16     120.45262   <0.001

Deviance = 6356.711470
Number of estimated parameters = 4
13.3 Specification of a conditional model with the effect associated with
a row-specific predictor fixed
The above example involves a model that is unconditional at all levels. In this model we set up a
level-1 and a row-factor prediction model.
At the model specification dialog box, select P7VRQ, P7READ, DADOCC, DADUNEMP, DADED,
MOMED, and MALE, and grand-mean center all the predictors. Figure 13.9 shows the model with the
level-1 predictors. In the interest of parsimony, given the small cell sizes and within-neighborhood
sizes, all level-1 coefficients are fixed. (To specify any of them as randomly varying, select the
equation containing a specific regression coefficient, π_p, and click on b_p0.)
Figure 13.9 Level-1 Prediction Model for the Attainment Study
Select the equation containing π0. A listbox for row-factor variables (>>Row<<) will appear. Click
DEPRIVE and apply the grand-mean centering scheme. In this level-2 model, we treat the
association between social deprivation and educational attainment as fixed across all schools. We
relax this assumption in our next model. Figure 13.10 displays the conditional model. Note that c01 is
disabled.
Figure 13.10 Conditional Model for the Attainment Study, with Social Deprivation Effect
Fixed
Level-1 Model
Level-2 Model
For starting values, data from 2310 level-1, 524 row-level and 17 column-level records were used
σ² = 0.45891

τrows
INTRCPT1/ICPTROW, b00j    0.00014

τcolumns
INTRCPT1/ICPTCOL, c00k    0.00389
For DADUNEMP, π4
    INTERCEPT, θ4     -0.120771     0.046779         -2.582    1769           0.010
For DADED, π5
    INTERCEPT, θ5     0.144426      0.040782         3.541     1769           <0.001
For MOMED, π6
    INTERCEPT, θ6     0.059440      0.037381         1.590     1769           0.112
For MALE, π7
    INTERCEPT, θ7     -0.056058     0.028401         -1.974    1769           0.049

Final estimation of row and level-1 variance components:

Random Effect               Standard Deviation   Variance Component   d.f.   χ²          p-value
INTRCPT1/ICPTROW, b00j      0.01184              0.00014              522    548.81015   0.202
level-1, e                  0.67743              0.45891

Final estimation of column level variance components:

Random Effect               Standard Deviation   Variance Component   d.f.   χ²          p-value
INTRCPT1/ICPTCOL, c00k      0.06239              0.00389              15     36.38151    0.002

Statistics for the current model
Deviance = 4769.604659
Number of estimated parameters = 12
Figure 13.11 Conditional Model for the Attainment Study, with Social Deprivation Effect
Random
To specify the effect of the row-specific predictor as random, select the equation containing π0 and
click on c01. Figure 13.11 displays the conditional model with the social deprivation effect specified as
random. We compare the deviance of this model against that of the model estimated in the last analysis.
The procedure is the same as described in Section 2.9.6.
τrows
INTRCPT1/ICPTROW, b00j    0.00371

τcolumns
INTRCPT1          INTRCPT1
ICPTCOL, c00k     DEPRIVE, c01k
0.00391           0.00159
0.00159           0.00067
The point estimate of the variance of the unique contribution of school k to the association between
social deprivation and attainment is .001, and that of the covariance between this effect and the
school random effect is .002.
τcolumns (as correlations)
INTRCPT1/ICPTCOL, c00k    1.000   0.984
INTRCPT1/DEPRIVE, c01k    0.984   1.000

Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1, π0
    INTERCEPT, θ0     0.092434      0.021354         4.329     1752           <0.001
    DEPRIVE, γ01      -0.159051     0.026763         -5.943    522            <0.001
For P7VRQ, π1
    INTERCEPT, θ1     0.027636      0.002263         12.211    1752           <0.001
For P7READ, π2
    INTERCEPT, θ2     0.026242      0.001750         14.992    1752           <0.001
For DADOCC, π3
    INTERCEPT, θ3     0.008112      0.001360         5.964     1752           <0.001
For DADUNEMP, π4
    INTERCEPT, θ4     -0.120306     0.046759         -2.573    1752           0.010
For DADED, π5
    INTERCEPT, θ5     0.142622      0.040753         3.500     1752           <0.001
For MOMED, π6
    INTERCEPT, θ6     0.060870      0.037358         1.629     1752           0.103
For MALE, π7
    INTERCEPT, θ7     -0.056139     0.028383         -1.978    1752           0.048
Random Effect               Standard Deviation   Variance Component   d.f.   χ²          p-value
INTRCPT1/ICPTROW, b00j      0.06087              0.00371              522    545.30137   0.232
level-1, e                  0.67468              0.45519

Deviance = 4768.508277

χ² statistic = 1.09638
Degrees of freedom = 2
p-value = >.500
The result of the deviance test is not significant. There is no evidence that the association between
neighborhood social deprivation and attainment varies over schools. Not surprisingly, the standard
error for γ01, the social deprivation effect, remains nearly unchanged, as do all inferences about the
fixed effects.
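The deviance test above can be checked from the two reported deviances. The sketch below is an illustration rather than the program's implementation, and the function name `chi2_sf_even_df` is hypothetical. It uses the closed-form upper-tail probability of a chi-square variable that holds when the degrees of freedom are even; for 2 degrees of freedom this is simply exp(-x/2).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i      # accumulates (x/2)^i / i!
        total += term
    return math.exp(-half) * total


deviance_fixed = 4769.604659    # social deprivation effect fixed
deviance_random = 4768.508277   # social deprivation effect random over schools
chi2 = deviance_fixed - deviance_random
p_value = chi2_sf_even_df(chi2, df=2)
print(f"chi-square = {chi2:.5f}, p = {p_value:.3f}")
```

The resulting p-value is above .500, in line with the reported result.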
The options are similar to those in the corresponding dialog box for HLM2 (see Section 2.5.2). Unlike
HLM2, the user has the option to create level-1, row, and column residual files. There is also an option
unique to HCM2: when modeling longitudinal, repeated measures data, it is possible to select a
cumulative effect model to allow carry-over treatment effects by specifying a cumulative Z-structure
model. See Hierarchical Linear Models, p. 390, for an example. HCM2 also allows users to diagonalize
the τ matrices for rows and columns and to weight the cases within cells and rows (see Figure 13.13).
Figure 13.13 The Estimation Settings – HCM2 dialog box
14 Conceptual and Statistical Background for Three-Level
Hierarchical and Cross-classified Random Effects Models
(HCM3)
The HCM2 models discussed in the previous chapters allow researchers to analyze data in which the
lower-level units are cross-classified by two higher-level factors. Suppose, however, that one of the
higher-level factors is itself nested within a yet higher-level factor. The three-level hierarchical
and cross-classified random effects models (HCM3) represent this case, where level-1 units are
cross-classified by two higher-level factors, with units from one of the higher-level factors nested
within a next higher-level unit.
Hong and Raudenbush (2008) used three-level hierarchical and cross-classified random effects
models to investigate how schools and their teachers may contribute to student growth, while also
taking student-level variables into account. In their study, students moved across teachers over time
and the teachers were nested within schools. We can say that the repeated measures (level-1) were
cross-classified by students (rows) and teachers (columns), with teachers nested within schools
(clusters). The model is sufficiently flexible to allow the students also to change schools over the
course of the study. In general, we may say that level-1 observations are crossed by rows and columns
and the columns are nested within clusters.
are $i = 1, 2, \ldots, n_{jkl}$ level-1 units (e.g., repeated measurement of student achievement) nested
within cells cross-classified by $j = 1, \ldots, J$ rows (e.g., students) and $k = 1, \ldots, K$ columns,
with columns nested within clusters $l = 1, \ldots, L$.
Here is an example of a data layout for three waves of developmental data (njkl = 3) for J = 4
students crossed by K = 9 teachers, with the teachers nested within L = 3 schools:
Table 14.1 Organization of data of the HCM3 example
Table 14.1 indicates that the repeated assessments are cross-classified by students and teachers, with
teachers clustered within schools. Student 1 stayed in school 1 over three years of observation,
changing teachers each year. Similarly Student 2 stayed in school 2 while changing teachers each
year. Student 3 stayed in the same school, but was not observed during year 2. Student 4 had all
three observations, but changed schools after year 1 and year 2.
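The two structural features of such data, crossing of students with teachers and nesting of teachers within schools, can be checked programmatically. The records below are hypothetical: they are constructed to be consistent with the verbal description above (teacher and school numbers are assumptions, including the school attended by Student 3), and they do not reproduce Table 14.1. The function name `teacher_to_school` is also an assumption.

```python
# Hypothetical (student, year, teacher, school) records.
records = [
    (1, 1, 1, 1), (1, 2, 2, 1), (1, 3, 3, 1),   # student 1 stays in school 1
    (2, 1, 4, 2), (2, 2, 5, 2), (2, 3, 6, 2),   # student 2 stays in school 2
    (3, 1, 7, 3), (3, 3, 9, 3),                 # student 3 not observed in year 2
    (4, 1, 1, 1), (4, 2, 5, 2), (4, 3, 9, 3),   # student 4 changes schools twice
]


def teacher_to_school(rows):
    """Map each teacher to a school, failing if a teacher appears in two schools."""
    mapping = {}
    for _student, _year, teacher, school in rows:
        if mapping.setdefault(teacher, school) != school:
            raise ValueError(f"teacher {teacher} is not nested within one school")
    return mapping


nesting = teacher_to_school(records)
schools_of_student_4 = {school for student, _, _, school in records if student == 4}
print(nesting)
print(sorted(schools_of_student_4))
```

The nesting check passes because every teacher appears in exactly one school, while Student 4 still contributes observations to three different schools.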
HCM3 can handle continuously distributed as well as binary outcomes. We use the continuous
outcome models in the following discussion. The logic of HGLM, as described and illustrated in
Chapter 7, applies and extends to analyses with binary outcomes with HCM3.
14.1.1 Level-1 or "within-cell" model
We represent in the level-1 or within-cell model the outcome for case i in individual cells cross-
classified by level-2 units j and k, with unit k nested within cluster l.
$$
\begin{aligned}
Y_{ijkl} &= \pi_{0jkl} + \pi_{1jkl} a_{1ijkl} + \pi_{2jkl} a_{2ijkl} + \cdots + \pi_{Pjkl} a_{Pijkl} + e_{ijkl} \\
&= \pi_{0jkl} + \sum_{p=1}^{P} \pi_{pjkl} a_{pijkl} + e_{ijkl}
\end{aligned}
\tag{14.1}
$$
where
$\pi_{0jkl}$ is the intercept, the expected value of $Y_{ijkl}$ when all explanatory variables are set
to zero;
$\pi_{pjkl}$ are level-1 coefficients of predictors $a_{pijkl}$ ($p = 1, 2, \ldots, P$) for case $i$ in
cell $jkl$;
$e_{ijkl}$ is the level-1 or within-cell random effect; and
$\sigma^2$ is the variance of $e_{ijkl}$, that is, the level-1 or within-cell variance. Here we assume
that the random term $e_{ijkl} \sim N(0, \sigma^2)$.
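$$
\begin{aligned}
\pi_{pjkl} = \theta_{pl} &+ (\beta_{p1l} + b_{p1j}) X_{1kl} + (\beta_{p2l} + b_{p2j}) X_{2kl} + \cdots + (\beta_{pQ_p l} + b_{pQ_p j}) X_{Q_p kl} \\
&+ (\gamma_{p1l} + c_{p1kl}) W_{1jl} + (\gamma_{p2l} + c_{p2kl}) W_{2jl} + \cdots + (\gamma_{pR_p l} + c_{pR_p kl}) W_{R_p jl} \\
&+ b_{p0j} + c_{p0kl} \\
= \theta_{pl} &+ \sum_{q=1}^{Q_p} (\beta_{pql} + b_{pqj}) X_{qkl} + \sum_{r=1}^{R_p} (\gamma_{prl} + c_{prkl}) W_{rjl} + b_{p0j} + c_{p0kl}
\end{aligned}
\tag{14.2}
$$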
where
$\theta_{pl}$ is the level-2 model intercept, the expected value of $\pi_{pjkl}$ when all explanatory
variables are set to zero;
$\beta_{pql}$ are the level-2 coefficients of column-specific predictors $X_{qkl}$, $q = 1, \ldots, Q_p$;
$b_{pqj}$ are the random effects associated with column-specific predictors $X_{qkl}$. They vary
randomly over rows $j = 1, \ldots, J$;
$\gamma_{prl}$ are the level-2 coefficients of row-specific predictors $W_{rjl}$, $r = 1, \ldots, R_p$;
$c_{prkl}$ are the random effects associated with row-specific predictors $W_{rjl}$. They vary randomly
over columns $k = 1, \ldots, K_l$ and clusters $l = 1, \ldots, L$; and
$b_{p0j}$ and $c_{p0kl}$ are residual row- and column-specific random effects, respectively, on
$\pi_{pjkl}$, after taking into account $X_{qkl}$ and $W_{rjl}$.
The vector of row random effects, containing $b_{p0j}, \ldots, b_{PQj}$, is assumed multivariate normal
with mean vector zero and a full covariance matrix $\tau$. Similarly, the vector with elements
$c_{p0kl}, \ldots, c_{PRkl}$ is assumed multivariate normal with mean vector zero and a full covariance
matrix $\Delta$.
14.1.3 Level-3 model
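Each of the level-2 coefficients becomes an outcome variable at level 3:

$$
\begin{aligned}
\theta_{pl} &= \delta_{p00} + (\delta_{p01} + b_{p01j}) Z_{1l} + (\delta_{p02} + b_{p02j}) Z_{2l} + \cdots + (\delta_{p0S_{p0}} + b_{p0S_{p0}j}) Z_{S_{p0}l} + d_{p0l} \\
&= \delta_{p00} + \sum_{s=1}^{S_{p0}} (\delta_{p0s} + b_{p0sj}) Z_{sl} + d_{p0l} \\
\beta_{pql} &= \delta_{pq0} + \sum_{s=1}^{S_{pq}} (\delta_{pqs} + b_{pqsj}) Z_{sl} + d_{pql} \\
\gamma_{prl} &= \delta_{pr0} + \sum_{s=1}^{S_{pr}} (\delta_{prs} + b_{prsj}) Z_{sl} + d_{prl}
\end{aligned}
\tag{14.3}
$$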
where
$\delta_{p00}$ is the intercept, the expected value of $\theta_{pl}$ when all explanatory variables are
set to zero;
$\delta_{p0s}$ are the coefficients of cluster-specific predictors $Z_{sl}$ for $\theta_{pl}$;
$\delta_{pq0}$ is the intercept, the expected value of $\beta_{pql}$ when all explanatory variables are
set to zero;
$\delta_{pqs}$ are the coefficients of cluster-specific predictors $Z_{sl}$, $s = 1, \ldots, S_{pq}$, for
$\beta_{pql}$;
$b_{pqsj}$ are the random effects associated with cluster-specific predictors $Z_{sl}$. They vary
randomly over rows $j = 1, \ldots, J$;
$\delta_{pr0}$ is the intercept, the expected value of $\gamma_{prl}$ when all explanatory variables are
set to zero;
$\delta_{prs}$ are the coefficients of cluster-specific predictors $Z_{sl}$ for $\gamma_{prl}$;
$b_{prsj}$ are the random effects associated with cluster-specific predictors $Z_{sl}$. They vary
randomly over rows $j = 1, \ldots, J$; and
$d_{p0l}$, $d_{pql}$, and $d_{prl}$ are residual random effects. We assume these to be multivariate
normal in distribution with zero means and variances $\tau_{p0}$, $\tau_{pq}$, and $\tau_{pr}$,
respectively.
14.2 Parameter estimation
Three kinds of parameter estimates are available in HCM3. For continuous outcomes, empirical
Bayes estimates of random effects, maximum-likelihood estimates of the level-3 coefficients, and
maximum likelihood estimates of variance-covariance parameters are available. In nonlinear models,
the level-3 coefficients are estimated via penalized quasi-likelihood. Unlike HGLM, however, only
unit-specific and not population-averaged results are available.
15 Working with HCM3
To illustrate the operation of the program, we use the data from Hong and Raudenbush's (2008)
study on the effects of time-varying instructional treatments (intensive vs. conventional math
instruction) on student achievement.
Note: The level-1 file is to be sorted by ascending row (student) IDs and, within that ordering, by
column IDs within clusters. The level-2 row file is to be sorted by ascending row (student) IDs. The
level-2 column file is to be sorted by column IDs within clusters. The cluster file is to be sorted by
cluster IDs.
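The sorting requirement for the level-2 column file can be prepared or verified before the MDM file is built. The sketch below uses hypothetical teacher records stored as (school ID, teacher ID, dummy) tuples; the ID values and the function names are assumptions for illustration, and the actual files are SPSS data sets rather than Python lists.

```python
# Hypothetical teacher records: (school_id, teacher_id, dummy).
teachers = [(2, 14, 1), (1, 7, 1), (2, 11, 1), (1, 3, 1)]


def sort_columns_within_clusters(records):
    """Order column-factor records by cluster ID, then by column ID."""
    return sorted(records, key=lambda rec: (rec[0], rec[1]))


def is_sorted_within_clusters(records):
    """Return True if records are already ordered by (cluster ID, column ID)."""
    keys = [(rec[0], rec[1]) for rec in records]
    return keys == sorted(keys)


sorted_teachers = sort_columns_within_clusters(teachers)
print(is_sorted_within_clusters(teachers), is_sorted_within_clusters(sorted_teachers))
print(sorted_teachers)
```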
• MATH
A Stanford Achievement Test math test score.
• G4D1
An indicator that takes on a value of 1 if a child receives intensive math instruction in grade 4
and the outcome is observed at grade 4. This will be used to assess the effect of grade-4
intensive math instruction on grade-4 outcome.
• G4D21
An indicator that a child receives intensive math instruction in grade 4 and the outcome is
observed at grade 5. This will be used to test the effect of grade-4 intensive math instruction
on grade-5 outcome for those who do not receive intensive math instruction in grade 5.
• G5D22
An indicator that a child receives intensive math instruction in grade 5 and the outcome is
observed at grade 5. This will be used to test the effect of intensive math instruction in
grade 5 on grade-5 outcome for those who did not have intensive math instruction in grade 4.
• TWOWAY
A product term for the two-way interaction between G4D21 and G5D22. It will thus be an
indicator that the child received intensive math instruction in both grades 4 and 5 and the
outcome is observed at grade 5. This will test whether having intensive math instruction in
both years has an effect that exceeds the sum of the separate effects.
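The coding of the four treatment indicators can be summarized in a short function. This is a sketch based only on the definitions above; the function name and its three arguments (treatment status in each grade and the grade at which the outcome is observed) are hypothetical and are not variables in GROWTH.SAV.

```python
def treatment_indicators(intensive_g4, intensive_g5, grade_observed):
    """Code G4D1, G4D21, G5D22, and TWOWAY for one repeated-measures record."""
    g4d1 = int(bool(intensive_g4) and grade_observed == 4)
    g4d21 = int(bool(intensive_g4) and grade_observed == 5)
    g5d22 = int(bool(intensive_g5) and grade_observed == 5)
    return {
        "G4D1": g4d1,
        "G4D21": g4d21,
        "G5D22": g5d22,
        "TWOWAY": g4d21 * g5d22,  # product term: intensive instruction in both grades
    }


# A child with intensive instruction in both grades, outcome observed at grade 5.
both_years = treatment_indicators(True, True, 5)
# A child with intensive instruction in grade 4 only, outcome observed at grade 4.
grade4_only = treatment_indicators(True, False, 4)
print(both_years)
print(grade4_only)
```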
Level-2 row-factor file. The level-2 row-factor units in the illustration are 4216 students. The data are
stored in the file STUDENT.SAV. The level-2 data for the first ten children are listed in Figure 15.2.
The file has one dummy variable.
Figure 15.2 First 10 cases in the STUDENT.SAV dataset
Level-2 column-factor file. The level-2 column-factor (teacher) file, TEACHER.SAV, has two IDs and a
dummy variable. The first ID is the level-3 (i.e., school) ID and the second ID is the level-2 column
factor (i.e., teacher) ID. Figure 15.3 lists the data for the first ten records.
Level-3 file. The level-3 (school) file, SCHOOL.SAV, has the level-3 (school) ID and a dummy
variable. Figure 15.4 lists the data for the first ten records.
In sum, there are six variables at level 1 and one dummy variable for each of the level-2 row- and
column-factor files and the level-3 file. The steps for the construction of the MDM for HCM3 are
similar to the ones described in Section 2.5.1.1. The user will select HCM3 in the Select MDM type
dialog box (see Figure 2.5). Note that the program can handle missing data at level 1 or within cell
only. The MDM template file, GROWTH.MDMT, contains a log of the input responses used to create
the MDM file, GROWTH.MDM, using GROWTH.SAV, STUDENT.SAV, TEACHER.SAV, and
SCHOOL.SAV. Figure 15.5 displays the dialog box used to create the MDM file. Figure 15.6 shows the
dialog box for the level-1 file.
Figure 15.5 Make MDM - HCM3 dialog box for GROWTH.MDMT
Figure 15.6 Choose variables - HCM3 dialog box for level-1 file, GROWTH.SAV
15.2 Executing analyses based on the MDM file
Once the MDM file is constructed, it can be used as input for the analysis. Model specification has
five steps:
Following the five steps above, we specify a model to study the effects of time-varying instructional
treatments on student achievement. The Windows execution is very similar to the one for HCM2 as
described in Section 13.4. The command file, GROWTH1.MLM, contains the model specification
input responses. Figure 15.7 displays the model specified.
[Figure 15.7: model specification window, with the outcome, intercept, and YEAR slope annotated]
Summary of the model specified
Level-1 Model
Level-2 Model
τπ
                  YEAR
θ0,b00            θ1,b10
769.17514         -18.09880
-18.09880         21.22623

τπ (as correlations)
1.000             -0.142
-0.142            1.000

τβ
                  YEAR
θ0,c00            θ1,c10
133.52764         -24.04565
-24.04565         48.79836

τβ (as correlations)
1.000             -0.298
-0.298            1.000

τγ
                  YEAR
θ0,d00            θ1,d10
169.31794         28.10279
28.10279          29.76755

τγ (as correlations)
1.000             0.396
0.396             1.000
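The "as correlations" matrices follow from the covariance matrices by dividing each covariance by the product of the two standard deviations. The following minimal sketch (the function name `to_correlation` is hypothetical) recomputes them from the printed estimates.

```python
import math


def to_correlation(cov):
    """Convert a covariance matrix (list of lists) to a correlation matrix."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [
        [cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
        for i in range(len(cov))
    ]


tau_pi = [[769.17514, -18.09880], [-18.09880, 21.22623]]      # rows (students)
tau_beta = [[133.52764, -24.04565], [-24.04565, 48.79836]]    # columns (teachers)
tau_gamma = [[169.31794, 28.10279], [28.10279, 29.76755]]     # clusters (schools)

corr_pi = to_correlation(tau_pi)
corr_beta = to_correlation(tau_beta)
corr_gamma = to_correlation(tau_gamma)

for name, corr in [("tau_pi", corr_pi), ("tau_beta", corr_beta), ("tau_gamma", corr_gamma)]:
    print(name, round(corr[0][1], 3))
```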
The value of the log-likelihood function at iteration 485 = -3.536565E+004
Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1
  For INTERCEPT
    θ0,δ000           609.850986    1.962504         310.751   66             <0.001
For YEAR
  For INTERCEPT
    θ1,δ100           21.064011     1.140716         18.466    66             <0.001
For G4D1
  For INTERCEPT
    θ2,δ200           2.753381      2.371599         1.161     7338           0.246#
For G4D21
  For INTERCEPT
    θ3,δ300           0.231710      3.584218         0.065     7338           0.949#
For G5D22
  For INTERCEPT
    θ4,δ400           7.507799      2.332107         3.219     7338           0.002#
For TWOWAY
  For INTERCEPT
    θ5,δ500           1.160337      4.322456         0.268     7338           0.788#

The p-vals above marked with a "#" should be regarded as a rough approximation.
Final estimation of fixed effects (with robust standard errors)

Fixed Effect          Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
For INTRCPT1
  For INTERCEPT
    θ0,δ000           609.850986    1.954775         311.980   66             <0.001
For YEAR
  For INTERCEPT
    θ1,δ100           21.064011     1.112653         18.931    66             <0.001
For G4D1
  For INTERCEPT
    θ2,δ200           2.753381      2.927131         0.941     7338           0.347#
For G4D21
  For INTERCEPT
    θ3,δ300           0.231710      4.389057         0.053     7338           0.958#
For G5D22
  For INTERCEPT
    θ4,δ400           7.507799      3.019164         2.487     7338           0.013#
For TWOWAY
  For INTERCEPT
    θ5,δ500           1.160337      6.470068         0.179     7338           0.858#

The p-vals above marked with a "#" should be regarded as a rough approximation.
                    Standard     Variance
Random Effect       Deviation    Component    d.f.    χ2             p-value
θ0,b00              27.73401     769.17514    2172    11413.58016    <0.001
YEAR/θ1,b10          4.60719      21.22623    2172     2177.42726     0.463
level-1, e          17.45913     304.82130
Note: The chi-square statistics reported above are based on only 2173 of 4216 units that had sufficient
data for computation. Fixed effects and variance components are based on all the data.
Note: The chi-square statistics reported above are based on only 495 of 498 units that had sufficient data
for computation. Fixed effects and variance components are based on all the data.
                    Standard     Variance
Random Effect       Deviation    Component    d.f.    χ2           p-value
θ0,d00              13.01222     169.31794    64      256.96222    <0.001
Note: The chi-square statistics reported above are based on only 65 of 67 units that had sufficient data
for computation. Fixed effects and variance components are based on all the data.
As reported by Hong and Raudenbush (2008), there is no significant causal effect of Grade 4 treatment on
Grade 4 outcomes. There is a positive and significant effect of Grade 5 treatment on Grade 5 outcomes,
δˆ400 = 7.51 (SE = 3.019, t = 2.487) (see footnote 6).
Deviance = 70731.304874
Number of estimated parameters = 16
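As a quick arithmetic check on the output above, the t-ratio is the coefficient divided by its standard error, and the deviance is -2 times the log-likelihood at convergence. The sketch below uses the printed values; the variable names are ours.

```python
# Grade 5 treatment effect and its robust standard error, copied from the output.
coefficient = 7.507799
robust_se = 3.019164
t_ratio = coefficient / robust_se

# Log-likelihood at the final iteration, as printed (seven significant digits).
log_likelihood = -3.536565e4
deviance = -2.0 * log_likelihood

print(f"t-ratio  = {t_ratio:.3f}")
print(f"deviance = {deviance:.2f}")
```

The recomputed deviance agrees with the reported 70731.304874 only to the precision at which the log-likelihood is printed.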
15.3 Other program features
HCM3 provides options similar to those of HCM2. It also allows users to diagonalize τπ, τβ,
and τγ when estimating the variance components if interest focuses only on the diagonal elements of
any of the three matrices. In addition, design weights are allowed for level-1, level-2 row-factor, and
level-3 units.
6 We used an improved algorithm here, and thus the results differ slightly from those published in
Hong and Raudenbush (2008).
16 Conceptual and Statistical Background for Hierarchical
Linear Model with Cross-Classified Random Effects (HLMHCM)
In HCM2, level-1 units are nested within cells and cross-classified by two higher-level factors.
HLMHCM adds a level within the cells. For example, we may have a growth model for each of a set
of students, all of whom live in the same neighborhood and attend the same school. We would say
that level-1 units (repeated measures) are nested within level-2 units (children); level-2 units are
crossed by rows (neighborhoods) and columns (schools). Another example might involve repeated
item responses at a given time for a student encountering a given teacher. The level-1 units are the
item responses, nested within occasions (level-2) crossed by rows (students) and columns (teachers).
16.1 The general hierarchical linear model with cross-classified random
effects
A general hierarchical HLMHCM has three sub-models: a level-1 model and a level-2 model within
each cell; and a level-3 model or between-cell model that incorporates row and column effects.
Formally, there are m = 1, 2, …, nijk level-1 units (e.g., repeated measurements of student achievement)
nested within i = 1, …, njk level-2 units (e.g., students), which are in turn nested within cells
cross-classified by j = 1, ..., J rows (e.g., neighborhoods) and k = 1, ..., K columns (e.g., schools).

Here is an example of a data layout for three waves of developmental data (nijk = 3) nested within
10 students, who are nested within cells cross-classified by J = 3 neighborhoods (rows) and K = 3 schools
(columns):
Table 16.1 indicates that the repeated developmental data are nested within individual students
nested within cells cross-classified by neighborhoods and schools. Note that unlike in HCM3, the
students never leave the neighborhood or school of origin.
$$
\begin{aligned}
Y_{mijk} &= \psi_{0ijk} + \psi_{1ijk} a_{1ijk} + \psi_{2ijk} a_{2ijk} + \cdots + \psi_{Pijk} a_{Pijk} + \xi_{mijk} \\
         &= \psi_{0ijk} + \sum_{p=1}^{P} \psi_{pijk} a_{pijk} + \xi_{mijk}
\end{aligned}
\tag{16.1}
$$

where
$\psi_{0ijk}$ is the intercept, the expected value of $Y_{mijk}$ when all explanatory variables are set to zero;
$\psi_{pijk}$ are level-1 coefficients of predictors $a_{pijk}$ ($p = 1, 2, \ldots, P$);
$\xi_{mijk}$ is the level-1 random effect; and
$\sigma^2$ is the variance of $\xi_{mijk}$, that is, the level-1 variance. Here we assume that the random term
$\xi_{mijk} \sim N(0, \sigma^2)$.
$$
\psi_{pijk} = \pi_{p0jk} + \sum_{q=1}^{Q_p} \pi_{pqjk} \alpha_{qijk} + e_{pijk} \tag{16.2}
$$

where
$\pi_{p0jk}$ is the intercept, the expected value of $\psi_{pijk}$ when all explanatory variables are set to zero;
$\pi_{pqjk}$ are level-2 coefficients of predictors $\alpha_{qijk}$ ($q = 1, 2, \ldots, Q_p$);
$e_{pijk}$ is the level-2 or within-cell random effect; and
$\tau$ is the variance-covariance matrix of the $e_{pijk}$, that is, the level-2 variance. The vector containing
elements $e_{pijk}$ is assumed multivariate normal with mean vector zero and full covariance matrix $\tau$,
that is, $e_{pijk} \sim N(0, \tau)$.
16.1.3 Level-3 model or "between-cell" model
Each of the π pqjk (q = 0, 1, …, Qp) coefficients in the level-2 or within-cell model becomes an
outcome variable in the level-3 or between-cell model:
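$$
\begin{aligned}
\pi_{pqjk} = \theta_{pq0} &+ (\beta_{pq1} + b_{pq1j}) X_{1k} + (\beta_{pq2} + b_{pq2j}) X_{2k} + \cdots + (\beta_{pqR_{pq}} + b_{pqR_{pq}j}) X_{R_{pq}k} \\
&+ (\gamma_{pq1} + c_{pq1k}) W_{1j} + (\gamma_{pq2} + c_{pq2k}) W_{2j} + \cdots + (\gamma_{pqS_{pq}} + c_{pqS_{pq}k}) W_{S_{pq}j} \\
&+ b_{pq0j} + c_{pq0k}
\end{aligned}
$$

where
$\theta_{pq0}$ is the model intercept;
$\beta_{pqr}$ are the fixed coefficients of column-specific predictors $X_{rk}$, $r = 1, \ldots, R_{pq}$;
$b_{pqrj}$ are the random effects associated with column-specific predictors $X_{rk}$. They vary
randomly over rows $j = 1, \ldots, J$;
$\gamma_{pqs}$ are the fixed coefficients of row-specific predictors $W_{sj}$, $s = 1, \ldots, S_{pq}$;
$c_{pqsk}$ are the random effects associated with row-specific predictors $W_{sj}$. They vary randomly
over columns $k = 1, \ldots, K$; and
$b_{pq0j}$ and $c_{pq0k}$ are residual row- and column-specific random effects, respectively, on $\pi_{pqjk}$,
after taking into account $X_{rk}$ and $W_{sj}$.

The vector containing elements $b_{pqrj}$ is assumed multivariate normal with mean vector zero and full
covariance matrix $\Omega$. Similarly, the vector with elements $c_{pqsk}$ is assumed multivariate normal with
mean vector zero and full covariance matrix $\Delta$.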
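To make the cross-classified structure concrete, the following sketch simulates the simplest special case of the between-cell model: no row or column predictors, so that a cell's coefficient is an intercept plus a residual row effect plus a residual column effect. The function name, the scalar (rather than multivariate) normal draws, and all numeric values are illustrative assumptions, not part of HLMHCM.

```python
import random

def simulate_cell_intercepts(n_rows, n_cols, theta, sd_row, sd_col, seed=0):
    """Draw residual row effects b_j and column effects c_k, then build
    theta + b_j + c_k for every row-by-column cell."""
    rng = random.Random(seed)
    b = [rng.gauss(0.0, sd_row) for _ in range(n_rows)]  # row effects, e.g. neighborhoods
    c = [rng.gauss(0.0, sd_col) for _ in range(n_cols)]  # column effects, e.g. schools
    cells = {(j, k): theta + b[j] + c[k]
             for j in range(n_rows) for k in range(n_cols)}
    return cells, b, c

# Hypothetical settings: 3 rows, 3 columns, made-up intercept and standard deviations.
cells, b, c = simulate_cell_intercepts(n_rows=3, n_cols=3, theta=2.0,
                                       sd_row=0.3, sd_col=0.2)
for (j, k), value in sorted(cells.items()):
    print(f"row {j}, column {k}: {value:.3f}")
```

Two cells in the same row differ only by their column effects, which is what distinguishes crossed random effects from a purely nested structure.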
17 Working with HLMHCM
17.1 An example using HLMHCM in Windows mode
HLMHCM analyses can be executed in Windows, interactive, and batch modes. We describe a
Windows execution below. We consider interactive and batch execution in Appendix I. A number of
special options are presented at the end of the chapter.
Chapter 8 in Hierarchical Linear Models and Chapter 4 of this manual provide examples of HLM3
analyses of repeated measures data nested within students within schools, collected by the US
Sustaining Effects Study and by an urban school effects study, respectively. To illustrate the
operation of the HLMHCM program, we perform another achievement growth analysis. Unlike the
previous examples, however, this analysis considers not only the school context but also the
neighborhood context within which the students resided. The data were obtained from 567 students
from 224 schools in 74 urban neighborhoods; repeated achievement measures are nested
within students, who are cross-classified by schools and neighborhoods. We chose a similar set of
covariates to allow users to compare and contrast this set of models with the HLM3 models executed in
Chapter 4.
Level-1 file. The level-1 or within-cell file, GROWTH.SAV, has 2008 observations collected on 567
students beginning at grade one and followed up annually thereafter for six years. Figure 17.1 shows
the time series data for the first three students. All three have complete data; more typically, there are
three or four observations per child. Following the student ID field are that student's values on two
variables:
• AGE8  The age of the child minus 8 at each testing occasion. Therefore, it is 0 at age 8, 1 at age 9,
etc.
• MATH  A math test score in an IRT metric.
Figure 17.1 First 18 records in the GROWTH.SAV dataset
We see that the first student was about seven and a half years old (AGE8 = –0.420) during the first
data collection wave with a math score of 2.1.
Level-2 file. The level-2 units in the illustration are 567 students. The data are stored in the file
STUDENT.SAV. The level-2 data for the first eight children are listed in Figure 17.2. The first ID is
the level-3 row-factor (i.e., school) ID, the second ID is the level-3 column-factor (i.e., neighborhood) ID,
and the third ID is the level-2 (i.e., student) ID. Note that the level-2 files must be sorted in the
same order of level-2 ID.
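Before building the MDM file, it can be useful to confirm the sort order outside the program. The sketch below is one plausible reading of the requirement, namely that the level-1 records appear grouped by student and in the same student order as the level-2 file. The function name and the ID values are hypothetical and are not taken from GROWTH.SAV or STUDENT.SAV.

```python
from itertools import groupby

def files_share_id_order(level1_ids, level2_ids):
    """True if level-1 records are grouped by student and the groups appear
    in the same order as the students in the level-2 file."""
    # Collapse runs of identical IDs: [1, 1, 2, 2, 2] -> [1, 2].
    collapsed = [student_id for student_id, _ in groupby(level1_ids)]
    return collapsed == list(level2_ids)

# Hypothetical student IDs for illustration only.
level2_ids = [1, 2, 5]
sorted_level1 = [1, 1, 1, 2, 2, 5, 5, 5]
unsorted_level1 = [2, 2, 1, 1, 1, 5, 5, 5]

print(files_share_id_order(sorted_level1, level2_ids))
print(files_share_id_order(unsorted_level1, level2_ids))
```

This check also fails when a student's records are split into non-adjacent runs, or when a student in the level-2 file has no level-1 records; whether the program tolerates the latter case is not stated here.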
We see, for example, that student 1, who attended school 175 and resided in neighborhood 68, is an
African-American male (FEMALE = 0, BLACK = 1, HISPANIC = 0).
Figure 17.2 First 10 cases in the STUDENT.SAV dataset
Level-3 row-factor file. The level-3 row-factor (school) level file, SCHOOL.SAV, consists of data on one
variable for 224 schools. The variable is SCHPOV, an indicator of school poverty, as
measured by the percentage of the total number of students enrolled in free or subsidized lunch
programs.
We see that the first school, school 1, has 91% of its students enrolled in free or subsidized lunch
programs.
Level-3 column-factor file. The level-3 column-factor (neighborhood) level file, NEIGH.SAV, consists of
data on one variable for 74 neighborhoods. The variable is DISADV, a scale measuring social
deprivation that incorporates information on the poverty concentration, health, and housing stock
of a local community. This measure of neighborhood disadvantage, constructed through an oblique
factor analysis of the 1990 decennial census data, tapped the level of poverty and unemployment,
the percentage of families headed by females, and the percentage on welfare (Sampson &
Raudenbush, 1999; Sampson, Raudenbush, & Earls, 1997).
Figure 17.4 First 8 cases in the NEIGH.SAV data set
In sum, there are two variables at level 1, three at level 2, and one for each of the level-3 factors.
Figure 17.6 Choose variables HLMHCM dialog box for level-1 file, GROWTH.SAV
Figure 17.7 Choose variables HLMHCM dialog box for level-2 file, STUDENT.SAV
Figure 17.8 Choose variables HLMHCM dialog box for level-3 row-factor file, SCHOOL.SAV
Figure 17.9 Choose variables HLMHCM dialog box for level-3 column-factor file,
NEIGH.SAV
1. Specification of the level-1 model. In our case we shall model mathematics achievement (MATH)
as the outcome, to be predicted by AGE8. Hence, the level-1 model will have two coefficients for
each student: the intercept and the AGE8 slope.
2. Specification of the level-2 prediction model. Here each level-1 coefficient – the intercept and
the AGE8 slope in our example – becomes an outcome variable. We may select certain student
characteristics to predict each of these level-1 coefficients. In principle, the level-2 parameters
then describe the distribution of growth curves cross-classified by schools and neighborhoods.
3. Specification of level-1 coefficients as random or non-random across level-two units. We shall
model the intercept and the AGE8 slope as varying randomly across the students cross-classified
by schools and neighborhoods.
4. Specification of the level-3 row- and/or column-factor prediction model. Here each level-2
coefficient becomes an outcome, and we can select row- and/or column-factor variables to
predict school-to-school and neighborhood-to-neighborhood variation in these level-2 coefficients. In
principle, this model specifies how schools and neighborhoods differ with respect to the
distribution of growth curves within them.
5. Specification of the residual row and column effects as random or non-random, the effects associated
with row-specific predictors as varying randomly or fixed over columns, and the effects
associated with column-specific predictors as varying randomly or fixed over rows. We shall test
whether the associations between neighborhood disadvantage (a column-specific predictor) and
growth parameters vary over schools.
Following the five steps above, we first specify a model with no student-, neighborhood-, or school-
level predictors. The Windows execution is very similar to the one for HCM2 as described in Section
11.2. The command file, GROWTH1.HLM, contains the model specification input responses. Figure
17.10 displays the model specified.
[Figure 17.10: model specification window, with callouts for the outcome, the intercept, and the AGE8 slope]
Level-2 Model
Row/Column Model
τ
    INTRCPT1         AGE8
    INTRCPT2,e0      INTRCPT2,e1jk
    0.27574          0.07972
    0.07972          0.03283

τ (as correlations)
    1.000    0.838
    0.838    1.000

Note that the correlation between true status at age 8 and true rate of change is
estimated to be 0.838 for students in the same cell cross-classified by schools and neighborhoods.
Ω
    INTRCPT1         AGE8
    INTRCPT2         INTRCPT2
    ICPTROW,b000     ICPTROW,b100
     0.10927         -0.00606
    -0.00606          0.00580

Ω (as correlations)
     1.000   -0.241
    -0.241    1.000

Note that the correlation between true school-mean status at age 8 and true school-mean
rate of change is estimated to be -0.241.
Δ
    INTRCPT1         AGE8
    INTRCPT2         INTRCPT2
    ICPTCOL,c000     ICPTCOL,c100
    0.02840          0.01363
    0.01363          0.00720

Δ (as correlations)
    1.000    0.954
    0.954    1.000

Note that the correlation between true neighborhood-mean status at age 8 and true
neighborhood-mean rate of change is estimated to be 0.954.
The value of the log-likelihood function at iteration 814 = -1.917348E+003
                                      Standard               Approx.
Fixed Effect            Coefficient   error       t-ratio    d.f.     p-value
For INTRCPT1, π0
  INTRCPT2,
    INTERCEPT, θ00         2.257403   0.042925     52.589     274     <0.001
For AGE8, π1
  INTRCPT2,
    INTERCEPT, θ10         0.880177   0.016734     52.598     274     <0.001
The above table indicates that the average growth rate is significantly positive at 0.880 logits per
year, t = 52.598.
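The two fixed effects above define the average growth line in this unconditional model: expected MATH equals θ00 plus θ10 times AGE8. The sketch below evaluates that line at a few ages; the function name is ours, and the ages chosen are only for illustration.

```python
# Fixed-effect estimates copied from the table above.
THETA_00 = 2.257403   # expected MATH at AGE8 = 0, i.e. at age 8
THETA_10 = 0.880177   # expected yearly change in MATH

def expected_math(age_in_years):
    """Average predicted MATH score at a given age, using AGE8 = age - 8."""
    age8 = age_in_years - 8
    return THETA_00 + THETA_10 * age8

for age in (7, 8, 9, 10):
    print(f"age {age}: expected MATH = {expected_math(age):.3f}")
```

Individual students, schools, and neighborhoods deviate from this line according to the random effects reported in the variance-component tables.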
                    Standard     Variance
Random Effect       Deviation    Component    d.f.    χ2            p-value
INTRCPT1, e0        0.52510      0.27574      268     4818.18751    <0.001
AGE8, e1jk          0.18119      0.03283      268     1465.94774    <0.001
σ2,ε                0.40561      0.16452
Note: The chi-square statistics reported above are based on only 526 of 567 units that had sufficient data
for computation. Fixed effects and variance components are based on all the data.
The results above indicate significant variability among children cross-classified by schools and
neighborhoods in terms of mean status at AGE = 8 (χ2 = 4818.18751, df = 268) and in terms of yearly
rate of change (χ2 = 1465.94774, df = 268).
Final estimation of row level variance components
                                     Standard     Variance
Random Effect                        Deviation    Component    d.f.    χ2           p-value
INTRCPT1/ INTRCPT2/ ICPTROW,b000     0.33055      0.10927      224     87.39230     >0.500
AGE8/ INTRCPT2/ ICPTROW,b100         0.07616      0.00580      224     201.21512    >0.500

The results above indicate there is no significant variability among schools in terms of mean status
at AGE = 8 (χ2 = 87.39230, df = 224) or in terms of yearly rates of change (χ2 = 201.21512, df =
224).
                                     Standard     Variance
Random Effect                        Deviation    Component    d.f.    χ2            p-value
INTRCPT1/INTRCPT2/ ICPTCOL,c000      0.16851      0.02840      73      1316.77855    <0.001
AGE8/INTRCPT2/ ICPTCOL,c100          0.08484      0.00720      73      831.88840     <0.001

The results above indicate significant variability among neighborhoods in terms of mean status at AGE =
8 (χ2 = 1316.77855, df = 73) and in terms of yearly rates of change (χ2 = 831.88840, df = 73).
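One informal way to read the variance components for status at age 8 is as shares of their sum. This summary is not part of the HLMHCM output; it assumes the four components can be added, as is usual when the random effects are independent across levels. The labels below are ours.

```python
# Variance components for the intercept, copied from the unconditional model output.
components = {
    "within students (sigma^2)": 0.16452,
    "between students within cells (tau)": 0.27574,
    "between schools (Omega)": 0.10927,
    "between neighborhoods (Delta)": 0.02840,
}

total = sum(components.values())
shares = {name: value / total for name, value in components.items()}

for name, share in shares.items():
    print(f"{name}: {100 * share:.1f}% of total variance")
```

On this reading, most of the variation in status at age 8 lies between students within cells and within students over time, with smaller shares attributable to schools and neighborhoods.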
17.3 Specification of a level-2 and level-3 conditional model, with the effect
associated with a column-specific predictor fixed
The above example involves a model that is unconditional at all levels. In this model, we set up a
level-2 and a level-3 column-factor prediction model.

When you select the equation containing the ψ pijk to be modeled, a listbox for level-2 variables
(>>Level-2<<) will appear. Figure 17.12 shows the model with BLACK and HISPANIC as the level-2
predictors. In the interest of parsimony, all level-2 coefficients are fixed. (To specify any of them as
randomly varying, select the equation containing the specific regression coefficient, π pqjk, and click
on b pqrj and/or c pqsk.)
Figure 17.12 Level-2 prediction model for the growth study
When you select the equation containing the π pqjk to be modeled, a listbox for level-3 row-factor
variables (>>Row<<) will appear. To display level-3 column-factor variables, click on and the
corresponding listbox of variables. Figure 17.13 shows the level-3 column-factor prediction model
with DISADV as the covariate. In the level-3 model, we treat the association between neighborhood
disadvantage and the growth parameters as fixed across all schools. Note that b001j and b101j are
disabled. We relax this assumption in our next model.
Figure 17.13 Conditional model for the growth study, with neighborhood disadvantage
effect fixed
Level-1 Model
Level-2 Model
Row/Column Model
σ2 = 0.16386
τ
    INTRCPT1         AGE8
    INTRCPT2,e0      INTRCPT2,e1jk
    0.27546          0.08088
    0.08088          0.03538

τ (as correlations)
    1.000    0.819
    0.819    1.000
Ω
    INTRCPT1         AGE8
    INTRCPT2         INTRCPT2
    ICPTROW,b000     ICPTROW,b100
     0.09506         -0.00711
    -0.00711          0.00320

Ω (as correlations)
     1.000   -0.408
    -0.408    1.000
Δ
    INTRCPT1         AGE8
    INTRCPT2         INTRCPT2
    ICPTCOL,c000     ICPTCOL,c100
    0.01332          0.00656
    0.00656          0.00338

Δ (as correlations)
    1.000    0.979
    0.979    1.000
Final estimation of fixed effects:
                                      Standard               Approx.
Fixed Effect            Coefficient   error       t-ratio    d.f.     p-value
For INTRCPT1, π0
  INTRCPT2,
    INTERCEPT, θ00         2.639580   0.090173     29.272     270     <0.001
    DISADV, γ001          -0.001726   0.050288     -0.034     222      0.973
  BLACK,
    INTERCEPT, θ01        -0.443355   0.103660     -4.277     270     <0.001
  HISPANIC,
    INTERCEPT, θ02        -0.468207   0.098680     -4.745     270     <0.001
For AGE8, π1
  INTRCPT2,
    INTERCEPT, θ10         0.933753   0.035488     26.312     270     <0.001
    DISADV, γ101          -0.050330   0.020853     -2.414     222      0.016
  BLACK,
    INTERCEPT, θ11        -0.105109   0.040518     -2.594     270      0.010
  HISPANIC,
    INTERCEPT, θ12        -0.036124   0.038978     -0.927     270      0.354
Note: The chi-square statistics reported above are based on only 526 of 567 units that had sufficient data
for computation. Fixed effects and variance components are based on all the data.
                                     Standard     Variance
Random Effect                        Deviation    Component    d.f.    χ2           p-value
INTRCPT1/ INTRCPT2/ ICPTROW,b000     0.30832      0.09506      224     79.66634     >0.500
AGE8/ INTRCPT2/ ICPTROW,b100         0.05653      0.00320      224     182.46985    >0.500
                                     Standard     Variance
Random Effect                        Deviation    Component    d.f.    χ2            p-value
INTRCPT1/INTRCPT2/ ICPTCOL,c000      0.11543      0.01332      73      2085.34935    <0.001
Deviance = 3800.651318
Number of estimated parameters = 18
The results suggest that:
• Compared to their reference group (non-Black and non-Hispanic students), African American and
Hispanic American students on average had lower mathematics scores at age 8. Also, African
American students had a significantly lower growth rate in mathematics achievement
(θ11 = -0.105, t = -2.594) than did white students.
• Neighborhood disadvantage had a negative association with the growth rate of the reference
group (γ101 = -0.050, t = -2.414).
• The column-level variance at level 3 of each growth parameter was substantially reduced
(> 50%). The residual variances between neighborhoods in c000 (estimated at 0.01332) and in
c100 (estimated at 0.00338) are less than half of those in the unconditional model (0.02840
and 0.00720).
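The reduction in neighborhood-level variance can be checked directly as the proportion of the unconditional variance that is no longer present in the conditional model. The function name below is ours; the four variances are the printed estimates.

```python
def proportion_reduced(unconditional, conditional):
    """Share of the unconditional variance removed by adding predictors."""
    return (unconditional - conditional) / unconditional

# Neighborhood (column-level) variances from the two models.
intercept_reduction = proportion_reduced(0.02840, 0.01332)   # c000
slope_reduction = proportion_reduced(0.00720, 0.00338)       # c100

print(f"c000: {100 * intercept_reduction:.1f}% reduction")
print(f"c100: {100 * slope_reduction:.1f}% reduction")
```

Both proportions exceed one half, consistent with the "> 50%" statement above.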
18 Graphing Data and Models
HLM2 and HLM3 provide the ability to make data-based and model-based graphs. Data-based graphs
allow examination of univariate and bivariate distributions. Model-based graphs, which can be produced
by the HLM2, HLM3, HMLM, HMLM2 and HCM2 modules of WHLM, facilitate visualization and
presentation of analytic results for the whole or a subset of the population of interest. They also enable
users to check the tenability of underlying model assumptions.
6. Specify the arrangement of the plots by either (a) the original order of the groups as they appear
in the data set or (b) the median in an ascending order. Click on the selection button for median
in the Sort by section to arrange the box-and-whisker plots of MATHACH by median in an
ascending order (see Figure 18.2).
Figure 18.1 Choose Y for box plot dialog box
Figure 18.2 Choose Y for box plot dialog box for the MATHACH example
[Figure 18.3: box-and-whisker plots of MATHACH (y-axis) by school (x-axis)]
The figure gives side-by-side graphical summaries of the distributions of MATHACH for the sixteen
schools, sorted by median. The x-axis denotes the number of schools in the display and the y-axis
denotes mathematics achievement. The plot tells us that the first school from the left has a median
score of about 6.05, which is the lowest school median in this group. The distribution of the scores of
the students in this school is positively skewed, and there is an outlier at the upper end.
The third and the fourth schools from the left have similar distributions of mathematics scores.
Compared to the distribution of the scores of the adjacent school on the right, however, the scores of
these two schools display greater variability, as indicated by the lengths of the boxes (the interquartile
ranges). In addition, there is an outlier at the upper end of the distribution for the fifth school. The
highest median mathematics score among the 16 schools was 19.08.
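The quantities read off the plot (median, interquartile range, outliers, and the ordering of schools by median) can be sketched as follows. The scores and school labels are made up, and the 1.5 × IQR outlier rule is a common convention that we assume here; the rule WHLM actually uses is not stated in this section.

```python
import statistics

def box_summary(scores):
    """Median, quartiles, IQR, and outliers under an assumed 1.5 * IQR rule."""
    q1, _, q3 = statistics.quantiles(scores, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [s for s in scores if s < low_fence or s > high_fence]
    return {"median": statistics.median(scores), "q1": q1, "q3": q3,
            "iqr": iqr, "outliers": outliers}

# Hypothetical schools and scores, not taken from the HS&B data.
schools = {
    "A": [4, 5, 6, 7, 8],
    "B": [10, 12, 13, 15, 30],
    "C": [1, 2, 3, 4, 5],
}

# Arrange schools by median in ascending order, as in the "Sort by" option.
ordered = sorted(schools, key=lambda school: statistics.median(schools[school]))
for school in ordered:
    print(school, box_summary(schools[school]))
```

A longer box corresponds to a larger `iqr` value, which is the sense in which two schools can be said to differ in variability.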
8. (Optional) WHLM allows users to list the raw data of a specific group that is graphically
summarized in one of the box-and-whisker plots as well. To see the data of a specific level-2
unit, click on one of the box-and-whisker plots (near the median is usually a good place) in
Figure 18.3, which brings up the following dialog box:
Figure 18.4 Box & Whisker Attributes dialog box
For a description of the options, see Table 18.1.
Click Data and a dialog box containing the data of the selected group will appear. In our example,
we examine the raw scores of the school with the highest median (see Figure 18.5). The title bar of
Figure 18.5 tells us that the level-2 ID of the box-and-whisker plot we selected is 3427. # is a zero-based
counter for group plots. As the box-and-whisker plots are plotted individually in the example, it is 0.
X tells us that the data are from the thirteenth school displayed on the plot. Y1 to Y11 list the
mathematics scores for the first eleven students in School 3427. Move the bottom scroll box to the left
to display more scores.
10. (Optional) To print the graph, open the File menu and select Print current page, or Print selected
graph when there is more than one graph. Users can choose Printing Options... to change
printing parameters such as choice of background, border type, aspect ratio (the ratio of the
x-axis length to the y-axis length; the default is 5/3), and printing style.
Table 18.1 Definitions and options in the Box & Whisker Attributes dialog box
11. To save the graph for future use, open the File menu and choose Save as metafile. A
Save as dialog box will open. Enter a filename for the file and click OK. The file can be saved
as an Enhanced Metafile (.EMF) (the default, and preferred as it holds more information than the
other option) or as a Windows Metafile (.WMF). Users can use word processing programs to insert
the graph file into the text. For example, to insert the saved .EMF file into Word, choose
Insert...Picture...From File from Word's main menu.
12. (Optional) To make modifications to the specifications, select Graph Settings. The Equation
Graphing dialog box will appear. We are going to illustrate this by adding a level-2
classification variable next.
13. After choosing the Y-Axis variable, select the level-2 classification variable in the Z-focus drop-
down listbox. There are two types of level-2 classification variables, categorical and continuous.
For categorical variables, WHLM will classify the plots with the levels of the variables. For
continuous variables, users can choose either to dichotomize them using median splits, or
trichotomize them into three groups: (a) 0 to 24th percentile; (b) 25th to 75th percentile; and (c)
76th percentile and above. These two options, available when a continuous classification variable
is chosen, can be found in the lower Z-focus drop-down listbox. In our example, we will choose
school sector, Catholic vs. public school, as the classification variable. To continue working on
the plot we have just made, click Graph Settings to open the Equation Graphing dialog box.
Select SECTOR in the Z-focus dialog box. The following graph will be displayed (see Figure
18.6).
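The two ways of grouping a continuous classification variable described in this step can be sketched as below. How WHLM computes percentiles and handles ties is not documented here, so the percentile-rank definition (percentage of values strictly below) and the treatment of values equal to the median are assumptions, as are the function names and labels.

```python
import statistics

def median_split(values):
    """Dichotomize: values below the median vs. at or above it (tie rule assumed)."""
    med = statistics.median(values)
    return ["lower" if v < med else "upper" for v in values]

def percentile_rank(values, x):
    """Percentage of values strictly less than x (one of several possible definitions)."""
    return 100.0 * sum(v < x for v in values) / len(values)

def trichotomize(values):
    """Three groups: below the 25th percentile, 25th to 75th, and above the 75th."""
    labels = []
    for v in values:
        p = percentile_rank(values, v)
        labels.append("low" if p < 25 else "middle" if p <= 75 else "high")
    return labels

# Hypothetical values of a continuous level-2 variable.
z = [1, 2, 3, 4, 5, 6, 7, 8]
print(median_split(z))
print(trichotomize(z))
```

For a categorical variable such as SECTOR, no such grouping is needed, since the plots are classified by the levels of the variable itself.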
[Plot: box-and-whisker plots of MATHACH (y-axis) by school, with legend SECTOR = 0 and SECTOR = 1]
Figure 18.6 Box-and-whisker plots for MATHACH for a random sample of schools as
classified by school sector
In the graph, the box-and-whisker plots for Catholic and public schools are coded differently (red for
Catholic and blue for public schools). The colored graphs (not shown here) suggest that the three
schools that have the highest median mathematics scores are Catholic schools. The school with the
lowest average belongs to the public sector.
Users can edit the legends by clicking on them in the graph above to open the Legend Parameters
dialog box (see Figure 18.7), which allows them to make changes in the titles of the legends, their
sizes and font types, and the display of the legend box. For example, one may like to change
SECTOR = 0 in the text box of Figure 18.7 to PUBLIC = 0 and SECTOR = 1 to CATHOLIC = 1.
18.1.2 Scatter plots
In the previous section, we illustrated how to graphically summarize and compare univariate
distributions of level-1 variables, with and without a level-2 classification variable. Now we
demonstrate how to use data-based scatter plots to explore bivariate relationships between level-1
variables for an individual level-2 unit or a group of level-2 units, with and without controlling for
level-2 variables. We will continue to use the HS&B data set, and we are going to examine the
relationships between MATHACH and SES for a group of schools or individual schools, with and without
controlling for the sector of the school.
7. Select type of plot. Users can select one of the two major types of plots: (a) scatter plot; and (b)
line plot with and without markers or asterisks showing where the data points are. Click the
selection button for Scatter plot (default) for this example.
8. Select type of pagination. There are three options: (a) all groups on the same graph (default); (b)
one graph per group, with a maximum of eight graphs displayed on one page; and (c) one graph per
group, displayed on multiple pages. In this example, we will display the bivariate
relationship between SES and MATHACH for all the selected schools on a single graph. We
choose the option All groups on same graph accordingly.
9. Click OK to make the scatter plot. This gives us the following graph (see Figure 18.9), indicating
a moderate positive association between SES and MATHACH, and suggesting that both variables
have "ceilings" (upper limits).
10. For more information on the editing, printing, saving and modification options, see Steps 11 to
13 in Section 18.1.1.
[Scatter plot: MATHACH (y-axis) against SES (x-axis)]
Figure 18.9 Scatter plot for the 20% random sample of cases
11. After specifying the variables for the x- and y-axis, select the controlling variable from the
Z-focus drop-down listbox. As in the case of the box-and-whisker plots, users can choose either a
categorical or a continuous controlling variable (see Step 14 in Section 18.1.1). In our example,
we will choose school sector, Catholic vs. public school, as the controlling variable. To continue
working on the scatter plot we have just made, click Graph Settings to open the Equation
Graphing dialog box. Select SECTOR in the Z-focus dialog box. The following graph will be
displayed (see Figure 18.10).
[Figure 18.10: scatter plot of MATHACH (y-axis) against SES (x-axis), with legend SECTOR = 0 and SECTOR = 1]
It may be helpful to use a different pagination option to discern the relationships for these
two groups of schools. Instead of having all the groups on the same graph, we select the 1
graph/group, multiple/page pagination option. This gives us Figure 18.11, where we see how the
two groups of schools vary in their SES and MATHACH distributions. Note, for example, that school
8946 has high levels of SES and that in school 4325, the association between SES and MATHACH
appears a bit stronger than in several of the other schools. WHLM puts a maximum of 8 groups in a
window. We can page back and forth using the -> and <- buttons in the lower right corner of the
window to display the scatter plots for other schools.
[Figure 18.11: scatter plots of MATHACH (y-axis) against SES (x-axis), one panel per school (e.g., Lev-id 1637 and Lev-id 3088), with legend SECTOR = 0 and SECTOR = 1]
As an elaboration of this, we can also choose on the Graph Settings dialog box to have each
group's plot in a separate graph by choosing 1 graph/group, 1/page, as shown below:
[Plot: scatter plot of MATHACH (y-axis) against SES (x-axis) for Lev-id 5640, with legend SECTOR = 0 and SECTOR = 1]
• AGE Age in months
• VOCAB Vocabulary size
• AGE12 Age in months minus 12
• AGE12Q AGE12*AGE12
The level-2 data file, VOCABL2.SAV, consists of 22 children and an indicator variable for gender
To prepare a scatter plot
Figure 18.13 Choose X and Y variables dialog box for line plot of VOCAB and AGE
9. Click OK to make the line plot. The following graph will appear.
[Line plot: VOCAB (y-axis) against AGE (x-axis)]
Figure 18.14 Line plot of the vocabulary score vs. the age of the child
We see that, for all children, vocabulary size is near zero at around a year of age (12 – 15 months)
and that for each child, vocabulary size increases, typically quite rapidly during the second year of
life.
[Line plot: VOCAB (y-axis) against AGE (x-axis), with legend FEMALE = 0 and FEMALE = 1]
Figure 18.15 Cubic interpolation line plot of the difference between boys and girls
18.2 Model-based graphs – two level
The level-2 data file, NYSB2.SAV, consists of 241 youths and three variables per participant.
At level-1, we formulate a polynomial model of order 2 using AGE16 and AGE16S (see Figure 18.16),
with FEMALE and MINORITY as covariates at level-2 modeling π0, the expected pro-deviant attitude
score at age 16 for subject j, and π1 and π2, the expected average linear and quadratic growth
rates for the pro-deviant attitude score, respectively. The procedure for setting up the model is given in
Section 2.5.2. We will ask WHLM to graph the predicted values of pro-deviant attitude scores at different
ages for different gender-by-ethnicity groups.
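The predicted values that WHLM graphs for this kind of model can be sketched by hand: each of π0, π1, and π2 is a linear function of FEMALE and MINORITY, and the prediction is π0 + π1·AGE16 + π2·AGE16S with AGE16S = AGE16². All coefficient values below are hypothetical placeholders, not estimates from the NYS data, and the function and dictionary names are ours.

```python
# HYPOTHETICAL coefficients: (intercept, FEMALE effect, MINORITY effect) for each pi.
coefs = {
    "pi0": (0.30, -0.05, 0.02),    # status at age 16
    "pi1": (0.03, 0.01, -0.01),    # linear growth rate
    "pi2": (-0.01, 0.00, 0.005),   # quadratic growth rate
}

def level2_value(name, female, minority):
    """Evaluate one level-1 coefficient from its level-2 equation."""
    intercept, female_effect, minority_effect = coefs[name]
    return intercept + female_effect * female + minority_effect * minority

def predicted_attitude(age16, female, minority):
    """Predicted score from the order-2 polynomial in AGE16."""
    age16s = age16 ** 2  # the transformed variable AGE16S
    return (level2_value("pi0", female, minority)
            + level2_value("pi1", female, minority) * age16
            + level2_value("pi2", female, minority) * age16s)

# One curve per gender-by-ethnicity group, evaluated at a few values of AGE16.
for female in (0, 1):
    for minority in (0, 1):
        curve = [round(predicted_attitude(a, female, minority), 3) for a in (-2, -1, 0, 1, 2)]
        print(f"FEMALE={female}, MINORITY={minority}: {curve}")
```

Replacing the placeholder coefficients with the fixed-effect estimates from an actual run would give the four curves that the equation graph displays.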
Figure 18.16 A polynomial model of order 2 with FEMALE and MINORITY as level-2
covariates
1. After running the model, select Basic Settings to open the Basic Model Specifications –
HLM2 dialog box.
2. Enter a name for the graphics file. The default name is grapheq.geq.
3. Enter a title and name the output filename, save the command file, and run the analysis as
Figure 18.17 Equation Graphing – Specification dialog box
We now proceed to select the predictor variables and specify their ranges or values, and choose the
graphing functions and the various attributes of the plot for the polynomial model represented in
Figure 18.16, as described in Steps 5 to 14 below.
5. Select AGE16 in the X focus Level 1 drop-down listbox to graph pro-deviant attitude score as a
function of age.
6. Select Entire range in the Range of x-axis drop-down listbox to include the entire range of age
on the x axis in the graph.
Table 18.2 Definitions and options in the Equation Graphing dialog box
7. Click 1 in the Categories/transforms/interactions section and select power of x/z for
Polynomial relationships. An Equation Graphing - power dialog box will open (see Figure
18.18).
Figure 18.18 Equation Graphing – power dialog box
8. The textbox to the left of the equal sign is for the entry of the transformed variable. Select
AGE16S in the drop-down listbox (see Figure 18.19). The textbox to the right is for the entry of
the original variable. AGE16 will appear in the drop-down listbox as it is the only level-1
variable left. Enter 2 in the textbox for the power to be raised. Click OK.
Figure 18.19 Equation for the transformed variable AGE16S
9. Click Range/Legend/Color to specify the ranges for x- and y-axis (the default values are those
computed from the data), to enter legend and graph titles, and to select screen color (see Figure
18.20). Enter Pro-deviant attitude score as a function of age, gender and ethnicity in the
textbox for Graph title. Click OK.
10. Click the Other settings button and click the selection button for Smooth in For continuous x
section to display a set of smooth curves.
11. Select FEMALE in the Z focus(1) drop-down listbox to graph pro-deviant attitude score as a
function of age for male and female youths. Use the two actual values will appear in the
textbox for the Range of z-axis as FEMALE is an indicator variable. We will use this default
option.
12. Select MINORITY in the Z focus(2) drop-down listbox to graph pro-deviant attitude score as a
function of age for minority and non-minority male and female youths. Use the two actual
values will appear in the textbox for the Range of z-axis as MINORITY again is an indicator
variable. We will use this default option. See Figure 18.21 for the specifications for this growth
curve analysis example.
13. Click OK. A colored version of the plot (not displayed here) showing the relationship between
pro-deviant attitude score and age for different gender-by-ethnicity groups will appear (see
Figure 18.22). The curves indicate that there is a nonmonotonic and nonlinear relationship
between pro-deviant attitude scores and age for minority and non-minority male youths over the
five-year period. No such relationship, however, appears for minority and non-minority
female youths.
14. For information on the editing, printing, saving, and modification options, see Steps 11 to 13 in
section 18.1.1.
[Plot titled "Pro-deviant attitude score as a function of age, gender and ethnicity": ATTIT
(y-axis, 0.38 to 0.58) against AGE16 (x-axis, -2.00 to 1.96), with one curve for each combination
of FEMALE = 0/1 and MINORITY = 0/1]
Figure 18.22 Plot showing the relationship between pro-deviant attitude score and age for
different gender-by-ethnicity groups
18.2.2 Level-1 equation modeling
WHLM will also let us examine plots for individual level-2 units by just using the level-1 equation
instead of the entire model. For this example, we will be using the vocabulary data, VOCAB.MDM
described in section 18.1.2, and have run the following model:
1. After the model is run, select Graph Equations...Level-1 equation graphing from the File
Figure 18.24 Level-1 equation Graphing dialog box
For the definition of Number of groups, see step 5 in section 18.1. Table 18.2 describes and
explains the other options in the dialog box.
2. Select an X focus variable. In our example, we want the age of the child in months minus 12 to
be the X focus. Choose AGE12 from the X focus drop-down listbox.
3. Select number of groups. We will include all the children. Choose All groups (n=22) in the
Number of groups drop-down listbox.
4. Specify the relationship between the transformed and the original variable. The transformed
variable is AGE12S and the original variable is AGE12. Click 1 in the
Categories/transforms/interactions section and select power of x/z for Polynomial relationships.
An Equation Graphing - power dialog box will open. Select AGE12S from the drop-down listbox
to the left of the equal sign. AGE12 will appear in the drop-down listbox as it is the only level-1
variable left. Enter 2 in the textbox for the power to be raised. Click OK.
5. (Optional) click Range/Legend/Color to specify the ranges for x- and y-axis (the default values
are those computed from the data), to enter legend and graph titles, and to select screen color.
6. Click the Other settings button and click the selection button for Smooth in For continuous x
section to display a set of smooth curves. Click OK.
7. Click OK and we get the following figure that shows vocabulary size accelerates during the
second year of life. Note that the individual trajectories, as expected, are "smoother" than in the
comparable data-based graphs in Figure 18.14 in Section 18.1.3.
[Plot: VOCAB (y-axis, -21.70 to 779.35) against AGE12 (x-axis, 0 to 14.00)]
[Plot: VOCAB (y-axis, -21.70 to 779.35) against AGE12 (x-axis, 0 to 14.00), with separate curves
for FEMALE = 0 and FEMALE = 1]
1. After the model is run, select Graph Equations...Level-1 box whisker from the File menu,
Figure 18.27 Choose Y for box plot dialog box
For definitions of the options in the dialog box, see Section 18.1.1. Note that the variable for Y-axis,
level-1 residual has been pre-selected.
2. Select All groups (n=22) in the Number of groups to include all the 22 children in the display.
3. Click the selection button for median in the Sort by section to arrange the plots by median
order.
4. Click OK. The following graph will appear.
[Box-and-whisker plot: Level-1 Residual (y-axis, -59.49 to 81.90); x-axis from 0 to 24.00]
6. (Optional) Users can choose to include a level-2 classification variable when examining the
level-1 residuals. See Step 14 in Section 18.1.1.
data and model of the previous two sections, we now plot the level-1 residual against its predicted
value.
1. After the model is run, select Graph Equations...Level-1 residual vs predicted value from the
File menu, which will give us the following dialog box.
Figure 18.29 Choose X and Y variables
For definitions of the various options in the dialog box, see Section 18.1.2. Note that the X-axis
variable, Pred. val. and Y-axis variable, Level-1 residuals have been pre-selected.
2. Select All groups (n=22) in the Number of groups to include all the 22 children in the display.
3. Click the selection button for Scatter plot in the Type of plot section to request a scatter plot of
the predicted values by level-1 residuals.
4. Select All groups on same graph in the Pagination section to display all the residuals pooled
across the level-2 units. To examine the residuals for individual children, choose either of the
other pagination options.
5. Click OK.
[Scatter plot: Level-1 Residual (y-axis, -54.08 to 74.45) against predicted value (x-axis, -21.70
to 779.35)]
computational formulae). This enables us to compare level-2 units with respect to these two types of
estimates.
1. After the model is run, select Graph Equations...Level-2 EB/OLS coefficient confidence
intervals from the File menu, which will give us the following dialog box:
Figure 18.31 95% Confidence Intervals dialog box
For definitions of the various options regarding Y- and Z-focus and sorting, see Section 18.1.1.
2. Choose the randomly varying level-1 coefficient of interest. We will look at the coefficient for
the quadratic term or acceleration rate of vocabulary growth in this example. Choose AGE12S
from the Y-focus drop-down listbox.
3. Select All groups (n=22) in the Number of groups to include all the 22 children in the display.
4. Click the EB residual button in the Type of residual section to select the empirical Bayes
estimates.
5. Click OK. The following graph will appear.
The graph suggests that there is significant variation in the rate of acceleration in vocabulary growth
in children during the second year of life. For instance, the confidence intervals of the EB estimates
of the AGE12S coefficients for the last four children from the left do not overlap with those of the
first eleven children.
6. Users can look at the actual empirical Bayes estimates and their 95% confidence intervals of
individual level-2 units by clicking on the confidence interval plots.
7. (Optional) Users can choose to include a level-2 classification variable when examining the
confidence interval plots. See Step 14 in Section 18.1.1.
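The visual comparison of confidence intervals described above amounts to a simple overlap check, sketched here in Python. The interval endpoints are hypothetical, not values read from the plot, and the function name is our own. Note that non-overlap of two 95% intervals is an informal visual guide rather than a formal test of a difference.

```python
def intervals_overlap(a, b):
    """Return True if closed intervals a = (low, high) and b = (low, high) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical 95% confidence intervals for the AGE12S coefficient of three children.
child_a = (0.9, 1.6)
child_b = (1.4, 2.3)
child_c = (2.8, 3.7)

print(intervals_overlap(child_a, child_b))
print(intervals_overlap(child_a, child_c))
```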
[Confidence interval plot: AGE12S coefficient (y-axis, 0.61 to 4.08); x-axis from 0 to 24.00]
Figure 18.32 Confidence intervals of empirical Bayes estimates of the AGE12S coefficients
18.2.6 Graphing categorical predictors
Model graphs can be displayed in which predictor variables are categorical. Suppose, for example,
that the variable ETHNICITY has three possible values: BLACK, HISPANIC, and WHITE and that this
variable is represented by indicator variables for BLACK and HISPANIC, with WHITE serving as the
reference category. To represent ethnicity as a predictor, click the first box under
Categories/transformations/interactions. Next, click on define categorical variable. Then four
boxes will appear:
1. Under the box Choose first category from foci click on the variable that is the first of the
indicator variables in the model. In our example, this will be BLACK.
2. Under the box Possible choices click on any other indicators in the model that represent the
18.3 Three-level applications
Graphing with 3-level data is very similar to 2-level graphing. The only two differences are that
users can (a) group the plots at either level 2 or 3, and (b) choose exclusively a level-2 or level-3
classifying or conditioning variable. To illustrate these two differences, we will use EG.MDM as
described in Section 4.1. We will prepare line plots of the mathematics test score, MATH, to detect
trends over the course of the six-year study, grouped by the level-3 units, schools, and classified by a
level-3 variable, the socioeconomic composition of schools. The same logic applies to the
three-level model-based graphing procedures.
Figure 18.33 Choose X and Y variables dialog box
10. Click OK. The following graph will appear.
The eight line plots indicate the collection of students' growth trajectories of mathematics
achievement within individual schools. The schools varied in their number of students. There was a
generally positive average rate of growth across all schools.
[Eight panels, one per school (Lev-id 2020, 2040, 2180, 2330, 2340, 2380, 2390 and 2440), each
plotting MATH (y-axis, -4.66 to 4.28) against YEAR (x-axis, -2.75 to 2.75)]
Figure 18.34 Line plots of MATH against YEAR for eight schools
To include a level-3 classification variable
11. Now we want to look at the trajectories as classified by the socioeconomic composition of the
student body of a school. On the menu of the graph dialog box, click Graph Settings. Choose the
level-3 variable LOWINC, the percent of students from low income families, as a Z-focus
variable. As LOWINC is a non-dichotomous variable, we have an additional choice that was not
needed for our earlier dichotomous z-foci. In this case, we choose Above/Below 50th
percentile from the combo box immediately below where we chose LOWINC as the grouping
variable.
12. Click OK. The following graph will appear.
[Line plots of MATH (y-axis, -4.66 to 4.28) against YEAR (x-axis, -2.75 to 2.75), one panel per
school, with panels grouped under "LOWINC: lower half" and "LOWINC: upper half"]
Figure 18.35 Line plots of MATH against YEAR for eight schools by LOWINC
This shows us that schools with a greater percentage of students from low income families (upper
half) tend to have lower mathematics achievement than do schools with a smaller percentage of such
students. Compared to their peers in School 2020, for instance, students in School 2330 generally
have lower achievement across the six years.
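The idea behind the Above/Below 50th percentile choice can be sketched as a median split. The Python below is a minimal illustration with hypothetical school labels and LOWINC values; it assumes that units strictly above the median fall in the upper half, which may differ from how WHLM treats values equal to the median.

```python
from statistics import median


def split_at_median(values_by_unit):
    """Label each unit 'upper half' or 'lower half' relative to the median value."""
    cut = median(values_by_unit.values())
    return {
        unit: ("upper half" if value > cut else "lower half")
        for unit, value in values_by_unit.items()
    }


# Hypothetical schools and percentages of students from low income families.
lowinc = {"A": 12.0, "B": 35.5, "C": 61.0, "D": 88.2}
groups = split_at_median(lowinc)
print(groups)
```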
A Using HLM2 in interactive and batch mode
This appendix describes and illustrates how to use HLM2 in interactive and batch mode to construct
MDM files, to execute analyses based on the MDM file, and to specify a residual file to evaluate model
fit. It also lists and describes command keywords and options. References are made to the appropriate
sections in the manual where the procedures are described in greater detail.
A.1.1 Example: constructing an MDM file for the HS&B data using SPSS file input
In the computer session that follows, all responses entered by the user are typed in boldface. All text
presented in italics represents additional commentary we have added to help the user understand
what is happening in the program at that moment.
C:\HLM> HLM2 (type the program name at the system prompt to start)
(See Section 2.5 for a description of the variables in HSB1.SAV and HSB2.SAV)
Note, had we indicated that missing data were present in the level-1 file, the following additional
prompts would have come to the screen:
Do you want to delete the missing data now, or at analysis time? Now
(Enter "now" or "analysis")
See Section 2.6 on how HLM2 handles missing data.
HLM2 -R CREATMDM.NEW
where CREATMDM.NEW is the edited, renamed input file. The interactive prompts will be quickly
sent to the screen and "answered" (by responses read from the file CREATMDM.NEW), and the
program will proceed automatically to recreate the MDM file.
A.1.2 Example: constructing an MDM file for the HS&B data using ASCII file input
C:\HLM> HLM2 (type the program name at the system prompt to start)
Will you be starting with raw data? Y
Is the input file a v-known file? N
Type? 1
format: (A2,8X,6F12.3)
Input name of level-2 file: HSB2.DAT
Note, had we indicated that missing data were present in the level-1 file, the following additional
prompts would have come to the screen:
Is the missing value the same for all variables? Y
HLM2 allows for the possibility of a different missing value code for each variable. An answer of "N"
causes HLM2 to ask the user to enter the missing value code for each level-1 variable.
Enter the number that represents missing data. -99.0
Do you want to delete the missing data now, or at analysis time? Now
(Enter "now" or "analysis")
Note: we have selected group-mean centering for the level-1 predictor, SES.
An answer of "Y" here specifies a level-1 model without an intercept or constant term (see Section
2.9.6).
An answer of "Y" here causes HLM2 to list the level-1 coefficients and ask the user whether the
corresponding random effect should be set to zero. An answer of "Y" to one of these probes is
equivalent to specifying that level-1 coefficient as a fixed (or non-randomly varying) effect.
For all of the level-2 predictors selected here, HLM2 will compute approximate "t-to-enter statistics"
that can be used to guide specification of subsequent HLM2 models. Note, the code "-1" tells HLM2 to
use for the SES slope model the same set of level-2 predictors as selected for the previous level-2
An answer of "Y" here causes HLM2 to ask which variables to be included in modeling sigma^2 (See
Section 2.9.3 for details) and the number of macro- and micro-iterations. See Table A1.5.1.
An answer of "Y" here causes HLM2 to ask the user which gammas are to be constrained (see Section
2.9.8 and Table A1.5.1).
Do you wish to test the specification for the variance-covariance components against an
alternative model?(Note: the same fixed effects must be specified in both models) Y
Enter the deviance statistic value 46512.978
Enter the number of variance-covariance parameters 4
OUTPUT SPECIFICATION
Do you want a residual file? N
An SPSS syntax file, RESFIL2.SPS, will be written out as HLM2 runs. See Sections 2.5.4.1 and
2.5.4.2 on the structure of the residual file and possible residual analyses.
While the program is running, HLM2 sends the value of the likelihood function computed for each
iteration to the screen. We have printed below just the first and last three. Because the change
between the 60th and 61st iterations was very small, the program automatically terminated before
the requested 100 iterations were computed. The sensitivity of this "automatic stopping value" can
be controlled by the user. See Section A1.5 for details.
Produced along with the output file is a file called "NEWCMD.HLM" which is a command file
constructed by HLM based on the interactive session just completed.
Should you wish to terminate the iterations prior to convergence, enter cntl-c
The value of the likelihood function at iteration 1 = -2.325291E+004
The value of the likelihood function at iteration 2 = -2.325274E+004
The value of the likelihood function at iteration 3 = -2.325266E+004
.
.
.
The value of the likelihood function at iteration 59 = -2.325186E+004
The value of the likelihood function at iteration 60 = -2.325186E+004
The value of the likelihood function at iteration 61 = -2.325186E+004
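The automatic stopping behavior can be illustrated with a small Python sketch. It assumes, purely for illustration, that iteration stops once the absolute change in the likelihood between successive iterations falls below a stopping value; the criterion HLM2 applies internally may be defined differently. The first three likelihood values are taken from the listing above, the fourth is hypothetical, and the function name is our own.

```python
def iterations_until_stop(likelihoods, stopval, max_iter):
    """Return the 1-based iteration at which iteration would stop.

    Assumed rule (an illustration, not HLM2's documented criterion): stop at the
    first iteration whose likelihood differs from the previous one by less than
    stopval, or at max_iter if that never happens.
    """
    limit = min(len(likelihoods), max_iter)
    for k in range(1, limit):
        if abs(likelihoods[k] - likelihoods[k - 1]) < stopval:
            return k + 1
    return limit


# Iterations 1 to 3 from the listing above, plus one hypothetical further value.
values = [-23252.91, -23252.74, -23252.66, -23252.66]
print(iterations_until_stop(values, stopval=0.01, max_iter=100))
```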
See Section 2.5.3 for an annotated example of the output for this model.
The file presented below is produced by HLM, along with the output file, for the Intercept and
Slopes-as-Outcomes Model for the HS&B data specified in Section A.2.1. The italicized comments
provide a brief description of each command function. A complete overview of each of the keywords
and related options in this command file appears in Section A.4.
#This command file was run with HSB.MDM Indicates which MDM was used.
NUMIT:100 Sets the maximum number of iterations.
STOPVAL:0.0000010000 Sets the criteria for automatically stopping the iterations.
NONLIN:N Switch to do a non-linear analysis.
LEVEL1:MATHACH=INTRCPT1+SES,1+RANDOM Specifies the level-1 model.
LEVEL2:INTRCPT1=INTRCPT2+SECTOR+MEANSES+RANDOM/SIZE,PRACAD,DISCLIM,HIMNTY
LEVEL2:SES=INTRCPT2+SECTOR+MEANSES+RANDOM/SIZE,PRACAD,DISCLIM,HIMNTY
Specifies the level-2 model and other level-2 predictors for
possible inclusion in subsequent models for both intrcpt1 and the ses slope.
LEVEL1WEIGHT: NONE Specifies level-1 weight variable.
LEVEL2WEIGHT: NONE Specifies level-2 weight variable.
RESFIL:N Controls whether a residual file is created.
HETEROL1VAR:N Specifies an analysis with a heterogeneous sigma2.
ACCEL:5 Controls frequency of use of accelerator.
LVR:N Specifies a latent variable regression model.
LEV1OLS:10 Controls the number of level-1 OLS regressions printed out.
MLF: N Specifies restricted maximum likelihood.
HYPOTH:N Disables some optional hypothesis testing procedures.
FIXTAU:3 Alternative options for generating starting values.
CONSTRAIN:N Estimates a model with constrained level-2 coefficients.
OUTPUT:HSB1.OUT File where HLM2 output will be saved.
FULLOUTPUT: Y Controls amount of output in output file.
TITLE: Intercept and Slopes-as-Outcome Model Title on page 1 of output.
A user can rename the file, with or without modification, using a plain text (ASCII) editor for
subsequent batch-mode application. For instance, he or she may request the program to print out all
the level-1 OLS regressions by changing LEV1OLS:10 to LEV1OLS:160 and renaming the file to
HSB2.MLM. The user can execute the analysis by typing:
at the system prompt. As the run is fully specified in the command file HSB2.MLM, no questions will
come to the screen during its execution. This is full batch mode. The user may choose a fully
interactive execution mode or an execution mode that is partly interactive and partly batch. With
partly interactive, partly batch mode, some specification occurs in the command file; the program
prompts the user with questions for the remaining program features. Some users may find this a
useful way to suppress some of the questions relating to less often used features of the programs.
Fully interactive mode is invoked when one of the programs is run without a second argument, i.e.,
HLM2 HSB.MDM
In this case, all of the possible questions will be asked, with the exception of the one relating to
the type of estimation used (mlf:y must be specified in the command file).
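Editing a command file for batch use, such as the LEV1OLS change described above, can also be scripted. The Python sketch below works on the text of a command file held in a string; the helper name set_option is our own, and the three-line sample is an abbreviated stand-in for a real command file.

```python
def set_option(command_text, keyword, option):
    """Return command_text with the first line for keyword set to option.

    Comment lines (starting with '#') are left alone. If the keyword is not
    present, a new KEYWORD:OPTION line is appended.
    """
    lines = command_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("#"):
            continue
        if line.split(":", 1)[0].strip().upper() == keyword.upper():
            lines[i] = f"{keyword}:{option}"
            break
    else:
        lines.append(f"{keyword}:{option}")
    return "\n".join(lines) + "\n"


# Abbreviated stand-in for a command file produced by an interactive session.
original = "NUMIT:100\nLEV1OLS:10\nOUTPUT:HSB1.OUT\n"
edited = set_option(original, "LEV1OLS", "160")
print(edited)
```

The edited text would then be saved under a new file name with any plain text tool before the batch run.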
A.4 Using HLM2 in batch mode
A command file consists of a series of lines. Each line begins with a keyword followed by a colon;
after the colon is the option chosen by the user, i.e.,
KEYWORD:OPTION
For example, HLM2 provides several optional hypothesis-testing procedures, described in detail in
Sections 2.9.2 to 2.9.4. Suppose the user does not wish to use these optional procedures in a
given analysis. Then the following line would be included in the command file:
HYPOTH:N
The keyword HYPOTH concerns the optional hypothesis testing procedures; the option chosen, 'N',
indicates that the user does not wish to employ these procedures. Alternatively, the user might
include the line:
HYPOTH:Y
This prompts HLM2 to activate the optional hypothesis testing menu during model specification in
the interactive mode. Lines beginning with a pound sign (#; also called a hash mark) are ignored and
may be used to put comments in the command file.
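The KEYWORD:OPTION layout is simple enough to read with a short script, which can be handy for checking a command file before a batch run. The following Python sketch is our own illustration, not part of HLM2: it splits each line at the first colon, skips comment lines, and (as an assumption) also skips blank lines. Pairs are kept in a list because a keyword such as LEVEL2 can appear more than once.

```python
def parse_command_file(text):
    """Parse KEYWORD:OPTION lines into a list of (keyword, option) pairs."""
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines (assumed harmless) and comments
        keyword, sep, option = line.partition(":")
        if not sep:
            raise ValueError(f"not a KEYWORD:OPTION line: {raw!r}")
        pairs.append((keyword.strip().upper(), option.strip()))
    return pairs


sample = """#This command file was run with HSB.MDM
NUMIT:100
STOPVAL:0.0000010000
LEVEL1:MATHACH=INTRCPT1+SES,1+RANDOM
MLF: N
HYPOTH:N
"""
pairs = parse_command_file(sample)
for keyword, option in pairs:
    print(keyword, "->", option)
```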
HLM2, by default, has set up the following options unless the user specifies an alternative command
file.
STOPVAL:0.0000010000 Sets convergence criterion to be 0.000001.
ACCEL:5 Use accelerator once after five iterations.
FIXTAU:3 Use the "standard" computer-generated values for the variances and covariances.
MLF:N Use the restricted maximum likelihood approach.
Table A.1 presents the list of keywords and options recognized by HLM2. Examples with detailed
explanation follow.
Table A.1 Keywords and options for the HLM2 command file
The program will prompt the user interactively to set the constraints. Alternatively,
constraints can be set in the command file. For example, suppose the following
coefficients were estimated: γ01, γ11, γ20, γ21, and we wish to specify γ20 = γ21. We add the
following command line: CONSTRAIN: 0,0,1,1.
For the coefficients γ00, γ01, γ02, γ10, γ11, γ12, the command line CONSTRAIN: 0,1,2,0,1,2
specifies γ01 = γ11 and γ02 = γ12.
Note that all coefficients sharing the value "0" are free to be estimated independently.
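The way a CONSTRAIN line pairs up coefficients can be made explicit with a short Python sketch. It reads the codes as described above: coefficients coded "0" are free, and coefficients sharing the same nonzero code are taken to be constrained equal. The coefficient labels (g01, g11, and so on) and the function name are hypothetical.

```python
from collections import defaultdict


def constraint_groups(gamma_names, codes):
    """Group coefficients that share a nonzero CONSTRAIN code.

    Coefficients coded 0 are free and are left out of the result.
    """
    if len(gamma_names) != len(codes):
        raise ValueError("need exactly one code per coefficient")
    groups = defaultdict(list)
    for name, code in zip(gamma_names, codes):
        if code != 0:
            groups[code].append(name)
    return dict(groups)


# First example: CONSTRAIN: 0,0,1,1
names = ["g01", "g11", "g20", "g21"]
codes = [int(c) for c in "0,0,1,1".split(",")]
first = constraint_groups(names, codes)
print(first)

# Second example: CONSTRAIN: 0,1,2,0,1,2
names2 = ["g00", "g01", "g02", "g10", "g11", "g12"]
codes2 = [int(c) for c in "0,1,2,0,1,2".split(",")]
second = constraint_groups(names2, codes2)
print(second)
```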
Each contrast is specified by its own line in the command file. The contrast associated with
the first hypothesis is specified with the keyword GAMMA1. For example, the contrast shown
in Fig 2.37 can be specified by adding the following lines:
GAMMA1:0.0,1.0,0.0,0.0,0.0,0.0
GAMMA1:0.0,0.0,0.0,0.0,1.0,0.0
For the second hypothesis, the keyword is GAMMA2 and for the third it is GAMMA3 (See Section
2.9.2 for further discussion and illustration.)
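To see what the GAMMA1 lines encode, the Python sketch below parses them into rows of weights and applies each row to a vector of fixed effects. It assumes that each line holds the weights of one linear combination of the gammas, in the order the gammas appear in the model. The gamma values are hypothetical and the function names are our own; the sketch does not carry out the hypothesis test itself.

```python
def parse_contrast(lines, keyword="GAMMA1"):
    """Collect the weight rows of all command lines that use the given keyword."""
    rows = []
    for line in lines:
        key, _, values = line.partition(":")
        if key.strip().upper() == keyword:
            rows.append([float(v) for v in values.split(",")])
    return rows


def apply_contrast(rows, gammas):
    """Return one linear combination of the gammas per weight row."""
    return [sum(c * g for c, g in zip(row, gammas)) for row in rows]


lines = [
    "GAMMA1:0.0,1.0,0.0,0.0,0.0,0.0",
    "GAMMA1:0.0,0.0,0.0,0.0,1.0,0.0",
]
gammas = [12.1, 1.2, 5.3, 2.9, -1.6, 1.0]  # hypothetical fixed-effect estimates
contrast_values = apply_contrast(parse_contrast(lines), gammas)
print(contrast_values)
```

With these weights, each row simply picks out one coefficient (the second and the fifth), so the two-row contrast concerns those two coefficients jointly.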
MLF: Controls maximum likelihood estimation method
    N                       No
    Y                       Yes, full maximum likelihood. Produces standard errors of T and σ²
RESFIL1: Create level-1 residual file
    N                       No
    Y/[vl1]/[vl2]           Yes. This may be followed by two '/'s denoting the two levels whose
                            variables can be in the residual file. By default, all the variables
                            in the model will be present in the residual file; this option can be
                            used to add further variables. vl1 and vl2 are comma-separated lists
                            of variables
    Y                       Yes
NONLIN: Selects a nonlinear analysis
    BERNOULLI
    POISSON
    BINOMIAL, COUNTVAR
    POISSON, COUNTVAR
    MULTINOMIAL, COUNTVAR
    ORDINAL, COUNTVAR       These options are explained in detail in Chapter 8.
MACROIT: Maximum number of macro iterations
    POSITIVE INTEGER        Used in non-linear models
MICROIT: Maximum number of micro iterations
    POSITIVE INTEGER        Used in non-linear models
STOPMACRO: Convergence criterion for change in parameters across macro iterations
    POSITIVE INTEGER
The following gives a description of the files containing critical statistics and their variances that are
provided by the program upon request.
Let r = number of random effects at level-1.
f = number of fixed effects
p = number of outcomes in a latent variable run
pm = number of alphas in a latent variable run
1. For HLM2:
• TAUVC.DAT contains tau in r columns of r rows and then the inverse of the information matrix (the
standard errors of tau are the square roots of the diagonals). The dimensions of this matrix are
[r(r + 1)/2] × [r(r + 1)/2].
• GAMVC.DAT contains the gammas and the gamma variance-covariance matrix. After the gammas,
there are f more rows of f entries containing the variance-covariance matrix.
• GAMVCR.DAT contains the gamma and the gamma variance-covariance matrix used to compute the
robust standard errors. After the gammas, there are f rows of f entries containing the
variance-covariance matrix.
2. For HGLM:
• TAUVC.DAT contains tau for the final unit-specific results in r columns of r rows and then the inverse
of the information matrix (the standard errors of tau are the square roots of the diagonals). The
dimensions of this matrix are [r(r + 1)/2] × [r(r + 1)/2].
• GAMVCUS.DAT contains the final unit-specific gammas and the gamma variance-covariance matrix.
The gammas are in the first line and this line has f entries. Then there are f more rows of f entries
containing the variance and covariance matrix.
• GAMVCPA.DAT contains the final population-average gammas and the gamma variance-covariance matrix.
The gammas are in the first line and this line has f entries. Then there are f more rows of f entries
containing the variance and covariance matrix.
• GAMVCPAR.DAT contains the final population-average gammas and the gamma variance-covariance matrix
used to compute the population-averaged robust standard errors. The gammas are in the first line and
this line has f entries. Then there are f more rows of f entries containing the variance and covariance
matrix.
All of the above files are created with an n(F15.7,1X) format. That is, each entry is fifteen characters
wide with seven decimal places, followed by a space (blank character).
If the value of r or r ∗ (r + 1) / 2 exceeds 60, the line is split into two or more pieces.
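A file with the layout described for GAMVC.DAT (the gammas first, then f rows of f entries) can be read back with a few lines of Python. The sketch below is a hedged illustration: it relies on the blank that follows every F15.7 entry to split the numbers on whitespace, which also copes with rows split across several lines, and it infers f from the total count of numbers. The function names and the sample values are hypothetical.

```python
import math


def parse_gamvc(text):
    """Split the numbers of a GAMVC-style file into gammas and an f x f matrix."""
    values = [float(tok) for tok in text.split()]
    n = len(values)
    f = (math.isqrt(1 + 4 * n) - 1) // 2  # solves f + f*f = n
    if f * (f + 1) != n:
        raise ValueError(f"{n} values cannot form f gammas plus an f x f matrix")
    gammas = values[:f]
    matrix = [values[f + i * f: f + (i + 1) * f] for i in range(f)]
    return gammas, matrix


def format_row(row):
    """Write one row in the (F15.7,1X) layout: 15 characters, 7 decimals, one blank."""
    return "".join(f"{x:15.7f} " for x in row)


# Hypothetical file contents for f = 2 fixed effects.
sample = "\n".join([
    format_row([12.0, 2.5]),    # gammas
    format_row([0.04, 0.01]),   # variance-covariance matrix, row 1
    format_row([0.01, 0.09]),   # variance-covariance matrix, row 2
])
gammas, vc = parse_gamvc(sample)
# Standard errors are the square roots of the diagonal of the matrix.
standard_errors = [math.sqrt(vc[i][i]) for i in range(len(gammas))]
print(gammas, standard_errors)
```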
The first option in the basic HLM2 menu "Examine means, variances, chi-squared, etc.?" provides a
variety of statistics useful as we begin to formulate HLM problems. The use of this option is available
only in interactive mode and a description of the output appears below.
HLM2 HSB.MDM
SPECIFYING A LEVEL-1 OUTCOME VARIABLE
Please specify a level-1 outcome variable
ANOVA
                  mean           estimate of
potential         univariate     variance in
level-1           regression     regression
predictors        coefficient    coefficient    reliability    chi-squared      j
MEANS               12.74785       8.76642        0.90192       1618.70998    160
MINORITY            -2.72109       5.60227        0.30517        228.41290    136
FEMALE              -0.94302       0.27533        0.05944        143.41090    123
SES                  2.10355       0.46634        0.17531        212.38315    160
These data provide our first information about which level-1 coefficients might be specified as
random. Though very preliminary results, they suggest that both MINORITY and SES coefficients
might be specified as random. Note, these are just univariate regression coefficients and are not
adjusted for any other level-1 effects as they would be in a full level-1 model.
Hit return to continue < HRt >
Next, we select the level-2 predictors that might be used to model the means and univariate
regression slopes.
Level-2 Level-1 univariate coefficients
Predictors MEANS MINORITY FEMALE SES
SIZE -0.0982 -0.1862 -0.2591 0.2008
SECTOR 0.4492 0.3680 0.1199 -0.3977
PRACAD 0.6821 0.2152 0.0612 -0.2017
DISCLIM -0.4678 -0.4343 -0.1038 0.3355
HIMINTY -0.3752 0.0790 -0.0004 -0.1964
MEANSES 0.7847 0.0626 0.0576 0.0496
B Using HLM3 in Interactive and Batch Mode
This appendix describes and illustrates how to use HLM3 in interactive and batch mode to construct
MDM files, and to execute analyses based on the MDM file. It also lists and defines command
keywords and options unique to HLM3. References are made to the appropriate sections in the manual
where the procedures are described in greater detail.
As in the case of HLM2, formulation, estimation, and testing of models using HLM3 can be achieved
in several ways: Windows mode (PC users only), interactive mode, or batch mode. Interactive
execution guides the user through the steps of the analysis by posing questions and providing a menu
of options. However, batch mode can be considerably faster once the user becomes skilled in working
with the program. In between the two extremes (fully interactive and fully batch) is a range of
execution modes that are partly interactive and partly batch. The degree to which the execution is
automated (via batch mode) is controlled by the command file, as in the case of HLM2.
What variable
Note: there are two linking ID's in the level-1 data file.
Please specify level-1 variable # 3 (enter 0 to end): 5
Please specify level-1 variable # 4 (enter 0 to end): 6
Please specify level-3 variable # 1 (enter 0 to end): 2
Please specify level-3 variable # 2 (enter 0 to end): 3
Please specify level-3 variable # 3 (enter 0 to end): 4
Note: had we indicated that missing data were present in the level-1 file, the following additional
prompts would have come to the screen:
Do you want to delete the missing data now, or at analysis time? Now
(Enter "now" or "analysis")
See Section 2.6 on how HLM2 handles missing data.
After the MDM file is computed, descriptive statistics for each file are sent to the screen. It is
important to examine these carefully to guarantee that no errors were made in specifying the
format of the data. HLM3 will save these statistics in a file named HLM3MDM.STS. These results are
helpful as a reference and when constructing a descriptive table about the data for a written report.
B.1.2 Example: constructing an MDM file for the HS&B data using ASCII file input
C:\HLM> HLM3
Will you be starting with raw data? Enter type of raw data:
for ASCII input enter 1
for SYSTAT .SYS file enter 2
for SAS V5 transport file enter 3
for SPSS file (UNIX or windows) enter 4
for anything DBMSCOPY reads enter 5
Type? 1
HLM3 automatically creates a file named CREATMDM.MDMT that lists the stream of responses typed
by the user while creating the MDM file. The CREATMDM.MDMT file has several uses. It can help the
user discover errors in the format or variable name specification. Once these are identified,
CREATMDM.MDMT can be copied, for example, to NEWSS.MDMT, and then edited. Alternatively, if
the user wishes to delete or add variables, the copy of CREATMDM.MDMT can be edited. To
reconstruct the MDM file using this new set of commands, simply type:
HLM3 -R NEWSS.MDMT
As in HLM2, the first argument, HLM3, tells the computer to execute the three-level HLM program; the
second argument specifies the MDM file to be analyzed. An optional third argument specifies a
command file that can be used to automate aspects of model specification via batch-mode.
Please specify a level-1 outcome variable
The choices are:
For YEAR enter 1 For GRADE enter 2 For MATH enter 3
For RETAINED enter 4
Do you want to set the level-2 intercept to zero for YEAR, P1? N
effect to zero? N
If you answer "Y" here, HLM3 will allow you to fix one or more level-2 variances (and associated
covariances) to zero. Through this process the corresponding level-2 outcome is specified as fixed
(no predictors) or non-randomly varying (some predictors included). Notice that the model above
contains no level-2 predictors. Had level-2 predictors been included, the user would have been
prompted about possible centering options. The choices are: centering around the group mean, X̄.k,
centering around the grand mean, X̄.., or no centering.
As in HLM2, HLM3 will interpret the response of "-1" to repeat the selections made for the previous
prompt, i.e., 1, 2, 3.
Select the level-3 predictor variables that you might consider for
inclusion as predictors in subsequent models.
The choices are:
For SIZE enter 1 For LOWINC enter 2 For MOBILITY enter 3
The options available here are a multivariate hypothesis test for the fixed effects and Likelihood
Ratio Test for comparison of nested models.
OUTPUT SPECIFICATION
KEYWORD:OPTION
As with HLM2, a pound sign ("#") as the first character of a line can be used to introduce a comment
into the command file.
The following keywords are available only for HLM2:
Table B.1 Keywords and options unique to the HLM3 command file (continued)
Keyword      Function                        Option  Definition
LVR-BETA     Performs a latent variable      N       No
             regression                      P,O     P for predictor(s); O for outcome(s).
                                                     See Section 11.1 for details.
DOFISHER     Turns on/off Fisher estimation  Y       Use Fisher
                                             N       Do not use Fisher
FISHERTYPE   Controls type of Fisher         0       Same as DOFISHER:N
             acceleration                    1       Use 1st derivative Fisher
                                             2       Use 2nd derivative Fisher (default)
                                                     See Section 4.5.
PRINTVARIANCE-COVARIANCE:Y
to the command file will request HLM3 to print out statistics for both tau(pi) as well as tau(beta).
TAUVC.DAT contains tau (tau(pi)) in r columns of r rows, the next $r_2$ lines are the tau(beta), and then
the inverse of the information matrix (the standard errors of tau[s] are the square roots of the
diagonals). The dimensions of this matrix are
$\left( r(r+1)/2 + r_2(r_2+1)/2 \right) \times \left( r(r+1)/2 + r_2(r_2+1)/2 \right)$.
TAUVC.DAT has the same format as the one for HLM3. The tau(s) are the final unit-specific results.
The files for the gammas have the identical structure as those for two-level models (see Section A.5)
All files are created with an n(F15.7,1X) format. That is, each entry is fifteen characters wide with
seven decimal places, followed by a space (blank character).
If the value of r, f, or $r(r+1)/2 + r_2(r_2+1)/2$ exceeds 60, the line is split into two or more
pieces.
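As a rough illustration of the n(F15.7,1X) layout described above, the following Python sketch slices one line into 16-character slots (15 characters of number plus one blank) and computes the dimension of the information matrix from r and r_2. The function names and the sample values are hypothetical and are not part of HLM; the sketch also assumes a line has not been split into several pieces.

```python
def parse_f15_7_line(line):
    """Split a line written as n(F15.7,1X) into a list of floats.

    Each entry occupies 15 characters followed by one blank,
    so a new field starts every 16 characters.
    """
    line = line.rstrip("\n")
    values = []
    for start in range(0, len(line), 16):
        field = line[start:start + 15]
        if field.strip():
            values.append(float(field))
    return values


def tau_info_dimension(r, r2):
    """Number of distinct elements in tau(pi) and tau(beta) combined."""
    return r * (r + 1) // 2 + r2 * (r2 + 1) // 2


# Hypothetical sample line with three entries in F15.7,1X layout.
sample = "".join(f"{v:15.7f} " for v in (0.64, -0.0468, 0.0113))
print(parse_f15_7_line(sample))
print(tau_info_dimension(2, 2))
```

Slicing by position rather than splitting on whitespace keeps the parser faithful to the fixed-width description, which matters if a field ever fills all fifteen characters.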
C Using HLM4 in Batch Mode
Unlike the older modules (HLM2, HLM3, etc.), HLM4 does not have interactive modes to create the
MDM or specify a model. If the Windows interface is not available, these files must be created with an
ASCII editor and then submitted to obtain results.
The file is broken into two sections. The first is to declare the filenames of the raw data and other
characteristics of the MDM file to be made, the second chooses the variables to be included at the
various levels. Below is the first part with explanation in parentheses:
The second part of the mdmt file specifies which variables are ID variables, and which ones go into
the mdm file as possible analysis variables. The structure looks like this:
*begin l1vars
level4id:SCHID
level3id:TCHRID
level2id:OCCASID
[list of level-1 variables, one per line]
*end l1vars
*begin l2vars
level4id:SCHID
level3id:TCHRID
level2id:OCCASID
[list of level-2 variables, one per line]
*end l2vars
*begin l3vars
level4id:SCHID
level3id:TCHRID
[list of level-3 variables, one per line]
*end l3vars
*begin l4vars
level4id:SCHID
[list of level-4 variables, one per line]
*end l4vars
The IDs must be specified in the order shown, and must all be of the same type, either numeric
(preferable) or alphanumeric(not advised).
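The block structure above is regular enough to generate. The Python sketch below builds the variable-declaration part of a four-level MDMT file from a set of ID names and variable lists. The helper name and the placement of the example variables at particular levels are assumptions made for illustration; only the `*begin`/`*end` markers and the `levelNid:` lines follow the structure shown above.

```python
def mdmt_variable_sections(level_ids, level_vars):
    """Build the variable-declaration part of a four-level MDMT file.

    level_ids  -- dict mapping 2, 3 and 4 to the ID variable names
    level_vars -- dict mapping 1..4 to lists of analysis variable names
    """
    lines = []
    for level in (1, 2, 3, 4):
        lines.append(f"*begin l{level}vars")
        # IDs are listed from the highest level down. The level-1 and
        # level-2 sections both carry the level-4, level-3 and level-2 IDs.
        lowest_id = max(level, 2)
        for id_level in range(4, lowest_id - 1, -1):
            lines.append(f"level{id_level}id:{level_ids[id_level]}")
        lines.extend(level_vars.get(level, []))
        lines.append(f"*end l{level}vars")
    return "\n".join(lines)


# Illustrative call; which variable lives at which level is assumed here.
ids = {4: "SCHID", 3: "TCHRID", 2: "OCCASID"}
variables = {1: ["EXPERTIS", "STDERR"], 2: ["OCCASION", "ARTIFACT"],
             3: [], 4: ["CHGCOACH"]}
print(mdmt_variable_sections(ids, variables))
```

Using one ID name per level throughout also guards against a section accidentally naming an ID differently from the others.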
Once the mdmt file is created, the file must be submitted to HLM4:
C:\HLM> HLM4 -r literacy.mdmt
The results on the screen should then be examined to make sure the data were read correctly. These
descriptive statistics will also be contained in a file named HLM4MDM.STS.
C.2 Example: Creating an HLM file and running the model
The next step is to create a file that specifies the desired model. (This is usually suffixed with a .hlm)
For example, we will use the model shown in section 6.2.
nonlin:n
numit:100
stopval:0.0000010000
level1:EXPERTIS=STDERR+RANDOM
level2:STDERR=INTRCPT2+OCCASION+ARTIFACT+random
level3:INTRCPT2=INTRCPT3+random
level4:INTRCPT3=INTRCPT4+CHGCOACH+random
level3:OCCASION=INTRCPT3+random
level4:INTRCPT3=INTRCPT4+CHGCOACH+random
level3:ARTIFACT=INTRCPT3
level4:INTRCPT3=INTRCPT4+CHGCOACH+random
fixsigma2:1.000000
fixtaupi:3
fixtaubeta:3
fixtaugamma:3
accel:5
level1weight:none
level2weight:none
level3weight:none
level4weight:none
hypoth:n
resfil1:n
resfil2:n
resfil3:n
resfil4:n
title:Unconditional model for literacy program
output: literacy1.txt
fulloutput:y
The above is very similar to an HLM3 model file, with the exception of the model specification at the
top where an extra level is shown. Here is the model part that better demonstrates the nested nature
of the model specification (the shown indentation will not run):
level1:EXPERTIS=STDERR+RANDOM
  level2:STDERR=INTRCPT2+OCCASION+ARTIFACT+random
    level3:INTRCPT2=INTRCPT3+random
      level4:INTRCPT3=INTRCPT4+CHGCOACH+random
    level3:OCCASION=INTRCPT3+random
      level4:INTRCPT3=INTRCPT4+CHGCOACH+random
    level3:ARTIFACT=INTRCPT3
      level4:INTRCPT3=INTRCPT4+CHGCOACH+random
The basic rule here is that for each level-1 variable in the model, there needs to be a level2: line, for
each level-2 variable, a level3: line, and for each level-3 variable, a level4: line. The order is not
arbitrary and must follow the pattern above.
Assuming that the above file is named literacy1.hlm, then the following command should be run:
Given the HLM file above the output would then be in literacy1.txt. Note that if html output is desired,
a .html suffix should be specified on the output: line rather than .txt.
Table C.1 presents the list of keywords and options unique to HLM4 relative to HLM3.
Table C.1 Keywords and options unique to the HLM4 command file
Keyword  Function        Option       Definition
LEVEL1   Level-1 model   INTRCPT1     Level-1 intercept
         specification   +VARNAME     Level-1 predictor (no centering)
                         +VARNAME,2   Level-1 predictor centered around level-2 mean
                         +VARNAME,3   Level-1 predictor centered around level-3 mean
                         +VARNAME,4   Level-1 predictor centered around level-4 mean
                         +VARNAME,G   Level-1 predictor centered around grand mean a...
                         (Note: variable names may be specified in either upper or lower case.)
LEVEL2   Level-2 model   INTRCPT2     Level-2 intercept
         specification   +VARNAME     Level-2 predictor (no centering)
                         +VARNAME,3   Level-2 predictor centered around level-3 mean
                         +VARNAME,4   Level-2 predictor centered around level-4 mean
                         +VARNAME,G   Level-2 predictor centered around grand mean
The command file structure for HLM3 closely parallels that of HLM2. Each line begins with a
keyword followed by a colon. After the colon is the option chosen by the user, i.e.,
KEYWORD:OPTION
As with HLM2, a pound sign ("#") as the first character of a line can be used to introduce a comment
into the command file.
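The KEYWORD:OPTION layout is simple to read programmatically. The following Python sketch, which is not part of HLM, collects the keyword and option from each line and skips lines that start with "#". It returns a list rather than a dictionary because keywords such as LEVEL2 can occur more than once in a command file; the function name and sample text are hypothetical.

```python
def parse_command_file(text):
    """Return a list of (KEYWORD, option) pairs from command-file text."""
    pairs = []
    for raw in text.splitlines():
        if not raw.strip() or raw.startswith("#"):
            continue  # skip blank lines and comment lines
        # Split on the first colon only, so options such as
        # C:\HLM\ATTAIN2.TXT keep their own colon.
        keyword, sep, option = raw.partition(":")
        if not sep:
            raise ValueError(f"missing colon in line: {raw!r}")
        pairs.append((keyword.strip().upper(), option.strip()))
    return pairs


sample = """#This command file was run with thaiugr.mdm
MACROIT:25
LEVEL2:MALE=INTRCPT2/
LEVEL2:PPED=INTRCPT2/
OUTPUT:C:\\HLM\\ATTAIN2.TXT
"""
for keyword, option in parse_command_file(sample):
    print(keyword, "->", option)
```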
The following keywords have the same definitions and options in HLM3 as in HLM2 (Table A.1)
ACCEL CONSTRAIN DEVIANCE DF MACROIT MICROIT NONLIN NUMIT
OUTPUT RESFIL1 RESFIL1NAME RESFIL2 RESFIL2NAME RESFILTYPE FIXSIGMA2
STOPMACRO STOPMICRO STOPVAL TITLE LEVEL1DELETION FULLOUTPUT
D Using HGLM in Interactive and Batch Mode
This appendix describes and illustrates how to use HGLM in interactive and batch mode to execute
analyses based on the MDM files. References are made to appropriate sections in the manual where
the procedures are described in greater detail.
1) Bernoulli (0 or 1)
2) Binomial (count)
3) Poisson (constant exposure)
4) Poisson (variable exposure)
5) Multinomial
6) Ordinal
type of analysis: 1
As mentioned, with one binary outcome per level-1 unit, the model choice is "1" (Bernoulli).
If "2"(Binomial) is chosen, the user will be asked:
For the non-linear analysis, which variable indicates the number of trials?
If "4"(Poisson (variable exposure)) is chosen, the user will be asked:
Specifying 25 macro iterations sets an upper limit: if the algorithm has not converged after the 25th
iteration, the program will nonetheless terminate and print the results at that iteration. Similarly,
setting 20 as the number of micro iterations ensures that, after 20 micro iterations, the current
macro iteration will terminate even if the micro iteration convergence criterion has not been met.
An answer of "Y" here allows a user to estimate a level-1 dispersion parameter $\sigma^2$. If the
assumption of no dispersion holds, $\sigma^2 = 1.0$. If the data are over-dispersed, $\sigma^2 > 1.0$; if the data
are under-dispersed, $\sigma^2 < 1.0$.
An answer of "Y" here allows the user to obtain a highly accurate Laplace approximation to maximum
likelihood. See Sections 7.6.3 and 8.9.2. The user will be prompted to enter the maximum number of
Laplace macro iterations.
Thus, we have set up a level-1 model with repetition (REP1) as the outcome and with gender (MALE)
Level-2 predictor? (Enter 0 to end) 1
Which level-2 predictors to model MALE slope?
Level-2 predictor? (Enter 0 to end) 0
Which level-2 predictors to model PPED slope?
Level-2 predictor? (Enter 0 to end) 0
Thus we have modeled the level-1 intercept as depending on the mean SES (MSESC) of the school.
The coefficients associated with gender and pre-primary experience are fixed. Mean SES has been
centered around its grand mean.
Do you want to constrain the variances in any of the level-2 random
effects to zero? Y
Do you want to fix INTRCPT1? N
Do you want to fix MALE? Y
Do you want to fix PPED? Y
OUTPUT SPECIFICATION
Do you want a level-1 residual file? Y
Enter additional variables to go in residual file
The choices are:
Level-1 variable? (Enter 0 to end) 1
For MSESC enter 1
Do you want to see OLS estimates for all of the level-2 units? N
Enter a problem title: Bernoulli output, Thailand data
Enter name of output file: THAIBERN.OUT
MACRO ITERATION 1
Macro iteration number 1 has converged after six micro iterations. This macro iteration actually
computes the linear-model estimates (using the identity link function as if the level-1 errors were
assumed normal).
These results are then transformed and input to start macro iteration 2, which is, in fact, the first
non-linear iteration.
MACRO ITERATION 2
Starting values computed. Iterations begun.
Should you wish to terminate the iterations prior to convergence, enter cntl-c
The value of the likelihood function at iteration 1 = -1.067218E+004
The value of the likelihood function at iteration 2 = -1.013726E+004
The value of the likelihood function at iteration 3 = -1.011008E+004
The value of the likelihood function at iteration 4 = -1.010428E+004
The value of the likelihood function at iteration 5 = -1.010265E+004
The value of the likelihood function at iteration 6 = -1.010193E+004
The value of the likelihood function at iteration 7 = -1.010188E+004
The value of the likelihood function at iteration 8 = -1.010188E+004
The value of the likelihood function at iteration 9 = -1.010187E+004
The value of the likelihood function at iteration 10 = -1.010187E+004
The value of the likelihood function at iteration 11 = -1.010187E+004
The value of the likelihood function at iteration 12 = -1.010187E+004
Macro iteration 2, the first non-linear macro iteration, converged after twelve micro iterations.
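When the screen output is captured to a text file, the likelihood lines printed during the micro iterations can be inspected programmatically. The Python sketch below extracts the iteration numbers and likelihood values and reports the first iteration whose relative change falls below a tolerance. The relative-change rule and the default tolerance are assumptions chosen for illustration; they are not a statement of the convergence criterion that HGLM itself applies.

```python
import re

# Matches lines such as:
# "The value of the likelihood function at iteration 3 = -1.011008E+004"
LIKELIHOOD_LINE = re.compile(
    r"likelihood function at iteration\s+(\d+)\s*=\s*([-+0-9.Ee]+)"
)


def likelihood_trace(screen_text):
    """Return [(iteration, likelihood), ...] in the order printed."""
    return [(int(m.group(1)), float(m.group(2)))
            for m in LIKELIHOOD_LINE.finditer(screen_text)]


def first_stable_iteration(trace, tol=1e-6):
    """First iteration whose relative change from the previous one is <= tol."""
    for (_, previous), (iteration, current) in zip(trace, trace[1:]):
        if abs(current - previous) <= tol * abs(previous):
            return iteration
    return None


sample = """The value of the likelihood function at iteration 1 = -1.067218E+004
The value of the likelihood function at iteration 2 = -1.013726E+004
The value of the likelihood function at iteration 3 = -1.013726E+004
"""
trace = likelihood_trace(sample)
print(trace)
print(first_stable_iteration(trace))
```

Because the screen values are rounded to seven significant digits, a check of this kind can only be as fine as the printed precision.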
MACRO ITERATION 3
MACRO ITERATION 4
MACRO ITERATION 7
Starting values computed. Iterations begun.
Should you wish to terminate the iterations prior to convergence, enter cntl-c
The value of the likelihood function at iteration 1 = -1.000375E+004
The value of the likelihood function at iteration 2 = -1.000375E+004
Note that macro iteration 7 converged with just 2 micro iterations. Also, the change in parameter
estimates between macro iterations 6 and 7 was found negligible (less than the criterion for
convergence) so that macro iteration 8 was the final "unit-specific" macro iteration. One final
"population average" iteration is computed, and screen output for that is given below.
MACRO ITERATION 8
Thus concludes the interactive terminal session. See Section 8.2 for an annotated output for this run.
The interactive session annotated above produced the following command file (NEWCMD.HLM).
#This command file was run with thaiugr.mdm
STOPMICRO:0.0000010000
STOPMACRO:0.0001000000
MACROIT:25
MICROIT:20
NONLIN:BERNOULLI
LAPLACE:n,50
LAPLACE8:n,50
LEVEL1:REP1=INTRCPT1+MALE+PPED+RANDOM
LEVEL2:INTRCPT1=INTRCPT2+MSESC,2+RANDOM/
LEVEL2:MALE=INTRCPT2/
LEVEL2:PPED=INTRCPT2/
LEVEL1WEIGHT:NONE
LEVEL2WEIGHT:NONE
RESFILTYPE:SPSS
RESFIL1:Y/MALE,PPED,REP1/MSESC
RESFIL1NAME:resfil1.sav
RESFIL2:Y/MSESC
RESFIL2NAME:resfil2.sav
HETEROL1VAR:n
ACCEL:5
LVR:P
LEV1OLS:10
MLF:y
HYPOTH:n
FIXSIGMA2:1.000000
FIXTAU:3
CONSTRAIN:N
OUTPUT:nnn
FULLOUTPUT:Y
TITLE:Bernoulli output, Thailand data
the output above would be reproduced. It is a good idea to rename the NEWCMD.HLM file if it is to be
edited and re-used. Each execution of the program will produce a NEWCMD.HLM file that will
overwrite the old one.
Note that the "NEWCMD.HLM" file above is similar to the same file produced by a linear-model
analysis, with the addition of the following lines:
See Tables A.1 and B.1 for a description of the keywords and options.
E Using HMLM in Interactive and Batch Mode
This appendix describes prompts and commands for creating MDM files and executing analyses
based on the MDM files. References are made to appropriate sections in the manual where the
procedures are described in greater detail. To start HMLM or HMLM2, type HMLM or HMLM2 at the
system prompt.
1) Unrestricted
2) Random effects model with homogeneous level-1 variance
3) Random effects model with heterogeneous level-1 variance
4) Random effects model with log-linear model for level-1 variance
5) Random effects model with first-order autoregressive level-1 variance
type of analysis: 3
For choices 2 to 5, the user will be prompted.
If "4" (log-linear model for level-1 variance) is chosen, HMLM will ask the user to enter variables to
model $\sigma^2$, for example:
Should VAR1 be in C?
An interactive session will output a command file NEWCMD.MLM. An example for one of the
analyses discussed in Section 10.2 is given below.
The following keywords have the same definitions and options in HMLM as in HLM2 (Table A.1)
ACCEL DEVIANCE DF FIXTAU
LEVEL1 LEVEL2 GAMMA# HYPOTH
NUMIT OUTPUT PRINTVARIANCE-COVARIANCE
STOPVAL TITLE FULLOUTPUT LVR
E.2.1 Table of keywords and options
Table E.1 Keywords and options unique to the HMLM command file
The following keywords have the same definitions and options in HMLM2 as in HLM3 (Table B.1)
ACCEL DEVIANCE DF FIXTAU2 FIXTAU3
LEVEL1 LEVEL2 LEVEL3 GAMMA# HYPOTH
NUMIT OUTPUT PRINTVARIANCE-COVARIANCE
STOPVAL TITLE FULLOUTPUT LVR
Table E.1 Keywords and options unique to the HMLM2 command file
Note that HMLM and HMLM2 do not allow non-linear outcomes, use of plausible values and multiply-
imputed values, constraints of gammas, and they do not write out any residual files.
F Using Special Features in Interactive and Batch Mode
This appendix describes and illustrates how to use the special features in interactive and batch mode
to execute analyses. References are made to appropriate sections in the manual where the procedures
are described in greater detail.
F.1 Example: Latent variable analysis using the National Youth Study data sets
The following interactive session illustrates a latent variable analysis example using the National
Youth Study (NYS) data sets. A description of the data files and the model specification can be found
in Sections 10.1.1 and 11.1.1.
type of analysis: 2
We select the homogeneous level-1 variance option for this model. Thus, using HLM2 will yield
identical results in this case.
INTRCPT1, the level of tolerance at age 11, is used as a predictor to model the outcome, AGE11, the
linear growth rate. Note that INTRCPT1 and AGE11 are latent variables, that is, they are free of
measurement error.
Do you want to specify a multivariate hypothesis for the fixed effects? N
OUTPUT SPECIFICATION
How many iterations do you want to do? 50
Enter a problem title: Latent variable regression, NYS Data
Enter name of output file: NYS2.OUT
1) Unrestricted
2) Random effects model with homogeneous level-1 variance
3) Random effects model with heterogeneous level-1 variance
4) Random effects model with log-linear model for level-1 variance
5) Random effects model with first-order autoregressive level-1 variance
type of analysis: 1
ADDITIONAL PROGRAM FEATURES
To analyze data with multiply-imputed values for the outcome and/or covariates, the user needs to
prepare multiple MDM files. After setting up the multiple MDM files, the user has to submit the
command file to HLM2 or HLM3 as many times as there are MDM files, each time with an extra
flag, -MI#, where # is the sequence number, starting from 0. The last run also needs the -E
flag (E for estimate).
Suppose there are 4 sets of multiply-imputed data for a two-level model, called MDATA1.MDM,
MDATA2.MDM, MDATA3.MDM, and MDATA4.MDM and the command file is ANALYSE.MLM; the
following commands need to be typed in at the system prompt:
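The numbering rule for the -MI# flag is easy to get wrong by one, so a small Python sketch may help. It builds one command line per MDM file, numbering the runs from 0 and adding -E to the last one. The -MI# and -E flags come from the description above; the order of the arguments on the command line (flags, then MDM file, then command file) and the helper name are assumptions made for illustration.

```python
def mi_commands(program, mdm_files, command_file):
    """Build one command line per multiply-imputed MDM file.

    Runs are numbered from 0 with -MI#, and the last run also gets -E.
    The argument order used here is an assumption, not taken from the manual.
    """
    commands = []
    last = len(mdm_files) - 1
    for seq, mdm in enumerate(mdm_files):
        flags = [f"-MI{seq}"]
        if seq == last:
            flags.append("-E")
        commands.append(" ".join([program, *flags, mdm, command_file]))
    return commands


files = ["MDATA1.MDM", "MDATA2.MDM", "MDATA3.MDM", "MDATA4.MDM"]
for line in mi_commands("HLM2", files, "ANALYSE.MLM"):
    print(line)
```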
dependence:
dospatialcorrelation:y
G Using HCM2 in Interactive and Batch Mode
This appendix describes and illustrates how to use HCM2 in interactive mode to construct MDM files, and in
both interactive and batch mode to execute analyses based on the MDM file. It also lists and defines
command keywords and options unique to HCM2. References are made to appropriate sections in the
manual where the procedures are described in greater detail. In the next section, we show the
construction of an MDM file using the educational attainment data as described in Chapter 13.
Note that there are two linking IDs in the level-1 or within-cell file.
Please specify level-1 variable # 5 (enter 0 to end): 7
Please specify level-1 variable # 6 (enter 0 to end): 8
Please specify level-1 variable # 7 (enter 0 to end): 9
Please specify level-1 variable # 8 (enter 0 to end): 10
HCM2 will send the descriptive statistics of variables for each file to the screen. It is important to
examine these carefully to ensure that no errors were made. The program will save these statistics
in a file named HCM2MDM.STS.
LEVEL-1 DESCRIPTIVE STATISTICS
G.1.2 Example: Executing an unconditional model analysis using ATTAIN.MDM
C:\HLM> HCM2 ATTAIN.MDM
We shall model educational attainment with an unconditional model and specify the residual row,
column, and cell-specific effects as random. See Section 13.2.
SPECIFYING AN HCM2 MODEL
Enter type of deflection:
for independent(default) enter 1
for cumulative enter 2
Type? 1
OUTPUT SPECIFICATION
See Section 13.2 for a discussion of the results of this unconditional model.
G.1.3 Example: Executing a conditional model analysis using ATTAIN.MDM
C:\HLM> HCM2 ATTAIN.MDM
We shall model educational attainment with all the level-1 predictor variables. All the level-1
coefficients associated with the predictors are fixed. See Section 13.3.
SPECIFYING AN HCM2 MODEL
y
Do you want to center any level-1 predictors?
Enter 0 for no centering, 2 for grand-mean
How do you want to center P7VRQ? 2
How do you want to center P7READ? 2
How do you want to center DADOCC? 2
How do you want to center DADUNEMP? 2
How do you want to center DADED? 2
How do you want to center MOMED? 2
How do you want to center MALE? 2
The choices are:
We shall treat the association between social deprivation and educational attainment as fixed across
all schools. See Section 13.2.
OUTPUT SPECIFICATION
the result will be the output for the unconditional model. Note that each execution of the program
will produce a NEWCMD.HLM file that will overwrite the old one.
MALE,2+RANDOM
ROWCOL:INTRCPT1=THETA+DEPRIVE(FIXED),2+RANDOMB+RANDOMC
ROWCOL:P7VRQ=THETA
ROWCOL:P7READ=THETA
ROWCOL:DADOCC=THETA
ROWCOL:DADUNEMP=THETA
ROWCOL:DADED=THETA
ROWCOL:MOMED=THETA
ROWCOL:MALE=THETA
FIXTAU:3
FIXDELTA:3
ACCEL:5
DEFLECTION:INDEPENDENT
TITLE:CONDITIONAL MODEL, WITH SOCIAL DEPRIVATION EFFECT FIXED
OUTPUT:C:\HLM\ATTAIN2.TXT
FULLOUTPUT:N
The following keywords in the above command files have the same definition and options in HCM2
as in the other modules (e.g. Tables A.1 and B.1)
ACCEL FULLOUTPUT FIXTAU NONLIN NUMIT OUTPUT STOPVAL TITLE
FIXSIGMA2 STOPMICRO STOPMACRO DEVIANCE DF GAMMA
Had we requested level-1, row, and column residual files during the interactive session, the
command files would contain the following additional command lines specifying the type (SPSS
system file) and the names for each of the files (RESFIL1.SAV, RESROW.SAV, and RESCOL.SAV):
RESFILTYPE:SPSS
RESFIL1NAME:RESFIL1.SAV
RESFIL1:Y
RESROWNAME:RESROW.SAV
RESROW:Y
RESCOLNAME:RESCOL.SAV
RESCOL:Y
Table G.1 Keywords and options unique for HCM2 command file
Keyword Function Option Definition
H Using HCM3 in Batch Mode
Unlike the older modules (HLM2, HLM3, etc.), HCM3 does not have interactive modes to create the
MDM or specify a model. If the Windows interface is not available, these files must be created with an
ASCII editor and then submitted to obtain results.
rawdattype:spss        (This declares the type of input data. Possible values are
                       spss, sas (version 5 transport file), stata, and ascii.)
l1fname:growth.sav     (The next four lines declare the names and locations of
                       the four input files: level-1, row, column, and cluster,
                       respectively.)
rowfname:student.sav
colfname:teacher.sav
clusfname:school.sav
l1missing:n            (This declares whether or not there are missing data at
                       level-1. Possible values are n for not missing, or y for
                       missing data present.)
timeofdeletion:now     (This may be n[ow], where all level-1 cases with missing
                       data on selected variables will be deleted, or a[nalysis],
                       where the missing data will be left in and deleted at
                       run-time based on the model specified.)
The second part of the mdmt file specifies which variables are ID variables, and which ones go into
the mdm file as possible analysis variables. The structure looks like this:
*begin l1vars
rowid:STUDID
colid:TCHRID
clusid:SCHLID
MATH
YEAR
G4D1
G4D21
G5D22
TWOWAY
*end l1vars
*begin rowvars
rowid:STUDID
DUMMY
*end rowvars
*begin colvars
colid:TCHRID
clusid:SCHLID
DUMMY
*end colvars
*begin clusvars
clusid:SCHLID
DUMMY
*end clusvars
The IDs must be specified in the order shown, and must all be of the same type, either numeric
(preferable) or alphanumeric (not advised). The level-1 file needs to be sorted primarily by row ID,
secondarily by cluster ID, and thirdly by column ID. The row file should be sorted by row ID.
The column file should be sorted by column ID within cluster ID, and the cluster file sorted by cluster
ID.
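The sort orders described above can be expressed as key tuples. The Python sketch below sorts in-memory records (dictionaries) for the level-1 and column files. It assumes the ID names STUDID, SCHLID and TCHRID from the example and that the data have already been read into Python; the function names are hypothetical, and reading or writing SPSS, SAS or Stata files is outside the sketch.

```python
def sort_level1(records, row_id="STUDID", clus_id="SCHLID", col_id="TCHRID"):
    """Sort level-1 records by row ID, then cluster ID, then column ID."""
    return sorted(records, key=lambda r: (r[row_id], r[clus_id], r[col_id]))


def sort_columns(records, clus_id="SCHLID", col_id="TCHRID"):
    """Sort column records by column ID within cluster ID."""
    return sorted(records, key=lambda r: (r[clus_id], r[col_id]))


# Hypothetical level-1 records with numeric IDs.
level1 = [
    {"STUDID": 2, "SCHLID": 10, "TCHRID": 5, "MATH": 1.2},
    {"STUDID": 1, "SCHLID": 10, "TCHRID": 7, "MATH": 0.4},
    {"STUDID": 1, "SCHLID": 10, "TCHRID": 5, "MATH": 0.1},
]
for record in sort_level1(level1):
    print(record)
```

Numeric IDs sort as numbers here; alphanumeric IDs would sort as strings, which is one practical reason to keep every ID of the same type.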
Once the mdmt file is created, the file must be submitted to HCM3:
The results on the screen should then be examined to make sure the data were read correctly. These
descriptive statistics will also be contained in a file named HCM3MDM.STS.
H.2 Example: Creating an HCM3 HLM file and running the model
The next step is to create a file that specifies the desired model. (This is usually suffixed with a .hlm)
nonlin:n
numit:100
stopval:0.0000010000
level1:MATH=INTRCPT1+YEAR+G4D1+G4D21+G5D22+TWOWAY+RANDOM
rowcol:INTRCPT1=theta+randomb+randomc
clus:theta=ICPTCLUS+randomd
rowcol:YEAR=theta+randomb+randomc
clus:theta=ICPTCLUS+randomd
rowcol:G4D1=theta
clus:theta=ICPTCLUS
rowcol:G4D21=theta
clus:theta=ICPTCLUS
rowcol:G5D22=theta
clus:theta=ICPTCLUS
rowcol:TWOWAY=theta
clus:theta=ICPTCLUS
fixtau:3
fixdelta1:3
fixdelta2:3
accel:5
level1weight:none
rowweight:none
clusterweight:none
hypoth:n
resfiltype:spss
resfil1:n
resfil1fname:resfil1.sav
resrow:n
resrowfname:resrow.sav
rescol:n
rescolfname:rescol.sav
resclus:n
resclusfname:resclus.sav
deflection:cumulative
title:Unweighted model
output:docdef1.html
fulloutput:n
The above is very similar to an HCM2 model file, with the exception of the model specification at the
top where an extra level is shown. Here is the model part that better demonstrates the nested nature
of the model specification (the shown indentation will not run):
level1:MATH=INTRCPT1+YEAR+G4D1+G4D21+G5D22+TWOWAY+RANDOM
  rowcol:INTRCPT1=theta+randomb+randomc
    clus:theta=ICPTCLUS+randomd
  rowcol:YEAR=theta+randomb+randomc
    clus:theta=ICPTCLUS+randomd
  rowcol:G4D1=theta
    clus:theta=ICPTCLUS
  rowcol:G4D21=theta
    clus:theta=ICPTCLUS
  rowcol:G5D22=theta
    clus:theta=ICPTCLUS
  rowcol:TWOWAY=theta
    clus:theta=ICPTCLUS
The rule here is that for every variable in the level1: line, there needs to be a rowcol: line, in the same
order as the variables are declared in the level1: line. For each variable in a rowcol: line, there must be
a clus: line. Also, note that instead of some form of INTRCPT, HCM3 uses the special name theta to
denote the intercept in the rowcol: lines.
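The ordering rule can be checked mechanically before a model file is submitted. The Python sketch below is a hypothetical helper, not an HLM utility. It covers only the simple case shown above: it confirms that the rowcol: lines name the level-1 coefficients in the same order as the level1: line, and that each rowcol: line is immediately followed by a clus: line.

```python
def check_hcm3_nesting(model_lines):
    """Return a list of problems found in the level1:/rowcol:/clus: lines."""
    lines = [l.strip() for l in model_lines if l.strip()]
    level1 = [l for l in lines if l.lower().startswith("level1:")]
    if len(level1) != 1:
        return ["expected exactly one level1: line"]

    # Coefficients on the right-hand side, minus the RANDOM error term
    # and any centering suffix such as ",2".
    rhs = level1[0].split("=", 1)[1]
    expected = [t.split(",")[0].upper() for t in rhs.split("+")
                if t.upper() != "RANDOM"]

    problems, seen = [], []
    for i, line in enumerate(lines):
        if not line.lower().startswith("rowcol:"):
            continue
        name = line.split(":", 1)[1].split("=", 1)[0].upper()
        seen.append(name)
        following = lines[i + 1] if i + 1 < len(lines) else ""
        if not following.lower().startswith("clus:"):
            problems.append(f"rowcol:{name} is not followed by a clus: line")
    if seen != expected:
        problems.append(f"rowcol: order {seen} does not match level1: {expected}")
    return problems


model = [
    "level1:MATH=INTRCPT1+YEAR+RANDOM",
    "rowcol:INTRCPT1=theta+randomb+randomc",
    "clus:theta=ICPTCLUS+randomd",
    "rowcol:YEAR=theta+randomb+randomc",
    "clus:theta=ICPTCLUS+randomd",
]
print(check_hcm3_nesting(model))
```

An empty list means no problem was found by these two checks; it does not guarantee that HCM3 will accept the file.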
Note that a level-1 variable may vary at the row (randomb), column(randomc), or cluster(randomd)
level. A row variable may vary at either the column or cluster levels. A column variable may vary at
the row or cluster level, and cluster variable may vary at the row level. This can make for a very
complicated model specification. For example, consider this skeleton section for just the level-1
intercept where rowvar, colvar and clusvar are arbitrary row, column, and cluster level variables:
level1:outcome=intrcpt1+random
rowcol:intrcpt1=intrcpt1+rowvar(random)+colvar(random)+rowvar*colvar+randomb+randomc
clus:intrcpt1=theta+clusvar(randomb)+randomd
clus:rowvar=theta+clusvar(randomb)
clus:colvar=theta+clusvar(fixed)+randomd
clus:rowvar*colvar=icptclus+clusvar[norandom]+randomd
In the rowcol: line, there are four variables: the intercept, an arbitrary row variable (rowvar), an
arbitrary column variable (colvar), and a row by column interaction term rowvar*colvar. The random in
parentheses tells the program to let the variables vary. If a variable should be fixed, substitute the
word fixed. The interaction term cannot vary, so there is no way to specify this. Finally, the
randomb and randomc at the end of the line tell the program to let the level-1 intercept vary across
rows and columns respectively. Either +randomb or +randomc can be omitted if the level-1 variable
should not be allowed to vary across rows or columns respectively.
The clus: lines all take on the same basic form. In this example, all the variables are modeled with a
cluster intercept, which is random at level-3 except for the variable rowvar, where the +randomd is
omitted. In the clus:colvar line, clusvar is fixed at the row level, whereas in the previous two lines it is
allowed to vary. In the row/column interaction line, clusvar has no random/fixed declaration because
this term cannot vary at any level.
Assuming that the above file is named growth.hlm, then the following command should be run:
The following keywords in the above command files have the same definition and options in HCM3
as in the other modules (e.g. Tables A.1 and B.1).
ACCEL FULLOUTPUT FIXTAU NONLIN NUMIT OUTPUT STOPVAL TITLE
FIXSIGMA2 STOPMICRO STOPMACRO DEVIANCE DF GAMMA RESFILTYPE
Table H.1 Keywords and options unique for HCM3 command file
I Using HLMHCM in Batch Mode
Unlike the older modules (HLM2, HLM3, etc.), HLMHCM does not have interactive modes to create the
MDM or specify a model. If the Windows interface is not available, these files must be created with an
ASCII editor and then submitted to obtain results.
The file is broken into two sections. The first is to declare the filenames of the raw data and other
characteristics of the MDM file to be made, the second chooses the variables to be included at the
various levels. Below is the first part with explanation in parentheses:
l2fname:student.sav
rowfname:school.sav
colfname:neigh.sav
l1missing:n            (This declares whether or not there are missing data
                       at level-1. Possible values are n for not missing,
                       or y for missing data present.)
timeofdeletion:now     (This may be n[ow], where all level-1 cases with
                       missing data on selected variables will be
                       deleted, or a[nalysis], where the missing data will be left
                       in and deleted at run-time based on the model specified.)
The second part of the mdmt file specifies which variables are ID variables, and which ones go into
the mdm file as possible analysis variables. The structure looks like this:
*begin l1vars
level2id:STUDID
(list of level-1 variables, one per line)
*end l1vars
*begin l2vars
level2id:STUDID
rowid:SCHID
colid:NEIGHID
(list of level-2 variables, one per line)
*end l2vars
*begin rowvars
rowid:SCHID
(list of row variables, one per line)
*end rowvars
*begin colvars
colid:NEIGHID
(list of column variables, one per line)
*end colvars
The IDs must be specified in the order shown, and must all be of the same type, either numeric
(preferable) or alphanumeric(not advised).
Once the MDMT file is created, the file must be submitted to HLMHCM:
C:\HLM> HLMHCM -r growth.mdmt
The results on the screen should then be examined to make sure the data were read correctly. These
descriptive statistics will also be contained in a file named HLMHCMMDM.STS.
I.2 Example: Creating an HLMHCM HLM file and running the model
The next step is to create a file that specifies the desired model. (This is usually suffixed with a .hlm)
For example, we will use the model shown in section 15.2.
nonlin:n
numit:100000
stopval:0.0000010000
level1:MATH=INTRCPT1+AGE8+RANDOM
level2:INTRCPT1=INTRCPT2+BLACK+HISPANIC+random
rowcol:INTRCPT2=theta+DISADV(RANDOM)+randomb+randomc
rowcol:BLACK=theta
rowcol:HISPANIC=theta
level2:AGE8=INTRCPT2+BLACK+HISPANIC+random
rowcol:INTRCPT2=theta+DISADV(RANDOM)+randomb+randomc
rowcol:BLACK=theta
rowcol:HISPANIC=theta
fixtau:3
fixdelta:3
fixomega:3
accel:5
deviance:3800.651318
df:18
hypoth:n
resfiltype:spss
resfil1:n
resfil1fname:resfil1.sav
resfil2:n
resfil2fname:resfil2.sav
resrow:n
resrowfname:resrow.sav
rescol:n
rescolfname:rescol.sav
title:CONDITIONAL LINEAR GROWTH MODEL,WITH NEIGHBORHOOD DISADVANTAGE EFFECT RANDOM
output:growth3.html
fulloutput:n
The above is very similar to an HCM2 model file, with the exception of the model specification at the
top where an extra level is shown. Here is the model part that better demonstrates the nested nature
of the model specification (the shown indentation will not run):
level1:MATH=INTRCPT1+AGE8+RANDOM
  level2:INTRCPT1=INTRCPT2+BLACK+HISPANIC+random
    rowcol:INTRCPT2=theta+DISADV(RANDOM)+randomb+randomc
    rowcol:BLACK=theta
    rowcol:HISPANIC=theta
  level2:AGE8=INTRCPT2+BLACK+HISPANIC+random
    rowcol:INTRCPT2=theta+DISADV(RANDOM)+randomb+randomc
    rowcol:BLACK=theta
    rowcol:HISPANIC=theta
Assuming that the above file is named growth.hlm, then the following command should be run:
C:\HLM> HLMHCM GROWTH.MDM GROWTH.HLM
The following keywords in the above command files have the same definition and options in HLMHCM
as in the other modules (e.g. Tables A.1 and B.1).
ACCEL FULLOUTPUT FIXTAU NONLIN NUMIT OUTPUT STOPVAL TITLE
FIXSIGMA2 STOPMICRO STOPMACRO DEVIANCE DF GAMMA RESFILTYPE
J Overview of options available by module
Table J.1 below shows the options available in the HLM2, HLM3, HLM4, HMLM, HMLM2, HCM2, HCM3
and HLMHCM modules respectively.
Option                                        HLM2   HLM3   HLM4   HMLM   HMLM2  HCM2   HCM3   HLMHCM
Normal outcome                                Y      Y      Y      Y      Y      Y      Y      Y
Bernoulli outcome                             Y      Y      Y      -      -      Y      Y      Y
Poisson outcome (constant exposure)           Y      Y      Y      -      -      Y      Y      Y
Poisson outcome (variable exposure)           Y      Y      Y      -      -      Y      Y      Y
Binomial outcome                              Y      Y      Y      -      -      Y      Y      Y
Multinomial outcome                           Y      Y      -      -      -      -      -      -
Ordinal outcome                               Y      Y      -      -      -      -      -      -
Over-dispersion                               Y      Y      Y      -      -      Y      Y      Y
Title                                         Y      Y      Y      Y      Y      Y      Y      Y
Output filename                               Y      Y      Y      Y      Y      Y      Y      Y
Graph filename                                Y      Y      Y      Y      Y      Y      -      -
Unrestricted                                  -      -      -      Y      Y      -      -      -
Skip unrestricted                             -      -      -      Y      Y      -      -      -
Homogeneous                                   -      -      -      Y      Y      -      -      -
Heterogeneous                                 -      -      -      Y      Y      -      -      -
Log-linear                                    -      -      -      Y      Y      -      -      -
Predictor of level-1 var                      -      -      -      Y      Y      -      -      -
1-st order autoregressive                     -      -      -      Y      Y      -      -      -
Iteration Settings
Number of iterations                          Y      Y      Y      Y      Y      Y      Y      Y
Frequency of accelerator                      Y      Y      Y      Y      Y      Y      Y      Y
% change to stop iterating                    Y      Y      Y      Y      Y      Y      Y      Y
How to handle bad variance-covariance matrix  Y      Y      Y      Y      Y      Y      Y      Y
What to do when convergence not reached       Y      Y      Y      Y      Y      Y      Y      Y
Mode of acceleration                          Y      Y      -      -      -      -      -      -
Estimation Settings
REML
Y - - - -
ht
FML
Y Y Y Y Y Y Y Y
PQL
Y Y Y - - Y Y Y
(HGLM) (HGLM)
LaPlace6
©
Y - - - - - - -
(HGLM)
EM Laplace Y - - - - - - -
(HGLM)
Adaptive Gaussian Y Y - - - - - -
SS
Quadrature
Constraint of fixed Y Y - - - - - -
effects
Heterogeneous Y - - - - - - -
I
sigma^2
Plausible values Y Y - - - - - -
Multiple imputation Y Y - - - - - -
345
Latent variable regression Y Y - Y Y - - -
Design weighting Y Y - - - Y Y -
Spatial dependence Y - - - - - - -
Hypothesis Testing
Reduced output Y Y Y Y Y Y Y Y
ht
Print variance-covariance Y Y - Y Y - - -
matrices
Y Y - - - - - -
Exploratory Analysis (level-
2)
©
- Y - - - - - -
Exploratory Analysis (level-
3)
Model graphs Y Y Y Y Y Y - -
SS
values
346
Graph Data
box-whisker plots Y Y - - - - - -
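A table of this kind can also be queried programmatically. The following is a minimal sketch, not something HLM provides: it transcribes only the outcome-type rows of Table J.1 and assumes that the eight Y/- flags in each row follow the module order given in the sentence introducing the table (HLM2, HLM3, HLM4, HMLM, HMLM2, HCM2, HCM3, HLMHCM). The names `MODULES`, `OUTCOME_ROWS` and `modules_supporting` are hypothetical.

```python
# Illustrative lookup for the outcome-type rows of Table J.1.
# ASSUMPTION: flags are listed in the module order named in the text.
MODULES = ("HLM2", "HLM3", "HLM4", "HMLM", "HMLM2", "HCM2", "HCM3", "HLMHCM")

# Rows transcribed from the table: "Y" = available, "-" = not available.
OUTCOME_ROWS = {
    "Normal outcome": "Y Y Y Y Y Y Y Y",
    "Bernoulli outcome": "Y Y Y - - Y Y Y",
    "Poisson outcome (constant exposure)": "Y Y Y - - Y Y Y",
    "Poisson outcome (variable exposure)": "Y Y Y - - Y Y Y",
    "Binomial outcome": "Y Y Y - - Y Y Y",
    "Multinomial outcome": "Y Y - - - - - -",
    "Ordinal outcome": "Y Y - - - - - -",
}


def modules_supporting(option, rows=OUTCOME_ROWS):
    """Return the modules whose flag for `option` is "Y"."""
    flags = rows[option].split()
    if len(flags) != len(MODULES):
        # Guard against incomplete rows rather than guessing the columns.
        raise ValueError(f"row {option!r} has {len(flags)} flags, expected {len(MODULES)}")
    return [module for module, flag in zip(MODULES, flags) if flag == "Y"]


if __name__ == "__main__":
    for name in OUTCOME_ROWS:
        print(f"{name}: {', '.join(modules_supporting(name))}")
```

The length check is deliberate: a row with fewer than eight flags is rejected instead of being silently aligned to the leftmost modules.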
References
Barnett, R. C., Brennan, R. T., Raudenbush, S. W., & Marshall, N. L. (1993). Gender and the
relationship between marital role-quality and psychological distress: A study of dual-earner couples.
Journal of Personality and Social Psychology, 64, 794-806.
Bock, R. (1975). Multivariate Statistical Methods in Behavioral Research. New York: McGraw-Hill.
Breslow, N., & Clayton, D. (1993). Approximate inference in generalized linear mixed models. Journal
of the American Statistical Association, 88, 9-25.
Bryk, A., & Raudenbush, S. W. (1992). Hierarchical Linear Models for Social and Behavioral
Research: Applications and Data Analysis Methods. Newbury Park, CA: Sage.
Cheong, Y. F., Fotiu, R. P., & Raudenbush, S. W. (2001). Efficiency and robustness of alternative
estimators for 2- and 3-level models: The case of NAEP. Journal of Educational and Behavioral
Statistics, 26, 411-429.
Dempster, A., Laird, N., & Rubin, D. (1977). Maximum likelihood from incomplete data via the EM
algorithm. Journal of the Royal Statistical Society, Series B, 39, 1-38.
Elliott, D., Huizinga, D., & Menard, S. (1989). Multiple Problem Youth: Delinquency, Substance Use,
and Mental Health Problems. New York: Springer-Verlag.
Garner, C., & Raudenbush, S. (1991). Neighborhood effects on educational attainment: A multi-level
analysis of the influence of pupil ability, family, school, and neighborhood. Sociology of Education,
64(4), 251-262.
Goldstein, H. (1991). Non-linear multilevel models with an application to discrete response data.
Biometrika, 78, 45-51.
Hedeker, D., & Gibbons, R. (1994). A random-effects ordinal regression model for multilevel analysis.
Biometrics, 50, 933-944.
Hong, G., & Raudenbush, S. W. (2008). Causal inference for time-varying instructional treatments.
Journal of Educational and Behavioral Statistics, 33, 333-362.
Hough, H. J., Bryk, A., Pinnell, G. S., Kerbow, D., Fountas, I., & Scharer, P. L. (2008). The effects of
school-based coaching: Measuring change in the practice of teachers engaged in literacy collaborative
professional development. Unpublished manuscript, Stanford University.
Huttenlocher, J.E., Haight, W., Bryk, A.S., & Seltzer, M. (1991). Early vocabulary growth: Relations to
language input and gender. Developmental Psychology, 22(2), 236-249.
Jennrich, R., & Schluchter, M. (1986). Unbalanced repeated-measures models with structured
covariance matrices. Biometrics, 42, 805-820.
Kang, S.J. (1992). A mixed linear model for unbalanced two-way crossed multilevel data with
estimation via the EM algorithm. Unpublished doctoral dissertation, Michigan State University, East
Lansing.
Little, R., & Rubin, D. (1987). Statistical analysis with missing data. New York: Wiley.
Little, R., & Schenker, N. (1995). Missing data. In G. Arminger, C. C. Clogg & M. E. Sobel (Eds.),
Handbook of Statistical Modeling for the Social and Behavioral Sciences (pp. 39-76). New York:
Plenum Press.
Longford, N. (1993). Random Coefficient Models. Oxford: Clarendon Press.
McCullagh, P., & Nelder, J. (1989). Generalized Linear Models, 2nd Edition. London: Chapman and
Hall.
Pfeffermann, D., Skinner, C. J., Holmes, D. J., Goldstein, H., & Rasbash, J. (1998). Weighting for
unequal selection probabilities in multilevel models. Journal of the Royal Statistical Society, Series B,
60(1), 23-40.
Pinheiro, J. C., & Bates, D. M. (1995). Approximations to the log-likelihood function in the nonlinear
mixed-effects model. Journal of Computational and Graphical Statistics, 4, 12-35.
Ramirez, D., Yuen, S., Ramey, R., & Pasta, D. (1991). The immersion study: Final Report. Washington,
DC: U.S. Office of Educational Research and Improvement.
Raudenbush, S. (1993). A crossed random effects model for unbalanced data with applications in cross-
sectional and longitudinal research. Journal of Educational Statistics, 18(4), 321-349.
Raudenbush, S. W., & Bhumirat, C. (1992). The distribution of resources for primary education and its
consequences for educational achievement in Thailand. International Journal of Educational Research,
pp. 143-164.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical Linear Models: Applications and Data Analysis
Methods, Second Edition. Newbury Park, CA: Sage.
Raudenbush, S. W., & Chan, W. S. (1993). Application of hierarchical linear models to study adolescent
deviance in an overlapping cohort design. Journal of Consulting and Clinical Psychology, 61(6),
941-951.
Raudenbush, S.W., Yang, Meng-Li, & Yosef, M. (2000). Maximum likelihood for generalized linear
models with nested random effects via high-order, multivariate Laplace approximation. Journal of
Computational and Graphical Statistics, 9, 141-157.
Raudenbush, S. W., & Sampson, R. (1999). Assessing direct and indirect associations in multilevel
designs with latent variables. Sociological Methods and Research, 28(2), 123-153.
Rodriguez, G., & Goldman, N. (1995). An assessment of estimation procedures for multilevel models
with binary responses. Journal of the Royal Statistical Society, A, 158, 73-89.
Rogers, A., et al. (1992). National Assessment of Educational Progress: 1990 Secondary-use Data
Files User Guide. Princeton, New Jersey: Educational Testing Service.
Rowan, B., Raudenbush, S., & Cheong, Y. (1993). Teaching as a non-routine task: Implications for the
organizational design of schools. Educational Administration Quarterly, 29(4), 479-500.
Rowan, B., Raudenbush, S., & Kang, S. (1991). Organizational design in high schools: A multilevel
analysis. American Journal of Education, 99(2), 238-266.
Rubin, D. (1987). Multiple Imputation for Nonresponse in Surveys. New York: Wiley.
Sampson, R., Raudenbush, S. W., & Earls, F. (1997). Neighborhoods and violent crime: A multilevel
study of collective efficacy. Science, 277, 918-924.
Schafer, J. (1997). Analysis of Incomplete Multivariate Data. London: Chapman & Hall.
Schall, R. (1991). Estimation in generalized linear models with random effects. Biometrika, 78, 719-727.
Stiratelli, R., Laird, N., & Ware, J. (1984). Random effects models for serial observations with binary
response. Biometrics, 40, 961-971.
Verbitsky-Savitz, N., & Raudenbush, S. W. (2009). Exploiting spatial dependence to improve
measurement of neighborhood social processes. Sociological Methodology, 39, 151-183.
Wong, G., & Mason, W. (1985). The hierarchical logistic regression model for multilevel analysis.
Journal of the American Statistical Association, 80(391), 513-524.
Yang, M. (1995). A simulation study for the assessment of the non-linear hierarchical model estimation
via approximate maximum likelihood. Unpublished apprenticeship paper, College of Education,
Michigan State University.
Yang, M.L. (1998). Increasing the efficiency in estimating multilevel Bernoulli models [Diss], East
Lansing, MI: Michigan State University.
Yosef, M. (2001). A comparison of alternative approximations to maximum likelihood estimation for
hierarchical generalized linear models: The logistic-normal model case. Unpublished doctoral
dissertation, Michigan State University, East Lansing.
Zeger, S., Liang, K.Y., & Albert, P. (1988). Models for longitudinal data: A likelihood approach.
Biometrics, 44, 1049-60.
Zeger, S., & Liang, L. (1986). Longitudinal data analysis using generalized linear models. Biometrika,
73, 13-22.