Tutorial On Biostatistics: Longitudinal Analysis of Correlated Continuous Eye Data


OPHTHALMIC EPIDEMIOLOGY

https://doi.org/10.1080/09286586.2020.1786590

Gui-Shuang Ying (a), Maureen G. Maguire (a), Robert J. Glynn (b), and Bernard Rosner (b)

(a) Center for Preventive Ophthalmology and Biostatistics, Department of Ophthalmology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
(b) Division of Preventive Medicine and the Channing Lab, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, USA

ABSTRACT

Purpose: To describe and demonstrate methods for analyzing longitudinal correlated eye data with a continuous outcome measure.
Methods: We described fixed effects, mixed effects and generalized estimating equations (GEE) models and applied them to data from the Complications of Age-Related Macular Degeneration Prevention Trial (CAPT) and the Age-Related Eye Disease Study (AREDS). In CAPT (N = 1052), we assessed the effect of eye-specific laser treatment on change in visual acuity (VA). In the AREDS study, we evaluated effects of systemic supplement treatment among 1463 participants with AMD category 3.
Results: In CAPT, the inter-eye correlations (0.33 to 0.53) and longitudinal correlations (0.31 to 0.88) varied. There was a small treatment effect on VA change (approximately one letter) at 24 months for all three models (p = .009 to 0.02). Model fit was better with the mixed effects model than the fixed effects model (p < .001).
In AREDS, there was no significant treatment effect in any model (p > .55). Current smokers had a significantly greater VA decline than non-current smokers in the fixed effects model (p = .04) and the mixed effects model with random intercept (p = .0003), but the difference was only marginally significant in the mixed effects model with random intercept and slope (p = .08) and in the GEE models (p = .054 to 0.07). The model fit was better with the fixed effects model than the mixed effects model (p < .0001).
Conclusion: Longitudinal models using the eye as the unit of analysis can be implemented using available statistical software to account for both inter-eye and longitudinal correlations. Goodness-of-fit statistics may guide the selection of the most appropriate model.

ARTICLE HISTORY
Received 11 November 2019; Revised 14 May 2020; Accepted 19 June 2020

KEYWORDS
Linear regression models; correlated data; inter-eye correlation; longitudinal correlation; fixed effects model; mixed effects model; generalized estimating equations

Introduction

Many ocular diseases affect both eyes of a subject, such as age-related macular degeneration (AMD), glaucoma and myopia. Interventions to prevent or treat these ocular diseases can be eye-specific or person-specific. For example, in the Complications of Age-related Macular Degeneration Prevention Trial (CAPT), laser treatment was eye-specific. One eye of a subject received laser treatment and the contralateral (fellow) eye was not treated.1 In the Age-related Eye Disease Study (AREDS),2 the intervention of dietary supplements was a systemic treatment so that both eyes received the same treatment. The primary outcome measure from clinical trials in ophthalmology is commonly eye-specific, such as measurement of visual acuity, visual field, or refractive error. These outcomes are often measured multiple times during the follow-up period, providing longitudinal data.

The data from these studies have three types of correlation, including cross-sectional inter-eye correlation within the same subject, longitudinal repeated-measures correlation within the same eye over time, and cross-correlation between the outcome for one eye at one time point and the outcome for the fellow eye at a different time point. Inter-eye correlation at one point in time and longitudinal correlation for one eye over time are individually well recognized, and statistical models have been developed to appropriately account for both of these types of correlation.3,5 However, statistical models that simultaneously account for these types of correlation are less widely known. The purpose of this paper is to introduce appropriate statistical models to account for these correlations from longitudinal data from both eyes, and to demonstrate how to apply these models to datasets from the CAPT and the AREDS.

CONTACT Gui-shuang Ying [email protected] Center for Preventive Ophthalmology and Biostatistics, Department of Ophthalmology,
Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
© 2020 Taylor & Francis Group, LLC

Methods

Data structure of longitudinal correlated eye data

To analyze longitudinal data from both eyes using statistical software, the data usually are laid out with each row representing an outcome measure from an eye at a specific time point (Table 1). The data usually consist of an ID variable indicating subject identification number, an EYE variable indicating which eye (left eye, right eye) was measured, a TIME variable indicating the visit when the outcome measure was taken, a GROUP variable indicating which treatment group the eye was in (GROUP has the same value for both eyes if treatment is systemic), an OUTCOME variable for an eye-specific continuous outcome measure, and possibly additional covariates X1, X2, ..., Xk. The covariates can be either subject-specific or eye-specific.

For the analysis of longitudinal data from both eyes of a subject, three types of correlations need to be accounted for: (1) the cross-sectional inter-eye correlation at each time point; (2) the longitudinal correlation among repeated measures in the same eye over time; and (3) the cross-correlation between the outcome for one eye at one time point and the outcome for the fellow eye at a different time point. We previously described the analysis for correlated eye data using the fixed effects model, the mixed effects model,3 and the population-average model using generalized estimating equations (GEE)4 to account for the inter-eye correlation in continuous eye data from cross-sectional studies.5 These modelling approaches can be extended to analyze longitudinal eye data by specifying the model and the correlation structures to account for all three types of correlation as described below.

Table 1. Layout of longitudinal correlated eye data for statistical analysis.

| ID | EYE (L,R) | TIME | GROUP (0,1) | OUTCOME | Baseline Covariates (X1, X2, ..., Xk) |
|----|-----------|------|-------------|---------|---------------------------------------|
| 1  | L         | 0    | 0           | y1L1    |                                       |
| 1  | R         | 0    | 1           | y1R1    |                                       |
| 1  | L         | 1    | 0           | y1L2    |                                       |
| 1  | R         | 1    | 1           | y1R2    |                                       |
| 1  | L         | 2    | 0           | y1L3    |                                       |
| 1  | R         | 2    | 1           | y1R3    |                                       |
| 2  | L         | 0    | 1           | y2L1    |                                       |
| 2  | R         | 0    | 0           | y2R1    |                                       |
| 2  | L         | 1    | 1           | y2L2    |                                       |
| 2  | R         | 1    | 0           | y2R2    |                                       |
| 2  | L         | 2    | 1           | y2L3    |                                       |
| 2  | R         | 2    | 0           | y2R3    |                                       |
| :: | ::        | ::   | ::          | ::      |                                       |

Fixed effects model

The general equation of the fixed effects model for longitudinal eye data with covariates ($x$) and time ($t$) as independent variables is:

$$y_{ijk} = \alpha + \beta t_k + \sum_{l=1}^{L} \gamma_l x_{ijl} + \sum_{l=1}^{L} \delta_l t_k x_{ijl} + e_{ijk} \tag{1}$$

where $i$ represents subject, $j$ represents eye, $k$ represents time point, $\alpha$ is the intercept (when $t$ and all covariates are set to a value of 0); $\beta$ is the rate of change in $y$ for every unit change of $t$ (i.e., slope) when all covariates are set to a value of 0; $\gamma_l$ is the difference in intercept of $y$ per unit change in the $l$th covariate at baseline, and $\delta_l$ is the change in slope of $y$ per unit change in the $l$th covariate. The covariates $x_{ijl}$ in the model do not change with time $k$; however, the model can be expanded to include time-varying covariates $x_{ijkl}$.

When there are two treatment groups (treatment can be either eye-specific or systemic) and no other covariates, the above equation can be simplified to:

$$y_{ijk} = \alpha + \beta t_k + \gamma x_{ij} + \delta t_k x_{ij} + e_{ijk} \tag{2}$$

where $x_{ij} = 1$ for the active treatment group and $x_{ij} = 0$ for the control group. If both eyes of a subject are in the same treatment group, then $x_{ij} = x_i$. In this specified model, $\alpha$ is the intercept (when $t$ and $x$ are set to a value of 0), $\beta$ is the slope per unit time in the control group, $\beta + \delta$ is the slope per unit time in the active group, and $\gamma$ is the mean difference between the active group and the control group at baseline.

In the fixed effects model, the regression coefficients are assumed to be the same for all subjects (the regression coefficients $\alpha$, $\beta$, $\gamma$ and $\delta$ do not depend on subject $i$ or eye $j$). The fixed effects models are most applicable to longitudinal studies with fixed visit times, as in clinical trials with scheduled visits, because the visit variable used to specify the correlation structure has to be categorical. In SAS, a fixed effects model is fit using PROC MIXED with the REPEATED statement, where the covariance (or correlation) structures for both inter-eye correlation and longitudinal correlation of repeated measures are specified.

SAS offers three correlation structures that can accommodate both inter-eye correlation and longitudinal correlation in the same model: UN@UN, UN@CS and UN@AR (Table 2), where UN stands for unstructured covariance that allows all components of the covariance matrix to be different, CS stands for compound symmetry, AR stands for first-order autoregressive, and @ denotes the matrix direct product.

The UN@UN correlation structure specifies that both the correlation matrix between eyes and the longitudinal correlation over time are unstructured. The first UN corresponds to the cross-sectional covariance between left and right eyes and the second UN corresponds to the longitudinal covariance structure. If there are K time points, the product of these two covariance matrices will lead to a matrix with dimension 2K × 2K. For example, if there are 3 time points, the product of the inter-eye covariance matrix and the longitudinal covariance matrix will be a 6 × 6 matrix with 8 parameters to estimate: 3 from the cross-sectional component and 5 from the longitudinal component (the 6 distinct entries of a symmetric 3 × 3 matrix, less the first diagonal element, which is fixed at 1 in Table 2).

As shown in Table 2, the covariance matrix of UN@UN allows different variances across different time points and across different eyes, and can accommodate the inter-eye correlation (assumed to be the same at different time points), longitudinal correlation (allowed to be different for different pairs of time points, but the same for left eye and right eye) and cross correlation. Similar to UN@UN, the UN@CS and UN@AR can also accommodate all correlations (i.e., inter-eye correlation, longitudinal correlation and cross correlation), but with stronger assumptions for the longitudinal correlation structure. UN@CS assumes an unstructured inter-eye covariance (three parameters) and a compound symmetry correlation structure (with only one parameter) for longitudinal correlation, while UN@AR assumes an unstructured inter-eye covariance (three parameters) and a first-order autoregressive correlation structure (with only one parameter) for longitudinal correlation. As both UN@CS and UN@AR have only four parameters to estimate, they are less computationally intensive and less likely to have a convergence problem than UN@UN. The features of UN@CS and UN@AR as compared to UN@UN are shown in Table 2. In the longitudinal data analysis literature, the model in Equations 1 and 2 is also sometimes referred to as a Covariance Pattern model.6

Mixed effects model

The mixed effects model contains both fixed effects and random effects. Regression coefficients for random effects are assumed to vary among different individuals. Random effects may be specified for both the intercept and time parameters for both the subject and the eye within the subject. Through specification of random effects for longitudinal measures over time and measures from two eyes of a subject, the mixed effects model explicitly accounts for the repeated measures correlation and inter-eye correlation. An important assumption of the mixed effects model is that the distribution of the random effects is normal (i.e., Gaussian) and that the random effects are independent of all the covariates. In addition, covariate effects can be estimated both for a single individual and as an average over many individuals. The mixed effects model requires correct specifications for both fixed effects and random effects. In SAS, the mixed effects model is executed using PROC MIXED through a RANDOM statement.

Table 2. The features of covariance structure of UN@UN, UN@CS and UN@AR for fixed effects model of the longitudinal correlated data with 3 time points.

| Feature | UN@UN | UN@CS | UN@AR |
|---|---|---|---|
| Inter-eye covariance matrix $\Sigma$ | $\begin{pmatrix}\sigma_L^2 & \sigma_{LR}\\ \sigma_{LR} & \sigma_R^2\end{pmatrix}$ | $\begin{pmatrix}\sigma_L^2 & \sigma_{LR}\\ \sigma_{LR} & \sigma_R^2\end{pmatrix}$ | $\begin{pmatrix}\sigma_L^2 & \sigma_{LR}\\ \sigma_{LR} & \sigma_R^2\end{pmatrix}$ |
| Longitudinal covariance matrix $\Lambda$ (symmetric, upper triangle shown) | $\begin{pmatrix}1 & \lambda_{12} & \lambda_{13}\\ & \lambda_{22} & \lambda_{23}\\ & & \lambda_{33}\end{pmatrix}$ | $\begin{pmatrix}1 & \lambda & \lambda\\ & 1 & \lambda\\ & & 1\end{pmatrix}$ | $\begin{pmatrix}1 & \lambda & \lambda^2\\ & 1 & \lambda\\ & & 1\end{pmatrix}$ |
| Product of inter-eye and longitudinal covariance matrix (6 × 6, with $\Lambda$ taken from the same column) | $\begin{pmatrix}\sigma_L^2\Lambda & \sigma_{LR}\Lambda\\ \sigma_{LR}\Lambda & \sigma_R^2\Lambda\end{pmatrix}$ | $\begin{pmatrix}\sigma_L^2\Lambda & \sigma_{LR}\Lambda\\ \sigma_{LR}\Lambda & \sigma_R^2\Lambda\end{pmatrix}$ | $\begin{pmatrix}\sigma_L^2\Lambda & \sigma_{LR}\Lambda\\ \sigma_{LR}\Lambda & \sigma_R^2\Lambda\end{pmatrix}$ |
| Variance of outcome for the left (right) eyes at visit $k$ | $\sigma_L^2\lambda_{kk}$ ($\sigma_R^2\lambda_{kk}$) | $\sigma_L^2$ ($\sigma_R^2$) | $\sigma_L^2$ ($\sigma_R^2$) |
| Inter-eye correlation at one visit $k$ | $\dfrac{\sigma_{LR}}{\sigma_L\sigma_R}$ | $\dfrac{\sigma_{LR}}{\sigma_L\sigma_R}$ | $\dfrac{\sigma_{LR}}{\sigma_L\sigma_R}$ |
| Longitudinal correlation between visits $k_1$ and $k_2$ | $\dfrac{\lambda_{k_1k_2}}{\sqrt{\lambda_{k_1k_1}\lambda_{k_2k_2}}}$ | $\lambda$ | $\lambda^{\lvert k_1-k_2\rvert}$ |
| Cross correlation between one eye at visit $k_1$ and the fellow eye at another visit $k_2$ | $\dfrac{\sigma_{LR}}{\sigma_L\sigma_R}\cdot\dfrac{\lambda_{k_1k_2}}{\sqrt{\lambda_{k_1k_1}\lambda_{k_2k_2}}}$ | $\dfrac{\sigma_{LR}}{\sigma_L\sigma_R}\,\lambda$ | $\dfrac{\sigma_{LR}}{\sigma_L\sigma_R}\,\lambda^{\lvert k_1-k_2\rvert}$ |
| Total number of parameters to estimate | 8 | 4 | 4 |

UN = unstructured, CS = compound symmetry, AR = autoregressive
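
To make the direct product in Table 2 concrete, the following minimal sketch builds the 6 × 6 UN@AR covariance matrix for two eyes and three visits in plain Python and reads off the correlations listed in the table. The numeric values (variances of 100 and 121, inter-eye covariance of 55, λ = 0.8) and the helper names `kron`, `ar1` and `corr` are illustrative assumptions, not estimates from CAPT or AREDS, and the sketch is not the SAS implementation.

```python
import math


def kron(a, b):
    """Kronecker (direct) product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def ar1(lam, n_visits):
    """First-order autoregressive correlation matrix: entry (r, c) is lam**|r - c|."""
    return [[lam ** abs(r - c) for c in range(n_visits)] for r in range(n_visits)]


def corr(cov, r, c):
    """Correlation between elements r and c implied by a covariance matrix."""
    return cov[r][c] / math.sqrt(cov[r][r] * cov[c][c])


# Hypothetical parameter values (assumptions for illustration only).
var_left, var_right, cov_lr = 100.0, 121.0, 55.0  # SDs 10 and 11, inter-eye corr 0.5
lam = 0.8                                         # AR(1) longitudinal parameter

inter_eye = [[var_left, cov_lr], [cov_lr, var_right]]  # 3 parameters
longitudinal = ar1(lam, 3)                             # 1 parameter

# Rows/columns 0-2: left eye at visits 1-3; rows/columns 3-5: right eye at visits 1-3.
cov = kron(inter_eye, longitudinal)

inter_eye_corr = corr(cov, 0, 3)  # left vs right eye, both at visit 1
long_corr_13 = corr(cov, 0, 2)    # left eye, visit 1 vs visit 3
cross_corr_13 = corr(cov, 0, 5)   # left eye at visit 1 vs right eye at visit 3

print(f"inter-eye correlation:    {inter_eye_corr:.3f}")
print(f"longitudinal correlation: {long_corr_13:.3f}")
print(f"cross correlation:        {cross_corr_13:.3f}")
```

Under these assumed values the cross correlation equals the inter-eye correlation multiplied by the longitudinal correlation, which is the product form shown in the last correlation row of Table 2.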

Below, we describe two common mixed effects models: a mixed effects model with random intercept, and a mixed effects model with random intercept and slope.

Mixed effects model with random intercept

The mixed effects model with random intercept for longitudinal eye data with treatment group ($x$) and time ($t$) as independent variables is specified as:

$$y_{ijk} = (\alpha + u_i + v_{ij}) + \beta t_k + \gamma x_{ij} + \delta t_k x_{ij} + e_{ijk}; \quad i = 1, \ldots, N;\; j = 1, 2;\; k = 1, \ldots, K \tag{3}$$

where $u_i$ and $v_{ij}$ are person-specific and eye-specific random effects, respectively, and $e_{ijk}$ is the random error. $u_i$ is distributed as $N(0, \sigma_u^2)$, $v_{ij}$ is distributed as $N(0, \sigma_v^2)$, and $e_{ijk}$ is distributed as $N(0, \sigma_e^2)$. $u_i$, $v_{ij}$ and $e_{ijk}$ are assumed to be mutually independent. The covariance matrix of the three random effects terms $u_i$, $v_{i0}$ and $v_{i1}$ is denoted by G (Table 3).

With $x_{ij} = 1$ for the active treatment group and $x_{ij} = 0$ for the control group, the regression coefficient $\alpha$ can be interpreted as the estimated mean outcome at baseline for the control group. The intercept for a specific person $i$ and eye $j$ in the control group is $\alpha + u_i + v_{ij}$.
$\gamma$ is the mean difference in outcome at baseline between the treatment and control group,
$\beta$ is the slope in the control group, assumed to be the same for all subjects in the control group,
$\beta + \delta$ is the slope in the active group, assumed to be the same for all subjects in the active group,
and $\delta$ is the difference of slope over time between treatment group and control group, which is of primary interest in clinical trials.

In the mixed effects model with a random intercept, the variance of $y_{ijk}$ can be estimated as $\sigma_u^2 + \sigma_v^2 + \sigma_e^2$, where $\sigma_u^2$ is the between-person variance, $\sigma_v^2$ is the between-eye (within person) variance, and $\sigma_e^2$ is the within-eye (replicate) variance. Since the variance of $y_{ijk}$ is not a function of $j$ (eye) or $k$ (visit), the variance is the same across all time points and the same for the left eye and right eye. This may not be true for some longitudinal data.

The mixed effects model with random intercept also assumes that the longitudinal correlation between outcomes for the same eye over time is the same for each pair of different time points, since

$$\mathrm{Corr}(y_{ijk_1}, y_{ijk_2}) = \frac{\sigma_u^2 + \sigma_v^2}{\sigma_u^2 + \sigma_v^2 + \sigma_e^2}, \quad k_1 \neq k_2 \tag{4}$$

Finally, the inter-eye correlation is assumed to be the same at each visit and is the same as the cross correlation, which can be estimated as

$$\mathrm{Corr}(y_{ij_1k_1}, y_{ij_2k_2}) = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_v^2 + \sigma_e^2}, \quad j_1 \neq j_2 \tag{5}$$

So the mixed effects model with random intercept has seven parameters to estimate: four mean parameters ($\alpha, \beta, \gamma, \delta$) and three variance parameters ($\sigma_u^2, \sigma_v^2, \sigma_e^2$).

Mixed effects model with random intercept and slope

The equation for the mixed effects model with random intercept and slope that includes treatment group $x$ and time $t$ is:

$$y_{ijk} = (\alpha + u_i + v_{ij}) + (\beta + w_i + z_{ij}) t_k + \gamma x_{ij} + \delta t_k x_{ij} + e_{ijk} \tag{6}$$

Table 3. Features of the covariance structure under random effects models for longitudinal correlated data with 3 time points.

| Feature | Random intercept for subject and eye within subject (equation 3) | Random intercept and slope for subject; random intercept for eye within subject (equation 6) | Random intercept and slope for subject; random slope for eye within subject (equation 6) |
|---|---|---|---|
| Vector of random effects (transpose) | $(u_i, v_{i0}, v_{i1})$ | $(u_i, w_i, v_{i0}, v_{i1})$ | $(u_i, w_i, z_{i0}, z_{i1})$ |
| Covariance matrix for random effects (G) | $\begin{pmatrix}\sigma_u^2 & 0 & 0\\ 0 & \sigma_v^2 & 0\\ 0 & 0 & \sigma_v^2\end{pmatrix}$ | $\begin{pmatrix}\sigma_u^2 & \sigma_{uw} & 0 & 0\\ \sigma_{uw} & \sigma_w^2 & 0 & 0\\ 0 & 0 & \sigma_v^2 & 0\\ 0 & 0 & 0 & \sigma_v^2\end{pmatrix}$ | $\begin{pmatrix}\sigma_u^2 & \sigma_{uw} & 0 & 0\\ \sigma_{uw} & \sigma_w^2 & 0 & 0\\ 0 & 0 & \sigma_z^2 & 0\\ 0 & 0 & 0 & \sigma_z^2\end{pmatrix}$ |
| Variance of outcome in an eye at visit $k$ | $\sigma_u^2 + \sigma_v^2 + \sigma_e^2$ | $\sigma_u^2 + 2t_k\sigma_{uw} + t_k^2\sigma_w^2 + \sigma_v^2 + \sigma_e^2$ | $\sigma_u^2 + 2t_k\sigma_{uw} + t_k^2\sigma_w^2 + t_k^2\sigma_z^2 + \sigma_e^2$ |
| Inter-eye correlation at one visit $k$ | $\dfrac{\sigma_u^2}{\sigma_u^2 + \sigma_v^2 + \sigma_e^2}$ | $\dfrac{\sigma_u^2 + 2t_k\sigma_{uw} + t_k^2\sigma_w^2}{\sigma_u^2 + 2t_k\sigma_{uw} + t_k^2\sigma_w^2 + \sigma_v^2 + \sigma_e^2}$ | $\dfrac{\sigma_u^2 + 2t_k\sigma_{uw} + t_k^2\sigma_w^2}{\sigma_u^2 + 2t_k\sigma_{uw} + t_k^2\sigma_w^2 + t_k^2\sigma_z^2 + \sigma_e^2}$ |
| Longitudinal correlation between visits $k_1$ and $k_2$ | $\dfrac{\sigma_u^2 + \sigma_v^2}{\sigma_u^2 + \sigma_v^2 + \sigma_e^2}$ | $\dfrac{\sigma_u^2 + (t_{k_1} + t_{k_2})\sigma_{uw} + t_{k_1}t_{k_2}\sigma_w^2 + \sigma_v^2}{\sqrt{(\sigma_u^2 + 2t_{k_1}\sigma_{uw} + t_{k_1}^2\sigma_w^2 + \sigma_v^2 + \sigma_e^2)(\sigma_u^2 + 2t_{k_2}\sigma_{uw} + t_{k_2}^2\sigma_w^2 + \sigma_v^2 + \sigma_e^2)}}$ | $\dfrac{\sigma_u^2 + (t_{k_1} + t_{k_2})\sigma_{uw} + t_{k_1}t_{k_2}\sigma_w^2 + t_{k_1}t_{k_2}\sigma_z^2}{\sqrt{(\sigma_u^2 + 2t_{k_1}\sigma_{uw} + t_{k_1}^2\sigma_w^2 + t_{k_1}^2\sigma_z^2 + \sigma_e^2)(\sigma_u^2 + 2t_{k_2}\sigma_{uw} + t_{k_2}^2\sigma_w^2 + t_{k_2}^2\sigma_z^2 + \sigma_e^2)}}$ |
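
The expressions in the first two columns of Table 3 can be evaluated numerically to see how a random slope changes the implied variance and correlations over time. The sketch below is a plain-Python illustration under assumed variance components (the values 40, 20, 40, 1 and 1 and the function names are hypothetical, not estimates from CAPT or AREDS). Setting the slope variance and the intercept-slope covariance to zero gives the random-intercept column.

```python
import math


def outcome_variance(t, s):
    """Variance of the outcome in an eye at time t (Table 3, second column)."""
    return s["u"] + 2 * t * s["uw"] + t * t * s["w"] + s["v"] + s["e"]


def inter_eye_corr(t, s):
    """Correlation between the two eyes of a subject at the same time t."""
    shared = s["u"] + 2 * t * s["uw"] + t * t * s["w"]  # subject-level part only
    return shared / outcome_variance(t, s)


def longitudinal_corr(t1, t2, s):
    """Correlation between two measurements of the same eye at times t1 and t2."""
    cov = s["u"] + (t1 + t2) * s["uw"] + t1 * t2 * s["w"] + s["v"]
    return cov / math.sqrt(outcome_variance(t1, s) * outcome_variance(t2, s))


# Hypothetical variance components (assumptions for illustration only):
# u = subject intercept, v = eye intercept, e = residual,
# w = subject slope, uw = covariance of subject intercept and slope.
random_intercept = {"u": 40.0, "v": 20.0, "e": 40.0, "w": 0.0, "uw": 0.0}
random_slope = {"u": 40.0, "v": 20.0, "e": 40.0, "w": 1.0, "uw": 1.0}

for label, s in [("random intercept", random_intercept),
                 ("random intercept and slope", random_slope)]:
    for t in (0, 5):
        print(f"{label}, t={t}: variance={outcome_variance(t, s):.1f}, "
              f"inter-eye corr={inter_eye_corr(t, s):.3f}")
    print(f"{label}: corr(t=0, t=5)={longitudinal_corr(0, 5, s):.3f}")
```

With the random intercept alone, the variance and both correlations are constant in time. Adding the subject-level random slope lets all three depend on the visit times.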

where $u_i \sim N(0, \sigma_u^2)$ and $w_i \sim N(0, \sigma_w^2)$ are person-specific random effects for the intercept and slope respectively, $v_{ij} \sim N(0, \sigma_v^2)$ and $z_{ij} \sim N(0, \sigma_z^2)$ are eye-specific random effects for the intercept and slope respectively, and $e_{ijk} \sim N(0, \sigma_e^2)$ is the residual effect controlling for the person and eye-specific fixed and random effects. The person and eye-specific random effects are independent of each other, but the random effects for the slope and intercept for a subject ($u_i$, $w_i$) may be correlated and the random effects for the slope and intercept of an eye ($v_{ij}$, $z_{ij}$) may also be correlated. The covariance matrix of the random effects ($u_i$, $w_i$, $v_{i0}$, $v_{i1}$, $z_{i0}$, $z_{i1}$) is denoted G. In many examples, the estimated G matrix for the model in Equation 6 may not be positive definite. In this case, a reduced model that drops either the two random eye-specific intercepts ($v_{i0}$, $v_{i1}$) or the two random eye-specific slopes ($z_{i0}$, $z_{i1}$) can often be fitted.

With $x_{ij} = 1$ for the active treatment group and $x_{ij} = 0$ for the control group,
$\alpha$, $\beta$ are the average intercept and slope, respectively, in the control group,
$(\alpha + \gamma)$ and $(\beta + \delta)$ are the average intercept and slope, respectively, in the treatment group,
$\alpha + u_i + v_{ij}$ is the intercept for the $j$th eye of the $i$th subject in the control group,
$\beta + w_i + z_{ij}$ is the slope for the $j$th eye of the $i$th subject in the control group,
$\alpha + \gamma + u_i + v_{ij}$ is the intercept for the $j$th eye of the $i$th subject in the treatment group,
$\beta + \delta + w_i + z_{ij}$ is the slope for the $j$th eye of the $i$th subject in the treatment group.

The model has 11 parameters to estimate: ($\alpha, \beta, \gamma, \delta$) for the mean, ($\sigma_u^2, \sigma_v^2, \sigma_w^2, \sigma_z^2, \sigma_e^2$) for the variance, and $\rho_{uw}$ and $\rho_{vz}$ for correlations between random effects.

Let $c_k = (1, 1, t_k, t_k)$ be a 1 × 4 vector and $\Sigma_{uvwz}$ be the variance-covariance matrix of $(u, v, w, z)$; then $\mathrm{var}(y_{ijk}) = c_k \Sigma_{uvwz} c_k' + \sigma_e^2$ depends on $t_k$. Thus, the variance of $y_{ijk}$ may increase or decrease over time, but is assumed to be the same for left and right eyes.

Similarly, $\mathrm{cov}(y_{ijk_1}, y_{ijk_2}) = c_{k_1} \Sigma_{uvwz} c_{k_2}'$ also depends on time, thus the longitudinal correlation may depend on $t_{k_1}$ and $t_{k_2}$.

In the mixed effects model with random intercept and random slope, the variance of $y_{ijk}$ is allowed to vary over time, but the variances for the left and right eye are assumed to be the same. Similarly, the longitudinal correlations are allowed to be different depending on the time points, and both the inter-eye correlation and cross correlation are also allowed to vary over time. So the mixed effects model with random intercept and random slope offers more flexibility than both the fixed effects model and the mixed effects model with random intercept.

In both the mixed effects model and fixed effects model, the goodness of fit can be assessed using −2 times the log-likelihood (−2lnL) or Akaike’s Information Criterion (AIC). The −2lnL and AIC can be used to identify the most appropriate covariance structure. Smaller −2lnL and AIC values indicate a better fitting model.

Population-average model

The population-average model (or marginal model) using the GEE approach provides an estimate of changes in the population mean corresponding to changes in covariates. Although GEE was initially developed to analyze correlated data from longitudinal repeated measures,4 it also applies to the analysis of correlated eye data.5 Different from the fixed effects model or mixed effects model, the GEE approach does not require distributional assumptions because estimation of the population-average model depends only on correctly specifying the linear function relating the mean outcome to the covariates. GEE models only estimate effects for an average person (or eye) with specific covariate values. As GEE is a marginal model approach, no random effects are introduced. Instead, empirical methods such as the robust sandwich estimator for the variance of the estimated regression coefficients are employed to adjust the SEs of the regression estimates for the correlated outcome data. The marginal model takes account of all correlations by estimating the covariance among the residuals from a single subject, assuming the residuals from a subject are correlated, while the standard linear regression model assumes the residuals are independent with a constant variance. The marginal model is usually executed in SAS using PROC GENMOD. The REPEATED statement in this procedure allows the specification of various correlation structures including unstructured (UN), compound symmetry (CS) and working independence (IND).

For longitudinal data measured from two eyes of each subject at 3 time points, the correlation structures (6 × 6) of UN, CS and IND are as follows:

$$\mathrm{UN} = \begin{pmatrix}
\sigma_1^2 & \rho_1 & \rho_2 & \rho_3 & \rho_4 & \rho_5\\
\rho_1 & \sigma_2^2 & \rho_6 & \rho_7 & \rho_8 & \rho_9\\
\rho_2 & \rho_6 & \sigma_3^2 & \rho_{10} & \rho_{11} & \rho_{12}\\
\rho_3 & \rho_7 & \rho_{10} & \sigma_4^2 & \rho_{13} & \rho_{14}\\
\rho_4 & \rho_8 & \rho_{11} & \rho_{13} & \sigma_5^2 & \rho_{15}\\
\rho_5 & \rho_9 & \rho_{12} & \rho_{14} & \rho_{15} & \sigma_6^2
\end{pmatrix}$$

$$\mathrm{CS} = \begin{pmatrix}
\sigma^2 & \rho & \rho & \rho & \rho & \rho\\
\rho & \sigma^2 & \rho & \rho & \rho & \rho\\
\rho & \rho & \sigma^2 & \rho & \rho & \rho\\
\rho & \rho & \rho & \sigma^2 & \rho & \rho\\
\rho & \rho & \rho & \rho & \sigma^2 & \rho\\
\rho & \rho & \rho & \rho & \rho & \sigma^2
\end{pmatrix}$$

$$\mathrm{IND} = \begin{pmatrix}
\sigma^2 & 0 & 0 & 0 & 0 & 0\\
0 & \sigma^2 & 0 & 0 & 0 & 0\\
0 & 0 & \sigma^2 & 0 & 0 & 0\\
0 & 0 & 0 & \sigma^2 & 0 & 0\\
0 & 0 & 0 & 0 & \sigma^2 & 0\\
0 & 0 & 0 & 0 & 0 & \sigma^2
\end{pmatrix}$$

The number of parameters to estimate is 21 for UN (6 variances and 15 correlations), 2 for CS, and 1 for working independence.

Although the “working independence” correlation structure appears to ignore the inter-eye correlation and longitudinal correlation by specifying a correlation of 0, the GEE approach uses a robust variance estimator that provides asymptotically unbiased estimates for the regression coefficients, which are the same as those from the standard linear regression model, but their standard errors are adjusted for the correlated data. With the compound symmetry correlation structure and unstructured correlation structure, both the estimated regression coefficients and standard errors may differ from the standard linear regression model. When there are several time points with repeated measures, using an unstructured correlation structure requires estimating many covariance parameters, substantially increases the computation time, and possibly leads to computational convergence problems.

In GEE, when there is little knowledge available to choose among correlation structures, model goodness-of-fit statistics can be used to find an acceptable working correlation structure, including the Quasi-likelihood (Q) under the Independence model Criterion (QIC) and the related QICu.7 QIC is analogous to the AIC statistic used for comparing model fits with likelihood-based methods. Since GEE is not a likelihood-based method, the AIC statistic is not available in GEE. QIC can be used to compare GEE models with the same covariates under various correlation structures, thus guiding the selection of a proper correlation structure. The model with the smaller QIC statistic is preferred.8 QICu can be used to compare various models with different covariates but with the same correlation structure.

We demonstrate the application of the fixed effects model, mixed effects model, and marginal models to analyze longitudinal correlated eye data from two clinical trials as described below. The institutional review board associated with each clinical center approved the study protocol, informed consent was obtained from each patient, and each study adhered to the tenets of the Declaration of Helsinki.

In the analyses of clinical trial data using fixed effects models and mixed effects models, we used the DDFM = KR option to calculate the degrees of freedom for testing fixed effects as detailed by Kenward and Roger (1997).9 A simulation study by Schaalje et al. found that the KR method works reasonably well with various covariance structures when sample sizes are moderate to small and the design is reasonably balanced, while the other methods for DDFM did not work as well as the KR method in some settings.10 All statistical analyses were performed in SAS 9.4 (SAS Institute Inc., Cary, NC), and the SAS codes are included in Appendices 1 and 2. Similar codes in STATA (Appendix 3) and R (Appendix 4) are also provided.

Example 1: Analysis of visual acuity data from the Complications of Age-Related Macular Degeneration Prevention Trial (CAPT)

The CAPT was a multi-center randomized clinical trial to evaluate whether low-intensity laser treatment of eyes with drusen prevents vision loss from AMD.1 The study enrolled 1,052 participants (2,104 eyes) with age at least 50 years, at least 10 large drusen (retinal deposits) in each eye, and visual acuity (VA) at least 20/40 in each eye. The study randomly assigned one eye to laser treatment, and the fellow eye of a participant to observation (without any intervention). The visual acuity of each eye was measured using modified ETDRS charts at baseline, 6 months, and annually for at least 5 years. The primary outcome was the visual acuity score calculated as the total number of letters read correctly from the ETDRS charts. For this example, we analyzed visual acuity measured at baseline and at months 12, 24, 36, 48 and 60. We evaluated the treatment effect and the effect of smoking on visual acuity using the fixed effects model, random effects models and GEE. In these models, time (in months) was fitted as a categorical variable with 6 levels (i.e., 0, 12, 24, 36, 48 and 60 months) because the visual acuity change over time did not appear to be linear. For the mixed effects model with random intercept and random slope,

including both random intercept and slope in any of the random statements for subject or eye within subject caused a non-positive-definite G-matrix (matrix of random effects). Thus, we fitted the mixed effects model using a random visit effect for both subject and eye within the subject, but not a random intercept.

Example 2: Analysis of data from the Age-Related Eye Disease Study (AREDS)

The AREDS AMD trial was a multi-center randomized clinical trial to evaluate the effect of high-dose vitamin C and E, beta carotene and zinc supplements on AMD progression and visual acuity.

The study enrolled 1063 participants who had extensive small drusen, retinal pigment abnormalities, or at least 1 intermediate size druse (AMD category 2); 1463 participants who had extensive intermediate drusen, non-central geographic atrophy, or at least one large druse (AMD category 3); and 956 participants who had advanced AMD or visual acuity less than 20/32 due to AMD in one eye (AMD category 4).2 All participants were randomly assigned to receive daily oral tablets containing: (1) antioxidants (vitamin C, 500 mg; vitamin E, 400 IU, and beta carotene, 15 mg); (2) zinc, 80 mg, as zinc oxide and copper, 2 mg, as cupric oxide; (3) antioxidants plus zinc; or (4) placebo. The participants were followed for outcome assessment every 6 months for at least 5 years. For the purpose of demonstration, we restricted the analyses to the 1463 participants (2334 eyes, 60% bilateral) with AMD category 3 at baseline.

We analyzed the AREDS data to evaluate the effects of treatment, age, hypertension status and smoking on the rate of change in visual acuity during follow-up using a fixed effects model, random effects models and GEE. In these models, time (in months) was fitted as a continuous variable. We initially fitted a mixed effects model with random intercept and random slope for both subject and eye within a subject. However, there was little variation in visual acuity attributed to the random intercept for eye within a subject, leading to a non-positive-definite G-matrix. Thus, we fitted the mixed effects model using a random intercept and random slope for subject, and only a random slope for eye within a subject.

Results

Effect of treatment and smoking on visual acuity in CAPT

Among 1052 participants enrolled into the CAPT, 917 (87%) participants completed the 5-year follow-up. The inter-eye correlations assessed using Pearson correlation coefficients ranged from 0.33 to 0.53 and were highest at baseline. The longitudinal correlation coefficients for pairs of visits ranged from 0.31 to 0.88, diminished with a longer time between repeated measures, and tended to be higher between pairs of visits at later time points (Table 4). The longitudinal correlation coefficients were similar in treated eyes and fellow eyes (Table 4).

Over 5 years of follow-up, the mean visual acuity decreased over time, while the variation (e.g., standard deviation [SD]) of visual acuity increased over time, with mean (SD) of visual acuity 82 (6) letters at baseline and 73 (18) letters at year 5 in both treated eyes and observation eyes (Table 4).

The multivariable analysis results for treatment effect and other covariates from the naïve model (ignoring both inter-eye and longitudinal correlations), fixed effects model, mixed effects models (random intercept or random visit), and GEE models (using compound symmetry, or working independence) are shown in Table 5. Overall, the mean visual acuity decreased over time with about a 14-letter decline from baseline to 5 years (p < .0001, Table 5). There was a non-monotone effect of treatment over time with a small but significant effect of treatment on VA (estimated as mean difference in VA between treated eyes and control eyes) for all five models at 24 months (fixed effects model: mean (SE) = 0.98 ± 0.43 letters, p = .02; mixed effects model with random visit: mean (SE) = 0.98 ± 0.40 letters, p = .02; mixed effects model with random intercept: mean (SE) = 1.03 ± 0.47 letters, p = .03; GEE model using compound symmetry or working independence: mean (SE) = 1.05 ± 0.40 letters, p = .009). At 36 months, the estimated treatment effect was similar (mean difference approximately 1 letter, all p < .05, Table 5), but the treatment effect on VA at 60 months was minimal (mean VA difference ≤0.25 letters) and not statistically significant (all p ≥ 0.67). Unexpectedly, in four models (fixed effects model, mixed effects model with random visit, and GEE model using compound symmetry or working independence), current cigarette smokers had better VA than former or non-smokers at baseline (differed by 1.5 letters, p < .05) and at 12 months (differed by 2.1 letters, p < .05), but the difference was not significant by 60 months. The mixed effects model using a random intercept only and the naïve model provided somewhat different results for the smoking effect when compared to the other models: there was no statistically significant difference at baseline or at 12 months, mainly due to the much larger estimate of standard error than in the other models, but there was a statistically significant difference at 60 months (p < .05, Table 5). The goodness of

Table 4. Visual acuity in the treated eye and control eye over time and their cross-sectional and longitudinal correlations (Pearson
Correlation) in CAPT participants (N = 1052).
Time points (months)
0 12 24 36 48 60
# of Patients N 1052 1035 1008 970 941 917
Control eye Mean (SD) VA score in letters 82.1 (6.1) 80.7 (8.9) 79.0 (11.4) 76.8 (13.6) 75.3 (15.6) 73.1 (17.7)
Treated eye Mean (SD) VA score in letters 82.2 (6.2) 81.0 (9.0) 80.1 (10.7) 77.9 (13.1) 75.9 (15.1) 72.9 (17.7)
Pearson Correlation Coefficient
Control eye 0 1.00 0.58 0.49 0.41 0.37 0.32
12 1.00 0.75 0.65 0.57 0.51
24 1.00 0.79 0.70 0.64
36 1.00 0.85 0.77
48 1.00 0.88
60 1.00
Treated eye 0 1.00 0.59 0.45 0.41 0.36 0.31
12 1.00 0.70 0.63 0.55 0.46
24 1.00 0.81 0.69 0.60
36 1.00 0.84 0.73
48 1.00 0.86
60 1.00
Combined 0 1.00 0.58 0.47 0.41 0.36 0.32
12 1.00 0.73 0.64 0.56 0.48
24 1.00 0.80 0.70 0.62
36 1.00 0.84 0.75
48 1.00 0.87
60 1.00
Inter-eye correlation Pearson ρ 0.53 0.36 0.33 0.36 0.36 0.34
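The pattern in the "Combined" block of Table 4 can be summarized by averaging the correlations that share the same gap between visits. The following is a minimal sketch in Python, not part of the original analysis (which was done in SAS); the function name `mean_correlation_by_lag` is hypothetical, and the numbers are copied from the table. A compound symmetry structure would imply one common correlation for every pair of visits, so roughly equal averages across lags would be expected under that assumption.

```python
from collections import defaultdict

# Visit times (months) and the upper triangle of the "Combined" longitudinal
# correlation matrix, copied from Table 4 (each row starts at the diagonal).
visits = [0, 12, 24, 36, 48, 60]
upper = [
    [1.00, 0.58, 0.47, 0.41, 0.36, 0.32],
    [1.00, 0.73, 0.64, 0.56, 0.48],
    [1.00, 0.80, 0.70, 0.62],
    [1.00, 0.84, 0.75],
    [1.00, 0.87],
    [1.00],
]


def mean_correlation_by_lag(visits, upper):
    """Average the off-diagonal correlations grouped by months between visits."""
    by_lag = defaultdict(list)
    for i, row in enumerate(upper):
        for offset, r in enumerate(row):
            if offset == 0:
                continue  # skip the diagonal (correlation of a visit with itself)
            lag = visits[i + offset] - visits[i]
            by_lag[lag].append(r)
    return {lag: sum(v) / len(v) for lag, v in sorted(by_lag.items())}


lag_means = mean_correlation_by_lag(visits, upper)
for lag, mean_r in lag_means.items():
    print(f"lag {lag:2d} months: mean correlation {mean_r:.2f}")
```

In this sketch the averages fall steadily as the gap between visits widens, which is the kind of departure from a single common correlation that motivates the more flexible covariance structures.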

model fit assessed using −2lnL and AIC (the smaller the better) for the fixed effects model and mixed effects models and using QIC or QICu for GEE models are shown in Table 5. The model fit was significantly better with the mixed effects model using random visit vs. the fixed effects model: Δ(−2lnL) = 209.6, chi-square 19 df, p < .001. The mixed effects model using random intercept and the naïve model fit the data poorly as indicated by high values of −2lnL and AIC (Table 5). The GEE models using compound symmetry and working independence provided similar goodness of fit values.

Effect of treatment and factors associated with visual acuity outcome in AREDS

Among 1463 participants (871 bilateral) with AMD category 3 at baseline in their eligible eyes (2334 eyes) for this analysis, 1217 (83%) participants (1932 eyes) completed 5 years of follow-up. The inter-eye correlations assessed among bilateral cases using Pearson correlation coefficients ranged from 0.26 to 0.48 and were highest at baseline. The longitudinal correlation coefficients ranged from 0.26 to 0.90, diminished with longer time between repeated measures, and tended to be higher between pairs of visits at later time points (Table 6). The longitudinal correlation coefficients were similar in left eyes and right eyes (Table 6).
Over the 5-year follow-up, the mean visual acuity decreased over time while the variation (e.g., SD) of visual acuity increased over time, with mean (SD) visual acuity 84 (6) letters at baseline for both left eyes and right eyes, 77 (15) letters at 5 years for left eyes and 76 (17) letters for right eyes (Table 6).
The multivariable analysis results from the naïve model, fixed effects model, mixed effects models (random intercept with or without random slope) and GEE models (using compound symmetry or working independence) are shown in Table 7. In these models that considered time as a continuous variable, the mean VA decreased over time with a mean annual decline of approximately 1.5 letters (p < .0001, Table 7). The baseline VA means were similar across the four treatment groups (p ≥ 0.15) and there was no significant treatment effect on VA in any model (p ≥ 0.55 for test of interaction between time and treatment).
Older age was associated with worse baseline VA (0.34 letters worse for every year difference in age, p < .0001, Table 7), and was also associated with more decline in VA during follow-up (an additional 0.08 letters of decline per year of follow-up for every year increase in age, p < .0001). Participants with hypertension tended to have worse VA at baseline (mean difference of 0.4 to 0.6 letters, with p-value 0.04 to 0.21 depending on models, Table 7). However, hypertension had no effect on the VA change over time (p ≥ 0.61). Current smokers at baseline had worse baseline VA than non-current smokers with the mean difference ranging from 1.5 letters (GEE model with compound symmetry) to 2.0 letters (naïve model and GEE model with working independence) and p-values ranging from 0.04 (mixed effects model with random intercept) to <0.0001 (fixed effects model, Table 7). Current smokers also had a greater VA decline over time than non-current
Table 5. Comparison of results for the effects of treatment and cigarette smoking on change in visual acuity (letters) over time in the CAPT study (n = 1052 subjects, 11846 eye visits).
Model columns, in order (each reporting Beta (SE) and P-value): (1) Naïve model (ignoring inter-eye and longitudinal correlations); (2) Fixed effects┼; (3) Mixed effects§ (Random visit)∆; (4) Mixed effects§ (Random intercept); (5) GEE£ (Compound symmetry); (6) GEE£ (Working independence).
Effect, Time (months), followed by Beta (SE) and P-value for each of models (1) to (6)
Visit <0.0001 <0.0001 <0.0001 <0.0001 <0.0001 <0.0001
0 (ref) (ref) (ref) (ref) (ref) (ref)
12 −0.78 (0.65) 0.65 −0.88 (0.80) 0.27 −0.88 (0.77) 0.25 −0.86 (1.12) 0.44 −0.86 (0.61) 0.16 −0.78 (0.62) 0.21
24 −4.27 (1.73) 0.01 −4.63 (1.08) <0.0001 −4.63 (1.08) <0.0001 −4.64 (1.14) <0.0001 −4.62 (1.06) <0.0001 −4.27 (1.11) 0.0001
36 −6.42 (1.73) 0.0002 −6.79 (1.34) <0.0001 −6.78 (1.37) <0.0001 −6.69 (1.14) <0.0001 −6.68 (1.36) <0.0001 −6.42 (1.39) <0.0001
48 −7.89 (1.76) <0.0001 −8.50 (1.60) <0.0001 −8.51 (1.65) <0.0001 −8.23 (1.16) <0.0001 −8.19 (1.49) <0.0001 −7.89 (1.55) <0.0001
60 −13.2 (1.77) <0.0001 −13.5 (1.89) <0.0001 −13.6 (1.94) <0.0001 −13.5 (1.17) <0.0001 −13.5 (1.99) <0.0001 −13.2 (2.06) <0.0001

Laser vs. no treatment 0.19 0.01 0.03 0.06 0.03 0.03


0 0.03 (0.55) 0.96 0.04 (0.23) 0.86 0.03 (0.18) 0.89 0.03 (0.46) 0.96 0.03 (0.18) 0.89 0.03 (0.18) 0.89
12 0.30 (0.55) 0.59 0.28 (0.34) 0.41 0.28 (0.31) 0.38 0.29 (0.46) 0.54 0.30 (0.31) 0.34 0.30 (0.31) 0.34
24 1.05 (0.56) 0.06 0.98 (0.43) 0.02 0.98 (0.40) 0.02 1.03 (0.47) 0.03 1.05 (0.40) 0.009 1.05 (0.40) 0.009
36 1.10 (0.57) 0.054 1.02 (0.52) 0.049 1.04 (0.48) 0.03 1.06 (0.48) 0.03 1.10 (0.49) 0.02 1.10 (0.49) 0.02
48 0.59 (0.58) 0.31 0.72 (0.60) 0.23 0.73 (0.56) 0.19 0.61 (0.48) 0.21 0.59 (0.56) 0.30 0.59 (0.56) 0.30
60 −0.19 (0.58) 0.75 −0.25 (0.70) 0.72 −0.25 (0.66) 0.71 −0.21 (0.49) 0.67 −0.19 (0.67) 0.78 −0.19 (0.67) 0.78

Current smoking vs. former/no smoking 0.14 0.003 0.003 0.002 0.01 0.01
0 1.49 (1.20) 0.21 1.48 (0.63) 0.02 1.49 (0.73) 0.04 1.49 (1.34) 0.27 1.49 (0.65) 0.02 1.49 (0.65) 0.02
12 2.17 (1.23) 0.08 2.11 (0.96) 0.03 2.11 (1.02) 0.04 2.13 (1.37) 0.12 2.13 (0.81) 0.009 2.17 (0.82) 0.008
24 0.26 (1.26) 0.84 −0.01 (1.21) 0.99 0.02 (1.27) 0.99 −0.01 (1.38) 0.99 0.01 (1.25) 1.00 0.26 (1.27) 0.84
36 0.32 (1.25) 0.80 0.19 (1.47) 0.90 0.25 (1.56) 0.87 0.23 (1.38) 0.87 0.24 (1.54) 0.88 0.32 (1.55) 0.84
48 0.30 (1.20) 0.81 0.29 (1.72) 0.86 0.33 (1.83) 0.86 0.34 (1.40) 0.81 0.35 (1.65) 0.83 0.30 (1.69) 0.86
60 −2.90 (1.31) 0.03 −2.61 (2.00) 0.19 −2.64 (2.12) 0.21 −2.81 (1.41) 0.046 −2.81 (2.11) 0.18 −2.90 (2.17) 0.18

−2lnL 93,445 81,293 81,083 87520


AIC 93,447 81,339 81,191 87526
QIC 11867 11868
QICu 11864 11864
Covariance parameters 1 24 43 3 2 0

┼ Fixed effects model using SAS PROC MIXED with the REPEATED option with a UN@UN correlation structure.
§ Mixed effects model using SAS PROC MIXED with the RANDOM option with random effects for subject and eye within subject for both the intercept and the Visit parameter.
£ Generalized estimating equation (GEE) model.
∆ Time was modelled as categorical with 6 levels (i.e., months 0, 12, 24, 36, 48, 60) due to the non-linear decline in visual acuity.
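The likelihood-based comparison reported for Table 5 can be checked by hand from the −2lnL values and the numbers of covariance parameters. The sketch below is illustrative Python rather than the authors' SAS code; the helper names `chi2_sf` and `likelihood_ratio_test` are hypothetical, and it assumes, as the text does, that the difference in −2lnL is referred to a chi-square distribution with degrees of freedom equal to the difference in covariance parameters.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df.

    Builds the regularized upper incomplete gamma function Q(df/2, x/2) from
    Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1), using only stdlib.
    """
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        total, a = math.exp(-y), 1.0                  # Q(1, y) = exp(-y)
    else:
        total, a = math.erfc(math.sqrt(y)), 0.5       # Q(1/2, y) = erfc(sqrt(y))
    while a < df / 2.0 - 1e-9:
        # Add y**a * exp(-y) / Gamma(a + 1), computed on the log scale.
        total += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return min(total, 1.0)


def likelihood_ratio_test(neg2lnl_simple, neg2lnl_complex, k_simple, k_complex):
    """Return (delta, df, p) for comparing two models fitted to the same data."""
    delta = neg2lnl_simple - neg2lnl_complex
    df = k_complex - k_simple
    return delta, df, chi2_sf(delta, df)


# Table 5: fixed effects (−2lnL = 81,293; 24 covariance parameters) vs.
# mixed effects with random visit (−2lnL = 81,083; 43 covariance parameters).
delta, df, p = likelihood_ratio_test(81293, 81083, 24, 43)
print(delta, df, p < 0.001)
```

With the rounded table entries the difference is 210 on 19 df, in line with the reported Δ(−2lnL) = 209.6, chi-square 19 df, p < .001.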

Table 6. Visual acuity in left eye and right eye over time and their cross-sectional and longitudinal correlations (Pearson Correlation)
among eyes with AMD category 3 at baseline in the AREDS participants (N = 1463 participants, 871 bilateral, and 592 unilateral).
Time points (Months)
0 12 24 36 48 60
Left eye Mean (SD) VA score 83.6 (5.8) 82.8 (7.2) 81.4 (9.2) 80.0 (11.2) 78.6 (12.7) 76.7 (15.0)
Right eye Mean (SD) VA score 84.0 (5.7) 82.7 (7.7) 81.5 (10.1) 79.7 (12.6) 78.1 (14.7) 76.3 (16.5)
Pearson Correlation Coefficient
Left eye 0 (N = 1156) 1.00 0.42 0.33 0.30 0.29 0.26
12 (N = 1104) 1.00 0.74 0.63 0.56 0.51
24 (N = 1056) 1.00 0.81 0.72 0.63
36 (N = 1031) 1.00 0.79 0.71
48 (N = 996) 1.00 0.85
60 (N = 953) 1.00
Right eye 0 (N = 1178) 1.00 0.39 0.34 0.29 0.26 0.26
12 (N = 1122) 1.00 0.73 0.58 0.56 0.52
24 (N = 1086) 1.00 0.80 0.75 0.70
36 (N = 1061) 1.00 0.85 0.80
48 (N = 1024) 1.00 0.90
60 (N = 979) 1.00
Combined 0 (N = 2334) 1.00 0.40 0.33 0.29 0.27 0.26
12 (N = 2226) 1.00 0.73 0.60 0.56 0.52
24 (N = 2142) 1.00 0.81 0.74 0.67
36 (N = 2092) 1.00 0.82 0.76
48 (N = 2020) 1.00 0.88
60 (N = 1932) 1.00
Inter-eye correlation Pearson ρ 0.48 0.34 0.26 0.29 0.33 0.35
N of bilateral subjects 871 832 797 777 746 715

smokers (annual decline difference of approximately 0.5 letters from all models), and the difference was significant in the naïve model (p = .02), fixed effects model (p = .04), mixed effects model with random intercept (p = .0003), but only marginally significant in the mixed effects model with random intercept and slope (p = .08) and GEE models using compound symmetry (p = .054) and working independence (p = .07).
The model fit was significantly better with the fixed effects model vs. the mixed effects model with random intercept and random slope: Δ(−2lnL) = 798, chi-square 19 df, p < .0001, although coefficient estimates and SEs were similar in these two models. The mixed effects model using random intercept and the naïve model that ignores both inter-eye correlation and repeated measure correlation fit the data poorly as indicated by high −2lnL and AIC values (Table 7). The GEE model using compound symmetry fit the data slightly better than the working independence model (ΔQIC = 11, 2 df, p < 0.001), although coefficients and SEs were similar.

Effect of sample size on the difference from various models for analysis of a random sample of AREDS participants

To evaluate the various models with smaller sample sizes, we analyzed the data from a random sample of 200 AREDS participants (114 bilateral) with AMD category 3 at baseline in their eligible eyes (314 eyes). The mean VA (SD), the magnitude and pattern of inter-eye correlation coefficient (0.20 to 0.49) among bilateral cases and the longitudinal correlation coefficients (0.20 to 0.91) in visual acuity were similar to those from the whole AREDS sample (Table 8).
The multivariable analysis results from the naïve model, fixed effects model, mixed effects models (random intercept with and without random slope) and GEE models (using compound symmetry or working independence) are shown in Table 9. Similar to the analysis using the whole AREDS sample, there was no significant treatment effect (p ≥ 0.12), but there was a significant age effect (p ≤ 0.03) on VA change over time. However, in this small sample analysis, the smoking effect and hypertension effect on VA varied substantially across models. The smoking effect was statistically significant in the mixed effects model using random intercept (current smokers had 1.34 letters more decline annually than non-current smokers, p = .004), but was not significant in all other models, mainly due to the much larger SE from the other models (Table 9). The hypertension effect was significant (hypertensive patients had 0.65 letters less decline annually than non-hypertensive patients, p = .04) in the naïve model, and marginally significant in the mixed effects model using random intercept (slope difference of 0.44 letters, p = .06) and GEE model using working independence (slope difference of 0.65 letters, p = .08), but was not significant in the fixed effects model (slope difference of 0.27 letters, p = .51) and the mixed effects model using random intercept and random slope (slope difference of 0.21 letters, p = .63). The goodness of model fit showed that the
Table 7. Comparison of results from various models for evaluating the factors associated with visual acuity among eyes with AMD category 3 at baseline in the AREDS study (N = 1463 subjects, 2334 eyes, 13277 eye visits).
Model columns, in order (each reporting Beta (SE) and P-value): (1) Naïve model (ignoring inter-eye and longitudinal correlations); (2) Fixed effects┼; (3) Mixed effects§ (Random intercept and slope)∆; (4) Mixed effects§ (Random intercept); (5) GEE£ (Compound symmetry); (6) GEE£ (Working independence).
Effect, followed by Beta (SE) and P-value for each of models (1) to (6)
Intercept 85.0 (0.35) <0.0001 84.7 (0.23) <0.0001 85.0 (0.29) <0.0001 85.2 (0.46) <0.0001 85.4 (0.29) <0.0001 85.0 (0.30) <0.0001
Treatment Group 0.46 0.21 0.20 0.60 0.15 0.30
Placebo (ref) (ref) (ref) (ref) (ref) (ref)
Antioxidants only 0.02 (0.46) 0.96 0.04 (0.30) 0.90 0.30 (0.38) 0.42 0.17 (0.60) 0.77 0.26 (0.39) 0.51 0.02 (0.40) 0.96
Zinc only −0.12 (0.46) 0.80 −0.17 (0.30) 0.58 −0.10 (0.38) 0.79 −0.18 (0.60) 0.77 −0.23 (0.39) 0.52 −0.12 (0.39) 0.76
Antioxidants+zinc −0.63 (0.46) 0.17 −0.53 (0.30) 0.08 −0.51 (0.38) 0.18 −0.60 (0.60) 0.31 −0.59 (0.38) 0.13 −0.63 (0.39) 0.11
Time (Year) −1.46 (0.12) <0.0001 −1.39 (0.15) <0.0001 −1.40 (0.15) <0.0001 −1.47 (0.08) <0.0001 −1.46 (0.16) <0.0001 −1.46 (0.16) <0.0001
Age (per year) −0.34 (0.03) <0.0001 −0.34 (0.02) <0.0001 −0.34 (0.03) <0.0001 −0.34 (0.04) <0.0001 −0.34 (0.03) <0.0001 −0.34 (0.03) <0.0001
Hypertension: yes vs. No −0.59 (0.34) 0.08 −0.39 (0.22) 0.07 −0.48 (0.28) 0.08 −0.55 (0.44) 0.21 −0.59 (0.29) 0.04 −0.59 (0.29) 0.04
Current Smoking: Yes vs. no −1.99 (0.62) 0.002 −1.80 (0.40) <0.0001 −1.71 (0.51) 0.0008 −1.67 (0.81) 0.04 −1.50 (0.57) 0.01 −1.99 (0.57) 0.0005

Age*time −0.08 (0.01) <0.0001 −0.08 (0.01) <0.0001 −0.08 (0.01) <0.0001 −0.08 (0.01) <0.0001 −0.08 (0.01) <0.0001 −0.08 (0.02) <0.0001
Current smoking * time −0.52 (0.22) 0.02 −0.55 (0.27) 0.04 −0.48 (0.27) 0.08 −0.55 (0.15) 0.0003 −0.55 (0.28) 0.054 −0.52 (0.29) 0.07
Hypertension*time 0.05 (0.11) 0.64 −0.05 (0.14) 0.73 −0.07 (0.15) 0.61 −0.01 (0.08) 0.88 −0.01 (0.15) 0.94 0.05 (0.16) 0.73
Treatment group*time 0.92 0.73 0.75 0.55 0.87 0.96
Placebo (ref) (ref) (ref) (ref) (ref) (ref)
Antioxidants only −0.03 (0.16) 0.87 −0.16 (0.20) 0.41 −0.15 (0.20) 0.45 −0.06 (0.11) 0.57 −0.06 (0.20) 0.76 −0.03 (0.21) 0.91
Zinc only −0.01 (0.16) 0.95 −0.02 (0.19) 0.90 −0.03 (0.20) 0.88 −0.01 (0.11) 0.98 −0.01 (0.22) 0.99 −0.01 (0.23) 0.97
Antioxidants+zinc 0.07 (0.16) 0.63 −0.05 (0.20) 0.80 0.06 (0.20) 0.76 0.09 (0.11) 0.39 0.09 (0.21) 0.65 0.08 (0.21) 0.72

−2lnL 96518 84836 85634 91241


AIC 96520 84882 85646 91247
QIC 12814 12825
QICu 12760 12760
Covariance parameters 1 24 5 3 2 0

┼ Fixed effects model using SAS PROC MIXED with the REPEATED option with a UN@UN correlation structure.
§ Mixed effects model using SAS PROC MIXED with the RANDOM option with random effects for subject and eye within subject.
∆ Using random intercept and slope for the subject level, and random intercept for the eye level.
£ Generalized estimating equation (GEE) model.
Note: Age is normalized using mean age (i.e., age-69).
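To see how the coefficients in Table 7 combine, the fixed effects column can be turned into a predicted mean VA for a given participant profile. This is a rough sketch in Python under stated assumptions: the linear predictor is inferred from the row labels of the table (main effects plus their interactions with time), age is centered at 69 in both the main effect and the interaction as the table note suggests, and the function name `predicted_mean_va` is hypothetical.

```python
# Coefficients copied from the "Fixed effects" column of Table 7.
TREATMENT_MAIN = {
    "placebo": 0.0,
    "antioxidants": 0.04,
    "zinc": -0.17,
    "antioxidants+zinc": -0.53,
}
TREATMENT_BY_TIME = {
    "placebo": 0.0,
    "antioxidants": -0.16,
    "zinc": -0.02,
    "antioxidants+zinc": -0.05,
}


def predicted_mean_va(years, age, hypertension=False, current_smoker=False,
                      treatment="placebo"):
    """Predicted mean VA (letters) after `years` of follow-up.

    Assumed form: baseline level plus a profile-specific annual slope.
    """
    age_c = age - 69  # age centered at 69, per the table note
    baseline = (84.7 + TREATMENT_MAIN[treatment]
                - 0.34 * age_c
                - 0.39 * hypertension
                - 1.80 * current_smoker)
    slope = (-1.39 + TREATMENT_BY_TIME[treatment]
             - 0.08 * age_c
             - 0.55 * current_smoker
             - 0.05 * hypertension)
    return baseline + slope * years


non_smoker_5y = predicted_mean_va(5, age=69)
smoker_5y = predicted_mean_va(5, age=69, current_smoker=True)
print(round(non_smoker_5y, 2), round(smoker_5y, 2))
```

Under these assumptions, a 69-year-old placebo participant without hypertension is predicted to lose about 7 letters over 5 years if a non-smoker and about 11.5 letters if a current smoker, reflecting both the lower baseline and the steeper slope estimated for current smokers.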

Table 8. Visual acuity in left eye and right eye over time and their cross-sectional and longitudinal correlations (Pearson Correlation)
among randomly selected 200 participants in eyes with AMD category 3 at baseline in the AREDS (200 participants, 114 bilateral, 86
unilateral).
Time points (Months)
0 12 24 36 48 60
Left eye Mean (SD) VA score 83.5 (6.2) 82.2 (7.7) 81.1 (10.3) 80.1 (11.0) 79.9 (10.2) 78.0 (12.0)
Right eye Mean (SD) VA score 83.6 (5.6) 82.9 (7.1) 81.9 (10.2) 80.5 (11.5) 77.6 (16.3) 76.2 (18.4)
Left eye Pearson Correlation Coefficient
0 (N = 161) 1.00 0.34 0.20 0.23 0.33 0.29
12 (N = 155) 1.00 0.63 0.59 0.52 0.48
24 (N = 145) 1.00 0.76 0.69 0.56
36 (N = 144) 1.00 0.76 0.64
48 (N = 134) 1.00 0.89
60 (N = 131) 1.00
Right eye
0 (N = 153) 1.00 0.42 0.30 0.30 0.26 0.25
12 (N = 149) 1.00 0.69 0.59 0.50 0.44
24 (N = 137) 1.00 0.86 0.67 0.58
36 (N = 136) 1.00 0.79 0.69
48 (N = 129) 1.00 0.91
60 (N = 126) 1.00
Combined
0 (N = 314) 1.00 0.38 0.25 0.27 0.27 0.26
12 (N = 304) 1.00 0.66 0.59 0.49 0.44
24 (N = 282) 1.00 0.81 0.67 0.57
36 (N = 280) 1.00 0.77 0.67
48 (N = 263) 1.00 0.90
60 (N = 257) 1.00
Inter-eye correlation Pearson ρ 0.49 0.23 0.23 0.20 0.40 0.44
N of bilateral subjects 114 111 101 101 94 92

fixed effects model fits the data better than the mixed effects model using random intercept and random slope (Δ(−2lnL) = 188, chi-square 19 df, p < .0001), while the naïve model and the mixed effects model using random intercept fit the data much worse. The GEE models using compound symmetry and working independence provided similar goodness of fit.

Discussion

In this paper, we introduced three statistical modelling approaches for longitudinal correlated continuous eye data including the fixed effects model, mixed effects model and modelling with generalized estimating equations. We demonstrated these three modelling approaches in SAS by analyzing datasets from two clinical trials with two eyes in different treatment groups (paired design) in one study (CAPT) and two eyes in the same comparison groups (parallel design) in another study (AREDS). We illustrated these modelling approaches with different covariance structures and compared their goodness of fit based on −2lnL and AIC for fixed effects and mixed effects models and QIC and QICu for the GEE model.
In the analysis of longitudinal data using mixed effects models, we need to choose which variables to model as fixed effects, which variables to model as random effects, and the covariance structure to account for the correlations from longitudinal repeated measures. There are a variety of considerations when selecting the covariance structure, including the number of parameters, the interpretation of the structure and the goodness of fit. Most statistical software has the capability of fitting the data with different covariance structures according to the mixed effects model [11]. In choosing the covariance structure for analyzing correlated data, we can use information criteria such as −2lnL, AIC [12], and BIC [13]. Ferron et al. [14] found that on average, using the AIC led to the selection of the correct covariance structure about 79% of the time. The −2lnL and AIC can also be used to compare fixed effects models vs. mixed effects models and compare mixed effects models with vs. without random slope as we demonstrated in our examples. However, there is no direct way to compare the goodness of fit from fixed effects and mixed effects models vs. the GEE models, because the fixed effects and mixed effects models are likelihood-based methods, while the GEE models are not.
The QIC statistic proposed by Pan [7] and further discussed by Hardin and Hilbe [8] is analogous to the familiar AIC statistic used for comparing the fit of models with likelihood-based methods. QIC can be used to compare working correlation structures for a given GEE model.
The goodness of fit statistics showed the fixed effects model provides the best fit (smallest −2lnL and AIC) for AREDS, while the mixed effects model with random visit
Table 9. Comparison of results from various models for evaluating the factors associated with visual acuity among 200 randomly selected participants in eyes with AMD category 3 at baseline in the AREDS study (N = 200 subjects, 314 eyes, 1799 eye visits).
Model columns, in order (each reporting Beta (SE) and P-value): (1) Naïve model (ignoring inter-eye and longitudinal correlations); (2) Fixed effects┼; (3) Mixed effects§ (Random intercept and slope)∆; (4) Mixed effects§ (Random intercept); (5) GEE£ (Compound symmetry); (6) GEE£ (Working independence).
Effect, followed by Beta (SE) and P-value for each of models (1) to (6)
Intercept 83.4 (0.93) <0.0001 83.0 (0.63) <0.0001 83.2 (0.79) <0.0001 83.5 (1.19) <0.0001 83.7 (0.75) <0.0001 83.4 (0.83) <0.0001
Treatment Group 0.44 0.11 0.19 0.65 0.23 0.44
Placebo (ref) (ref) (ref) (ref) (ref) (ref)
Antioxidants only 0.48 (1.24) 0.70 1.10 (0.85) 0.19 1.39 (1.06) 0.19 0.79 (1.60) 0.62 1.03 (1.08) 0.34 0.48 (1.17) 0.68
Zinc only 1.88 (1.23) 0.13 1.87 (0.84) 0.03 2.14 (1.06) 0.04 1.91 (1.59) 0.23 1.92 (0.94) 0.04 1.88 (0.99) 0.06
Antioxidants+zinc 1.28 (1.24) 0.30 1.68 (0.84) 0.048 1.75 (1.06) 0.10 1.46 (1.60) 0.36 1.52 (1.08) 0.16 1.28 (1.14) 0.26
Time (Year) −1.33 (0.32) <0.0001 −1.14 (0.42) 0.007 −1.27 (0.44) 0.004 −1.32 (0.24) <0.0001 −1.31 (0.35) 0.0002 −1.33 (0.37) 0.0004
Age (per year) −0.32 (0.08) 0.0001 −0.35 (0.06) <0.0001 −0.32 (0.07) <0.0001 −0.31 (0.11) 0.004 −0.31 (0.08) 0.0001 −0.32 (0.08) <0.0001
Hypertension: yes vs. No −1.33 (0.93) 0.15 −0.79 (0.63) 0.21 −0.93 (0.80) 0.24 −1.25 (1.20) 0.30 −1.34 (0.85) 0.12 −1.33 (0.89) 0.13
Current Smoking: Yes vs. no 2.11 (1.79) 0.24 1.23 (1.21) 0.31 2.18 (1.52) 0.15 2.78 (2.29) 0.23 2.78 (2.12) 0.19 2.11 (2.17) 0.33

Age*time −0.09 (0.03) 0.002 −0.09 (0.04) 0.01 −0.09 (0.04) 0.03 −0.09 (0.02) <0.0001 −0.09 (0.04) 0.02 −0.09 (0.04) 0.03
Current smoking * time −1.04 (0.63) 0.10 −0.86 (0.79) 0.28 −1.03 (0.82) 0.21 −1.34 (0.47) 0.004 −1.33 (1.13) 0.24 −1.04 (1.21) 0.39
Hypertension*time 0.65 (0.32) 0.04 0.27 (0.41) 0.51 0.21 (0.43) 0.63 0.44 (0.23) 0.06 0.45 (0.35) 0.20 0.65 (0.37) 0.08
Treatment group*time 0.31 0.39 0.55 0.12 0.62 0.59
Placebo (ref) (ref) (ref) (ref) (ref) (ref)
Antioxidants only −0.18 (0.43) 0.68 −0.74 (0.56) 0.19 −0.58 (0.58) 0.32 −0.29 (0.32) 0.37 −0.28 (0.53) 0.59 −0.18 (0.56) 0.75
Zinc only −0.58 (0.42) 0.16 −0.66 (0.54) 0.23 −0.52 (0.57) 0.36 −0.55 (0.31) 0.07 −0.55 (0.56) 0.32 −0.58 (0.58) 0.31
Antioxidants+zinc 0.16 (0.43) 0.71 −0.05 (0.55) 0.93 0.08 (0.58) 0.90 0.11 (0.31) 0.72 0.11 (0.47) 0.81 0.16 (0.49) 0.74

−2lnL 12809 11358 11546 12212


AIC 12811 11403 11559 12218
QIC 1766 1777
QICu 1714 1714
Covariance parameters 1 24 5 3 2 0

┼ Fixed effects model using SAS PROC MIXED with the REPEATED option with a UN@UN correlation structure.
§ Mixed effects model using SAS PROC MIXED with the RANDOM option with random effects for subject and eye within subject.
∆ Using random intercept and slope for the subject level, and random intercept for the eye level.
£ Generalized estimating equation (GEE) model.
Note: Age is normalized using mean age (i.e., age-69).
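The way the standard error drives significance in Table 9 can be illustrated with a simple Wald test computed from Beta and SE. The Python sketch below is an approximation, not the procedure used for the table: it assumes a two-sided test against the standard normal distribution, whereas SAS may use t distributions with model-specific degrees of freedom for mixed models. The function name `wald_p_value` is hypothetical.

```python
import math


def wald_p_value(beta: float, se: float) -> float:
    """Two-sided p-value for z = beta / se under a standard normal reference."""
    z = beta / se
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypertension*time row of Table 9: similar slope differences, different SEs.
p_naive = wald_p_value(0.65, 0.32)   # naïve model
p_gee_wi = wald_p_value(0.65, 0.37)  # GEE, working independence
p_fixed = wald_p_value(0.27, 0.41)   # fixed effects model
print(round(p_naive, 2), round(p_gee_wi, 2), round(p_fixed, 2))
```

For these three entries the normal approximation agrees with the tabulated P-values to two decimals (0.04, 0.08 and 0.51): the naïve model and the working-independence GEE share the same point estimate, and only the larger SE moves the result from significant to marginal.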

provides a better fit than the fixed effects model for CAPT. For both CAPT and AREDS, GEE with a compound symmetry structure provided better fit than GEE with a working independence structure as the QIC from compound symmetry was smaller. We recommend fitting the data using various models and reporting the results from the model with the best fit.
For the examples in this paper, the fixed effects model and mixed effects model using random intercept and random slope provided a better fit than the mixed effects model with random intercept and the naïve model. However, the fixed effects model under the UN@UN covariance structure usually requires longer computation time and sometimes the covariance matrix may be non-positive-definite or the model may not converge due to the large number of parameters that need to be estimated, particularly when the sample size is small. Mixed effects models using random intercept and slope often encounter the problem of a non-positive-definite G matrix due to over-parametrization from modelling time as a categorical variable (e.g., CAPT) or due to insufficient variation of outcome attributed to the random effect (e.g., AREDS). For a G matrix (i.e., covariance matrix of random effects) to be a valid covariance matrix, it must be positive-definite. If it is non-positive-definite, it is recommended to remove the corresponding random effect from the model. In our two examples, after we removed the random intercept from the mixed effects model for the analysis of the CAPT data, and removed the random intercept for eye within subject in the analysis of the AREDS data, the G matrix was positive-definite and the model provided goodness of fit statistics almost identical to those of the mixed effects model with both random intercept and random slope (data not shown). In addition, we used the SAS default REML method for the estimation of parameters for the mixed effects models. However, we found the ML method provided results extremely similar to those from REML (data not shown).
All statistical models have assumptions and if the assumptions are not met, the model can yield biased estimates of regression coefficients (and their SEs) and invalid p-values. For the mixed effects model using random intercept, it is assumed that the variance of the outcome measure is the same at all time points. This assumption is often violated in longitudinal data as in both of our examples where the variance increased over time. The mixed effects model with random intercept also assumes the inter-eye correlation remains constant across all time points and the longitudinal correlations are the same between any pair of two time points. The latter assumption was not met in the CAPT and AREDS datasets. The mixed effects model assumes the same longitudinal correlation in the left eye and right eye; this assumption seems to have been met in both CAPT and AREDS. However, these assumptions may not hold when the eye-specific treatment is effective in the study eye when compared to the untreated fellow eye.
Compared to the mixed effects model with random intercept, the fixed effects model and mixed effects model using random intercept and random slope allow the variance to change over time, and allow different longitudinal correlations between different time points; they thus offer more flexibility. Interestingly, in CAPT, we found the mixed effects model using random visits (modelled as categorical) provided a better model fit than the fixed effects model, while in AREDS, we found the fixed effects model provided better model fit than the mixed effects model with random intercept and slope based on the −2lnL. The mixed effects model with random intercept did not fit the data well, mainly because in both CAPT and AREDS, the variance and the longitudinal correlation were not constant across follow-up time points, thus not meeting the assumptions of this model. Before statistical modelling of longitudinal correlated eye data, checking the inter-eye correlation at each time point and the longitudinal correlation for left eye and right eye separately may provide insights into the selection of the appropriate statistical models.
The various statistical modelling approaches all need to have a sample size sufficient to provide robust estimates of regression coefficients and variance parameters. As different models have different numbers of parameters involved in the covariance structure, the impact of smaller sample size on these models may vary. In our analysis of data from a random sample of 200 participants from the AREDS study, we found the differences across various statistical models became more substantial. For example, the smoking effect on VA change over time was statistically significant or marginally significant in all models (all p ≤ 0.08) in the analysis of the full AREDS dataset, but was significant only in the mixed effects model with random intercept for the analysis of AREDS data from the random sample of 200 subjects.
In conclusion, longitudinal models (fixed effects models, random effects models or GEE) using the eye as the unit of analysis can be implemented using available statistical software (SAS, Stata and R) and offer many advantages for valid and efficient analysis of longitudinal ophthalmologic data with both inter-eye and longitudinal correlations. One issue is that the fixed effects model with a UN@UN, UN@CS or UN@AR correlation is currently only available in SAS, while mixed effects and GEE models are available in other statistical packages (e.g., Stata, R). Different models require different assumptions and may yield different results. Goodness-of-fit statistics may guide the selection of the appropriate model for a specific dataset.

Disclosure statement

The authors have no conflicts of interest to disclose.

Funding

Supported by grants [R01EY022445 and P30 EY01583-26] from the National Eye Institute, National Institutes of Health, Department of Health and Human Services.

References

1. The CAPT Research Group. The complications of age-related macular degeneration prevention trial (CAPT): rationale, design and methodology. Clin Trials. 2004;1(1):91–107. doi:10.1191/1740774504cn007xx.
2. The AREDS Research Group. The age-related eye disease study (AREDS): design implications. AREDS report no. 1. Control Clin Trials. 1999;20(6):573–600. doi:10.1016/S0197-2456(99)00031-8.
3. Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics. 1982;38(4):963–974. doi:10.2307/2529876.
4. Zeger SL, Liang KY. Longitudinal data analysis for discrete and continuous outcomes. Biometrics. 1986;42(1):121–130. doi:10.2307/2531248.
5. Ying GS, Maguire MG, Glynn R, Rosner B. Tutorial on biostatistics: linear regression analysis of continuous correlated eye data. Ophthalmic Epidemiol. 2017;24(2):130–140. doi:10.1080/09286586.2016.1259636.
6. Jennrich RI, Schluchter MD. Unbalanced repeated measures models with structural covariance matrices. Biometrics. 1986;42(4):805–820. doi:10.2307/2530695.
7. Pan W. Akaike’s information criterion in generalized estimating equations. Biometrics. 2001;57(1):120–125. doi:10.1111/j.0006-341X.2001.00120.x.
8. Hardin JW, Hilbe JM. Generalized Estimating Equations. Boca Raton, FL: Chapman & Hall/CRC; 2003.
9. Kenward MG, Roger JH. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics. 1997;53(3):983–997. doi:10.2307/2533558.
10. Schaalje GB, McBride JB, Fellingham GW. Adequacy of approximations to distributions of test statistics in complex mixed linear models. J Agric Biol Environ Stat. 2002;7(4):512–524. doi:10.1198/108571102726.
11. Littell RC, Milliken GA, Stroup WW, Wolfinger RD. SAS System for Mixed Models. 1st ed. Cary: SAS Institute, Incorporated; 1999.
12. Akaike H. A new look at the statistical model identification. IEEE Trans Autom Control. 1974;19(6):716–723. doi:10.1109/TAC.1974.1100705.
13. Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6(2):461–464. doi:10.1214/aos/1176344136.
14. Ferron J, Dailey R, Yi Q. Effects of misspecifying the first-level error structure in two-level models of change. Multivariate Behav Res. 2002;37(3):379–403. doi:10.1207/S15327906MBR3703_4.
