Lecture 1
(Chapter 1)
Introduction
• This course describes statistical methods
for the analysis of longitudinal data, with a
strong emphasis on applications in the
biological and health sciences
– Univariate statistics: each subject gives rise
to a single measurement, termed the response
– Multivariate statistics: each subject gives
rise to a vector of measurements, or different
responses
– Longitudinal data: each subject gives rise to
a vector of measurements, but these represent
the same response measured at a sequence of
observation times
• Repeated responses over time on independent units
(persons)
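To make the data structure concrete, here is a minimal sketch of how longitudinal data are often stored: one row per subject and observation time ("long" format), which can then be grouped into one vector of repeated responses per subject. The values and the helper name `to_subject_vectors` are hypothetical and only illustrate the layout; they do not come from any study in this course.

```python
# Hypothetical toy data: the same response measured at several times per subject.
# Long format: one row per (subject, time) observation.
records = [
    # (subject_id, time, response)
    (1, 0.0, 10.2), (1, 1.0, 11.0), (1, 2.0, 11.9),
    (2, 0.0, 8.7), (2, 1.0, 9.1),
    (3, 0.0, 12.4), (3, 1.0, 12.9), (3, 2.0, 13.8),
]


def to_subject_vectors(rows):
    """Group long-format rows into one (times, responses) pair per subject."""
    by_subject = {}
    for subject_id, time, response in rows:
        times, responses = by_subject.setdefault(subject_id, ([], []))
        times.append(time)
        responses.append(response)
    return by_subject


vectors = to_subject_vectors(records)
for subject_id, (times, responses) in vectors.items():
    print(subject_id, times, responses)
```

Note that subjects need not share the same number of observations or the same observation times; subject 2 here has only two measurements.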
Topics
• Basic issues and exploratory analyses
– Definition and examples of LDA
– Approaches to LDA
– Exploring correlation
• Statistical methods for continuous measurements
– General Linear Model with correlated data
• Weighted Least Squares estimation
• Maximum Likelihood estimation
• Parametric models for covariance structure
• Generalized linear models for continuous/discrete
responses
– Marginal Models
– Log Linear Model and Poisson Model for count responses
– Logistic model for binary responses
– GEE estimation methods
– Estimation techniques
– Random Effects Models (Multi-level models)
– Transition Models
Introduction
• Longitudinal study: people are
measured repeatedly over time
• Cross-sectional study: a single
outcome is measured for each individual
• In LDA, we can investigate:
– changes over time within individuals (ageing
effects)
– differences among people in their baseline
levels (cohort effects)
• LDA requires special statistical methods
because the set of observations on one
subject tends to be inter-correlated
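One simple way to see why observations on the same subject are correlated is a small simulation in which each subject has their own baseline level plus independent measurement noise. This is only a sketch under assumed values (the standard deviations `sigma_between` and `sigma_within` are made up), and a shared baseline is just one possible source of within-subject correlation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_subjects, n_times = 2000, 4
sigma_between = 2.0  # assumed SD of subject-level baselines
sigma_within = 1.0   # assumed SD of measurement noise

# Each subject has one baseline shared by all of that subject's measurements.
baseline = rng.normal(0.0, sigma_between, size=(n_subjects, 1))
noise = rng.normal(0.0, sigma_within, size=(n_subjects, n_times))
y = baseline + noise  # shape: (n_subjects, n_times)

# Empirical correlation between measurement occasions (columns of y).
corr = np.corrcoef(y, rowvar=False)

# Under this model the correlation between any two occasions is
# sigma_between^2 / (sigma_between^2 + sigma_within^2).
theoretical = sigma_between**2 / (sigma_between**2 + sigma_within**2)

print(np.round(corr, 2))
print("theoretical within-subject correlation:", theoretical)
```

The off-diagonal entries of `corr` should sit close to the theoretical value, while measurements from different subjects are independent by construction.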
• Reading ability appears to be poorer
among older children.
• Let’s assume that the reading ability of
each child has been measured twice.
• LDA can distinguish between changes over
time within individuals and differences among
people in their baseline levels
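The reading-ability example can be sketched with simulated data. All numbers below are assumptions chosen for illustration: older cohorts start at lower levels (`cohort_slope` is negative) while every child improves between the two visits (`within_slope` is positive). A cross-sectional fit on the first visit alone picks up the cohort difference, whereas the within-child change recovers the ageing effect.

```python
import numpy as np

rng = np.random.default_rng(1)

n_children = 500
age1 = rng.uniform(6.0, 12.0, size=n_children)  # age at first visit
age2 = age1 + 1.0                               # second visit one year later

cohort_slope = -1.0  # assumed: older cohorts have lower baseline levels
within_slope = 0.5   # assumed: each child improves by 0.5 per year

# Child-specific level depends on the cohort (age at first visit).
child_level = 20.0 + cohort_slope * age1 + rng.normal(0.0, 1.0, n_children)
read1 = child_level + rng.normal(0.0, 0.3, n_children)
read2 = child_level + within_slope * (age2 - age1) + rng.normal(0.0, 0.3, n_children)

# Cross-sectional estimate: slope of first-visit score on first-visit age.
beta_cross = np.polyfit(age1, read1, deg=1)[0]

# Longitudinal estimate: average within-child change per year of age.
beta_long = np.mean((read2 - read1) / (age2 - age1))

print("cross-sectional slope:", round(beta_cross, 2))
print("longitudinal slope:   ", round(beta_long, 2))
```

With only the first visit, the negative slope cannot be separated into ageing and cohort effects; the second measurement on each child is what makes the within-individual change estimable.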
Why special methods?
• Repeated observations
Challenges
• Repeated observations tend to be
autocorrelated
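A small calculation shows why this correlation matters for inference. Assuming, purely for illustration, that the n observations on one subject have common variance sigma2 and the same correlation rho between every pair (an exchangeable structure, which is simpler than general autocorrelation), the variance of their mean is sigma2 / n * (1 + (n - 1) * rho) rather than the sigma2 / n that holds under independence. The function name `variance_of_mean` is hypothetical.

```python
import numpy as np


def variance_of_mean(n, sigma2, rho):
    """Variance of the mean of n equally correlated observations."""
    # Exchangeable covariance: sigma2 on the diagonal, sigma2 * rho elsewhere.
    cov = sigma2 * ((1.0 - rho) * np.eye(n) + rho * np.ones((n, n)))
    weights = np.full(n, 1.0 / n)
    # Var(w' y) = w' Cov(y) w
    return float(weights @ cov @ weights)


n, sigma2, rho = 5, 1.0, 0.5
naive = sigma2 / n                       # treats observations as independent
correct = variance_of_mean(n, sigma2, rho)

print("variance assuming independence:", naive)
print("variance with correlation:     ", correct)
```

For these assumed values the formula gives 0.2 under independence and 0.6 with correlation, so an analysis that ignores the correlation would understate the uncertainty of the subject mean by a factor of three.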
Types of Studies
• Time series studies
• Panel studies (Sociology & Economics)
• Prospective studies (Clinical Trials)
Examples
Goals:
1. Characterize the typical time course of
CD4+ cell depletion
2. Identify factors which predict CD4+ cell
changes
3. Estimate the average time course of
CD4+ cell depletion
4. Characterize the degree of heterogeneity
across men in the rate of progression
• This figure displays 2376 values of CD4+
cell numbers plotted against time since
sero-conversion for 369 infected men
enrolled in the MACS study.
Ex: Growth of trees
• Mixed (27 cows)
• Lupins (27 cows)
Ex: Indonesian Children’s Health Study
(ICHS)
Goals:
1. Estimate the increase in risk of
respiratory infection for children who are
vitamin A deficient while controlling for
other demographic factors
2. Estimate the degree of heterogeneity in
the risk of disease among children
Ex: ICHS (cont’d)
• Progabide
What do these examples have in
common?
• There are repeated observations on each
experimental unit
• Units can be assumed independent of one
another
• Multiple responses within each unit are
likely to be correlated
• The objectives can be formulated as
regression problems whose purpose is to
describe the dependence of the response
on explanatory variables
• The choice of the statistical model must
depend on the type of the outcome variable
Course Overview
Regression Model
Suppose:
Cross-sectional versus Longitudinal Study