BASIC EPIDEMIOLOGIC AND BIOSTATISTICAL TERMINOLOGY FOR

NANOS 2004 BIOSTATISTICS WET LAB

Laura J. Balcer, M.D., M.S.C.E.


Associate Professor of Neurology
3 E. Gates
3400 Spruce Street
Philadelphia, PA 19104
(215) 349-8072
fax (215) 349-5579
[email protected]

Key words:
1. epidemiology
2. biostatistics
3. study designs
4. diagnostic tests
5. data analysis

Biostatistics Wet Lab

Page 2

OBJECTIVES

1. Identify and apply terminology related to epidemiologic study designs, measures of disease
frequency/association, evaluation of diagnostic tests, and principles for evaluating epidemiologic
study results;
2. Apply and identify appropriate statistical methods and approaches for data analysis based on the
research question, epidemiologic study design, types of variables/measurement scales used, and
data characteristics (including survival analysis);
3. Use a hands-on approach with statistical software to apply basic statistical and epidemiologic
concepts for data analysis using case studies and data sets related to neuro-ophthalmologic
research questions.


EPIDEMIOLOGY
I. Definition
The field of epidemiology is concerned with the study of the distribution and determinants of disease
frequency in human populations (1-2). These three closely interrelated concepts---distribution, determinants, and
frequency---provide the foundation for all epidemiologic principles and methods (2).
II. Design Strategies for Epidemiologic Research
Descriptive Study Designs
The purpose of descriptive studies is to report the general characteristics and patterns of a disease with
respect to person, place, and time. Such studies are important for setting priorities for health care and in
generating hypotheses for future investigations (1). There are three basic types of descriptive studies:
Case Reports and Case Series: a description of one or more individuals, documenting a unique or
unusual occurrence or medical condition; often generate hypotheses regarding causal factors that may be
associated with the observed outcome (1)
Correlational Studies: use data from entire populations to compare disease frequencies between different
groups during the same period of time or in the same population at different points in time; useful for
formulation of hypotheses but cannot be used to establish causality (2)
Cross-Sectional Surveys: involve the simultaneous assessment of exposures and disease in populations or
individuals; sample data at a single point in time ("snapshot") (1)
Analytic Study Designs
Analytic studies involve explicit comparisons of groups with respect to the presence/absence of exposures
and disease (1-2). Such studies are designed to test hypotheses. Outcome data (presence/absence of
exposures/disease) for analytic studies may be ascertained prospectively or retrospectively. Analytic study
designs may be observational (investigator records data based on observed characteristics) or interventional
(A.K.A. experimental studies, clinical trials):
Types of Outcome Ascertainment
Prospective: outcome of interest (exposure/disease) has not occurred at the time of study initiation
Retrospective: outcome of interest (exposure/disease) has already occurred at the time of study initiation
Observational Studies
Case-Control Studies: patients who have a disease/outcome of interest (case group) are compared with
a control group with respect to the proportions of individuals who have had prior exposures/factors (potential risk
factors for the disease of interest, or potential prognostic factors for the outcome of interest that were present prior
to the initiation of the study); patients are classified at initiation of the study based on presence/absence of
disease/outcome of interest; useful for evaluating potential risk factors/exposures for rare diseases
Cohort Studies: patients are classified at the initiation of the study with respect to the presence or
absence of exposures/factors (potential prognostic factors or risk factors) and then followed for a specified period
of time to determine the development of disease or outcome of interest; clinical trials (interventional studies) may
be viewed as a type of cohort study (exposure = treatment)
Interventional Studies
Clinical Trials: studies with prospective ascertainment designed to compare the effect of an intervention
or treatment with that of a control (placebo) or standard yet partially effective treatment; when the
treatment/intervention is randomly allocated, these studies are referred to as randomized clinical trials (1)
Meta-Analyses: statistical process by which the results of several studies, frequently randomized clinical
trials, are combined to develop a single estimate of the effect of an intervention, treatment, or exposure on
disease (1)
III. Measures of Disease Frequency and Association
Measures used to describe epidemiologic outcomes focus on the quantification of disease occurrence
(measures of frequency) or on the relation between specific exposures or characteristics and the disease
under investigation (measures of association) (1).
Measures of Frequency
Prevalence: proportion of individuals in a population who have a disease/outcome at a specific point in
time (provides an estimate of the probability of disease at a point in time) (1-2):
Prevalence = (number of existing cases of a disease at a given point in time) / (total population at risk)

Incidence: number of new events (outcomes of interest) that develop in a population of individuals at
risk during a specified time interval (incidence is a rate of development of the event/outcome of interest);
cumulative incidence (CI) quantifies the proportion of individuals who become diseased during a specified time
period:
CI = (number of new cases during a given time period) / (total population at risk)
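
As a minimal sketch of the two frequency measures above, the following Python functions compute prevalence and cumulative incidence from raw counts. The function names and the counts are hypothetical and are used only for illustration.

```python
def prevalence(existing_cases: int, population_at_risk: int) -> float:
    """Proportion of the population with the disease at a given point in time."""
    return existing_cases / population_at_risk


def cumulative_incidence(new_cases: int, population_at_risk: int) -> float:
    """Proportion of the at-risk population that develops the disease during the period."""
    return new_cases / population_at_risk


# Hypothetical counts, for illustration only
print(prevalence(50, 1000))           # 50 existing cases among 1,000 individuals
print(cumulative_incidence(20, 950))  # 20 new cases among 950 initially disease-free individuals
```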

Measures of Association
Risk: probability of developing an outcome/disease of interest, estimated by p; to determine risk, data
are usually organized in a 2 x 2 table (contingency table), where a, b, c, and d represent numbers of individuals
with disease/exposure of interest, and N represents the total number of individuals in the study (1):

                       Disease
Exposure       Yes        No         Total
Yes            a          b          a+b
No             c          d          c+d
Total          a+c        b+d        N


Relative Risk (RR): probability of developing an outcome of interest in one study group (p1) divided by
the probability of the same outcome in another study group (p2); the most common measure of relative effect,
estimated by:
RR = p1 / p2 = [a/(a+b)] / [c/(c+d)]
where a/(a+b) = proportion of individuals with disease among exposed, and c/(c+d) = proportion of
individuals with disease among nonexposed (2x2 table)
Odds Ratio (OR): used to estimate relative risk in case-control studies; if p1, p2 = the probabilities of an
outcome in study groups 1 and 2, respectively, then:
OR = [p1/(1 - p1)] / [p2/(1 - p2)] = (odds in favor of the outcome in group 1) / (odds in favor of the outcome in group 2)
OR = (a/c) / (b/d) = ad / bc, using 2x2 table

Attributable Risk (AR): difference between the incidence of disease in exposed and nonexposed
groups; an estimate of excess risk or risk difference:
AR = a/(a+b) - c/(c+d)
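
The three measures of association can be sketched directly from the cells of the 2x2 table. The function name and the cell counts below are hypothetical; the formulas are the ones given above (RR = [a/(a+b)] / [c/(c+d)], OR = ad/bc, AR = a/(a+b) - c/(c+d)).

```python
def association_measures(a: int, b: int, c: int, d: int) -> dict:
    """RR, OR, and AR from a 2x2 table.

    a = exposed with disease,    b = exposed without disease,
    c = nonexposed with disease, d = nonexposed without disease.
    """
    p1 = a / (a + b)  # risk among exposed
    p2 = c / (c + d)  # risk among nonexposed
    return {
        "RR": p1 / p2,
        "OR": (a * d) / (b * c),
        "AR": p1 - p2,
    }


# Hypothetical table: 30/100 exposed and 10/100 nonexposed develop disease
measures = association_measures(a=30, b=70, c=10, d=90)
for name, value in measures.items():
    print(name, round(value, 2))
```

In this made-up example the odds ratio is larger than the relative risk; the two are close only when the outcome is rare in both groups.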

IV. Principles for Evaluating a Study Result


Evaluating Association
While the results of any epidemiologic study may reflect a true association between an
exposure/intervention and disease/outcome, measures of association must be interpreted in terms of the potential
roles for chance, bias, and confounding (2). Epidemiologic studies must also be evaluated with respect to whether
the results are applicable to other populations or to the population of interest (generalizability), and whether the
results are valid.
Chance: the possibility that a given study result or association observed in a sample from a population
may have been observed due to "luck of the draw"; the role of chance is influenced most strongly by sample size
Bias: systematic error or difference between groups in a study or between the study group and the
population of interest with respect to the way individuals were selected for the study or the way in which
information was obtained
Selection Bias: results from a systematic difference in inclusion criteria between groups
in a study or between the study sample and the population of interest (e.g., referral bias)
Observation Bias: results from a systematic difference in the way information is
obtained from different study groups (e.g., recall bias, interviewer bias)
Confounding: the presence of one or more additional variables that are associated with the exposure/risk
factor/prognostic factor of interest and also independently affect development of the outcome/disease


Validity: extent to which the results of a study reflect the disease process or outcome of interest (are we
measuring what we think we're measuring?)
Generalizability: extent to which the results of a study are applicable to other populations and/or to the
population of interest in general
Evaluating Causality
Once the potential roles of chance, bias, and confounding have been evaluated for a given epidemiologic
study, and a valid statistical association between exposure/intervention and disease/outcome has been determined
to exist, the following factors must be considered in determining whether a cause-effect relationship is likely (2):
Strength of Association: the greater the magnitude of the increased (or decreased) risk observed, the
greater the likelihood of a cause-effect relationship (and it is less likely that unknown confounding factors have
produced the observed result)
Biologic Plausibility: is there a known or postulated biologic mechanism by which the
exposure/intervention might alter risk of a disease/outcome?
Consistency with Previous Studies: evidence for a cause-effect relation is stronger if a number of
studies, conducted by different investigators at various times/sites, demonstrate similar results
Temporal Relationship: it should be clear that the exposure/intervention of interest preceded the
disease/outcome by a period of time consistent with proposed biologic mechanisms
Dose-Response Relationship: observation that a greater degree of risk of disease/outcome is associated
with a greater degree of exposure/intervention
V. Evaluating a Diagnostic or Screening Test
The validity of a test used to diagnose or screen individuals for disease or exposure is measured by its
capacity to correctly categorize persons who have disease (or pre-clinical disease) as test-positive and those
without disease (or pre-clinical disease) as test-negative (2). The relation between the actual presence of disease,
as determined by a gold standard test, and the results of a candidate diagnostic or screening test is usually
determined using a 2x2 table as follows:
                     Disease Status (Dx) (Truth)
Results of
Screening
Test (T)       Positive   Negative   Total
Positive       a          b          a+b
Negative       c          d          c+d
Total          a+c        b+d        a+b+c+d

Sensitivity = probability of having a positive test (T+) if disease is actually present = a/(a+c)
Specificity = probability of having a negative test (T-) if disease is not present = d/(b+d)


Positive Predictive Value = probability that disease is actually present if T+ = a/(a+b)


Negative Predictive Value = probability that disease is not present if T- = d/(c+d)
True Positives = number of individuals with T+ who actually have disease = a
False Positives = number of individuals with T+ who do not have disease = b
False Negatives = number of individuals with T- who actually have disease = c
True Negatives = number of individuals with T- who do not have disease = d
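
The test characteristics defined above follow directly from the four cells of the 2x2 table. The sketch below uses a hypothetical function name and made-up counts; a, b, c, and d are the true positives, false positives, false negatives, and true negatives, respectively.

```python
def diagnostic_test_measures(a: int, b: int, c: int, d: int) -> dict:
    """Sensitivity, specificity, and predictive values from a 2x2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),  # P(T+ | disease present)
        "specificity": d / (b + d),  # P(T- | disease absent)
        "ppv": a / (a + b),          # P(disease present | T+)
        "npv": d / (c + d),          # P(disease absent | T-)
    }


# Hypothetical screening results for 1,000 individuals (100 with disease)
results = diagnostic_test_measures(a=90, b=30, c=10, d=870)
for name, value in results.items():
    print(name, round(value, 3))
```

Sensitivity and specificity are calculated within the disease columns, whereas the predictive values are calculated within the test-result rows, so the predictive values change with the prevalence of disease in the population tested.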

BIOSTATISTICS
I. Approach to Statistical Analysis
Statistics is a branch of mathematics that deals with the organization, summary, and analysis of data (1).
These three steps are essential in order for data to be useful as measures of group performance, and so that the
meaning or implications of data to a population of individuals may be communicated:
Statistics = data organization + data summary + data analysis
Before these steps in statistical analysis can be completed, however, we must also:
1. Define the research questions/objectives: the most important yet often most difficult step in the
research process; determines the direction of subsequent study design, including data analysis
2. Select the epidemiologic study design, variables, and outcome measures: selected based on the
research question; characteristics of study design, variables, and outcome measures determine what
statistical methods are appropriate
*** What Statistics Do (and Do Not Do) for Us
1. Statistics DO tell us how likely it is that a result/difference (or one as extreme) from a study
sample may have been observed by chance alone
2. Statistics do NOT tell us if an observed result/difference is clinically significant, generalizable,
or biologically plausible
II. Measurement
Once an epidemiologic study design is chosen, consideration must be given to the types of variables and
measurement scales to be used to assess outcomes. The types of variables and measurement scales used will
determine the types of statistical methods and tests used for data summary and analysis:
Types of Variables
Continuous: can theoretically take on any value along a continuum, including fractional values; limited
only by precision of the measuring device (e.g., time)


Categorical: can be described only in whole units (integers); may be discrete (e.g., number of children in
a family) or dichotomous (can take on 2 values only, e.g., yes or no)
Rules of Measurement Scales
Nominal: numbers on scale represent category labels only (e.g., third, fourth, or sixth nerve palsy)
Ordinal: numbers on scale indicate rank order, but intervals between numbers not known or consistent
(e.g., Apgar score)
Interval: numbers on scale indicate rank order and have equal intervals between, but no true zero point
can be identified (e.g., temperature in degrees Fahrenheit)
Ratio: numbers on scale indicate rank order, have equal intervals, and a true zero point can be identified
(e.g., weight)
Is a Measurement Accurate/Meaningful?
Once we have completed our measurements, using the highest level of scaling possible given the
variables and outcomes of interest, the usefulness of these measurements then depends on the extent to which
clinicians can rely on the data as accurate and meaningful:
Reliability: is the measurement consistent and free from error (reproducible) between tests (test-retest
reliability) or examiners (inter-rater reliability)?
Validity: is the test measuring what it is intended to measure, and is it useful for discriminating,
evaluating, and predicting outcomes?
Reliability Coefficients
Intraclass Correlation Coefficient (ICC): indicates the proportion of total variability in a data set that
is between-patient; the rest is within-patient variability that reflects variability in test results (the higher the
ICC, the more reliable the measurement or test); the ICC is useful for continuous variables
Chance-corrected percent agreement (kappa): indicates agreement between examiners or tests beyond
that expected by chance; used for dichotomous variables
Cronbach's Alpha: measures internal consistency, or the extent to which items in a questionnaire
measure various aspects of the same characteristic
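
To make the idea of chance-corrected agreement concrete, here is a minimal sketch of Cohen's kappa for two examiners rating a dichotomous finding. The formula kappa = (observed agreement - expected agreement) / (1 - expected agreement) is the standard one; the function name, argument names, and counts are hypothetical.

```python
def cohen_kappa(both_yes: int, only_rater1_yes: int,
                only_rater2_yes: int, both_no: int) -> float:
    """Cohen's kappa for two raters and a yes/no rating."""
    n = both_yes + only_rater1_yes + only_rater2_yes + both_no
    observed = (both_yes + both_no) / n

    # Agreement expected by chance, from each rater's marginal proportions
    rater1_yes = (both_yes + only_rater1_yes) / n
    rater2_yes = (both_yes + only_rater2_yes) / n
    expected = rater1_yes * rater2_yes + (1 - rater1_yes) * (1 - rater2_yes)

    return (observed - expected) / (1 - expected)


# Hypothetical ratings of 100 patients by two examiners
kappa = cohen_kappa(both_yes=40, only_rater1_yes=10, only_rater2_yes=5, both_no=45)
print(round(kappa, 2))
```

In this made-up example the raw agreement is 85%, but half of that would be expected by chance alone, so kappa is noticeably lower than the raw percentage.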
III. Descriptive Statistics: Characterization and Exploration of Data
Once we have determined that our outcome measurements have demonstrated reliability and validity
(based on previous literature or current studies), and data collection has been completed, the next step is
to perform exploratory analyses to summarize and characterize data. Descriptive statistics are used to
characterize location, shape, central tendency, and variability within a data set:


Location and Shape of Data


Frequency Distribution: the most common method used to summarize data; presented as a table of
rank-ordered scores indicating the number of times each value occurs, or its frequency (f):

Score (X)    Frequency (f)    %        Cumulative %
9            1                10.0     10.0
10           2                20.0     30.0
11           4                40.0     70.0
12           2                20.0     90.0
13           1                10.0     100.0
Total        Σf = n = 10      100.0
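
The frequency distribution above can be reproduced with a few lines of Python. This is only a sketch: the list of ten scores is reconstructed from the frequencies in the table, and the function name is hypothetical.

```python
from collections import Counter

# Ten scores consistent with the frequency table above
scores = [9, 10, 10, 11, 11, 11, 11, 12, 12, 13]


def frequency_table(values):
    """Return rows of (score, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0
    for score in sorted(counts):
        f = counts[score]
        cumulative += f
        rows.append((score, f, 100 * f / n, 100 * cumulative / n))
    return rows


for score, f, pct, cum_pct in frequency_table(scores):
    print(f"{score:>5} {f:>5} {pct:>7.1f} {cum_pct:>7.1f}")
```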

Histogram: graphic representation of data that is useful for communicating information about location,
shape, and central tendency (bar graph with frequencies on y-axis and values on x-axis).
Stem and Leaf Diagram: similar to vertical histogram but has one leaf (number) for each data point
(does not lose individual observations)
Box Plot: A.K.A. box and whisker plot; demonstrates shape and location but also mean, median,
interquartile range, outliers (see below)
Measures of Central Tendency
While frequency distributions and graphs enable us to identify patterns in data (shape and location), they
do not provide a summary measure. Values that summarize the central tendency of a continuous variable (interval
or ratio scale) include the following:
Mean: the arithmetic average of all observations = ΣXi / n, where n = number of observations
Median: 50th percentile, middle observation (average of middle 2 observations if n is even); useful when
shape of data is skewed or not symmetrical---not as influenced by extreme values
Mode: most frequently observed value
Measures of Variability
The shape and central tendency of data are useful characteristics, but measures of variability (or degree of
dispersion of data) are also essential:
Variance: the sum of squares of the deviations of each individual observation (Xi) from the mean, divided by n - 1:
s² = Σ(Xi - mean)² / (n - 1); not as useful as the standard deviation, which is expressed in the original units of measurement
Standard Deviation (SD): the square root of the variance: s = √(s²)
Range: maximum - minimum observations
Interquartile Range: 75th - 25th percentile observations
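
The measures of central tendency and variability can be sketched with Python's standard `statistics` module. The ten scores are the illustrative values from the frequency distribution earlier in this section; note that software packages use slightly different rules for interpolating percentiles, so the interquartile range may differ between programs.

```python
import statistics

# Illustrative scores (same values as the frequency distribution example)
scores = [9, 10, 10, 11, 11, 11, 11, 12, 12, 13]

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

variance = statistics.variance(scores)  # sample variance, divides by n - 1
sd = statistics.stdev(scores)           # square root of the sample variance
data_range = max(scores) - min(scores)

# Quartiles by the module's default ("exclusive") method
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

print("mean", mean, "median", median, "mode", mode)
print("variance", round(variance, 3), "SD", round(sd, 3))
print("range", data_range, "IQR", iqr)
```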


Measures of Shape
Number of peaks (0, 1, 2)
Tails (fat or thin)
Symmetry: do the data fit a normal distribution (symmetrical bell-shaped curve)?
Skewness: where/how far is peak vs. mean?
Right-skewed - mean > median
Left-skewed - mean < median
Normal distribution - mean ≈ median
Kurtosis: how peaked is the peak?
IV. Inferential Statistics: Estimation from Sample Data
While descriptive statistics are useful for summarizing important features of data, including location,
shape, central tendency, and variability, inferential statistics allow us to estimate population
characteristics from sample data, and to test hypotheses. The success of this process is dependent upon
underlying assumptions based on the concepts of probability and sampling error:
Probability: are observed results/differences likely to be representative of the population of interest, or
could they have occurred by chance alone (p-value)?
Sampling Error: tendency for values/results from a sample (e.g., sample means) to differ from the
corresponding population values (e.g., the population mean, μ)
Standard Error of the Mean (SEM): estimate of the standard deviation of the distribution of sample
means; SEM = s / √n
Confidence Interval (CI): range of scores that should (x% of the time) contain the population mean (μ); 95%
CI = if a measurement were performed on a large number of samples of a given size, about 95% of the
intervals calculated in this way would contain the population mean
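
As a rough sketch of how the standard error and a confidence interval for a mean are obtained, the function below uses SEM = s / √n and a normal (z) critical value. The z-based interval is a large-sample approximation chosen here for simplicity; for small samples a t critical value would normally be used instead. The function name and data are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(values, confidence=0.95):
    """Return (mean, SEM, (lower, upper)) using a normal approximation."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # SEM = s / sqrt(n)
    # Two-sided critical value, about 1.96 for a 95% interval
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, sem, (mean - z * sem, mean + z * sem)


# Illustrative scores (same values as the frequency distribution example)
scores = [9, 10, 10, 11, 11, 11, 11, 12, 12, 13]
mean, sem, (lower, upper) = mean_confidence_interval(scores)
print(f"mean = {mean}, SEM = {sem:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```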
V. Hypothesis Testing
Inferential statistics are also used to test hypotheses; such hypotheses are generated to answer questions
about comparisons of groups or relations between exposures and disease. The fundamental hypothesis used in
statistics is the null hypothesis:
Null Hypothesis (Ho): the statistical hypothesis that observed differences or results are due to chance alone
(e.g., μA = μB, where μ = population mean); this is the "devil's advocate" of hypotheses. We are looking to
reject the null hypothesis (Ho), and whether or not we can reject Ho depends on where we set the levels of
Type I and II error:
Type I error (α): probability of rejecting Ho if Ho is true (probability of detecting a difference when none
actually exists)
Type II error (β): probability of failing to reject Ho if Ho is false (probability of not detecting a difference
when one actually exists)
Level of Significance (α): usually set α = 0.05 (or α = 0.01 if multiple comparisons are performed)
p-value: probability of observing a difference/result at least as extreme as the one observed if chance
alone were operating (i.e., if Ho were true); observed results with p-value < α are considered statistically
significant (and we can reject Ho)
Statistical Power (1-β): probability of observing a statistically significant difference/result given the
effect size (magnitude of observed difference/result), sample size, α level, and variance; usually set β = 0.20, or
power = 0.80, for sample size calculations
Basic Ingredients for Statistical Testing
All statistical tests have several basic ingredients or characteristics, as follows:
Test Statistic: z statistic, t statistic, χ² (chi-square) statistic; these are ratios used to determine if a
significant difference has been attained, by establishing the probability (based on α and sample size) that a test
result would occur by chance
Critical Value: value of the test statistic that determines statistical significance
One-Tailed or Two-Tailed?: direction of the difference is usually not known, so a two-tailed test is used
Degrees of Freedom (df): the number of components that are free to vary within a data set; based on
number of observations
Parametric vs. Non-Parametric Tests
Parametric Tests: based upon underlying assumptions of normality (underlying normal distribution for
data), equal variances between comparison groups, and interval scaling for variables (e.g., t-test, analysis of
variance [ANOVA])
Non-Parametric Tests: used when data do not meet underlying assumptions for normality (skewed
distributions), when sample sizes are small, and when scaling for variables is ordinal (e.g., Wilcoxon rank-sum test)
VI. Catalog of Common Statistical Tests
Parametric Tests

Test                    Use                         Statistic                   Various Types
t-test                  Comparison of 2 means       t statistic                 Unpaired (independent), equal/unequal variances; paired
Analysis of Variance    Comparison of > 2 groups/   F statistic                 1-/2-/3-way; repeated measures
                        conditions (means)
Multiple Comparison     Pairwise comparison of      Minimum significant         Bonferroni t-test; Scheffé comparison
Tests                   > 2 means                   difference (MSD)
                        (planned/unplanned)

Non-Parametric Tests

Test                    Use                         Statistic                   Parametric Analogue
Wilcoxon Rank-Sum Test  Comparison of 2 independent U statistic (A.K.A.         Unpaired/independent t-test
                        samples                     Mann-Whitney U-test)
Sign Test               Comparison of 2 paired      z statistic                 Paired t-test
                        samples (ordinal/nominal)   (large samples)
Wilcoxon Signed-Ranks   Comparison of 2 paired      T statistic                 Paired t-test
Test                    samples
Kruskal-Wallis 1-Way    Comparison of ≥ 3 groups/   H statistic                 1-Way ANOVA
ANOVA by Ranks          conditions
Friedman 2-Way ANOVA    Comparison of ≥ 3 groups/   χ²r statistic               Repeated measures ANOVA
by Ranks                conditions w/ repeated
                        measures (ordinal data)

VII. Correlation and Regression


Correlation and regression techniques are used to examine how variables are related (correlation) or may
predict other variables (regression analysis):
Correlation Coefficients
Test                      Use                             Statistic
Pearson Product-Moment    Tests relative strength of      r statistic
Correlation Coefficient   relation between 2 variables    (tests r ≠ 0)
                          (parametric)
Spearman Rank             Tests relative strength of      rs statistic
Correlation Coefficient   relation between 2 variables    (tests rs ≠ 0)
                          (non-parametric)


Regression Techniques
Test                      Use                             Assumptions
Linear Regression         Tests relation of independent   Linear correlation of X,Y:
                          (predictor) variables (X) to    Y = a + bX
                          dependent variables (Y)
Analysis of Covariance    Measures 1 or more              Linearity of covariate
(ANCOVA)                  confounding factors +           Independence of covariate
                          dependent variable              Reliability of covariate
Logistic Regression       Tests relation of predictor     Predicts probability of outcome:
                          variables to binomial           p = 1/(1 + e^-(a + bx + ...))
                          dependent variable
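
The logistic regression formula can be illustrated with a short sketch for a single predictor, p = 1/(1 + e^-(a + bx)). The coefficients below are made up for illustration and are not estimates from any data set; in practice a and b would be estimated by statistical software.

```python
import math


def logistic_probability(x: float, a: float, b: float) -> float:
    """Predicted probability of the outcome for predictor value x."""
    return 1 / (1 + math.exp(-(a + b * x)))


# Hypothetical intercept (a) and slope (b), for illustration only
a, b = -4.0, 0.08
for age in (30, 50, 70):
    print(age, round(logistic_probability(age, a, b), 3))
```

Whatever the values of a, b, and x, the predicted probability stays between 0 and 1, which is why this form is used for a binomial dependent variable.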

VIII. Analysis of Frequencies: Chi-Square Tests


Chi-square and related tests are used to analyze data in which binomial proportions are compared:
Test                      Use                             Assumptions
Contingency Table         Tests relation/independence     χ² statistic based on
Analysis (χ²)             of 2 categorical variables      observed vs. expected
                          (independent samples)           frequencies
Fisher's Exact Test       Correction of χ² for            Calculates exact probabilities
                          expected frequencies < 5        for observed frequencies
McNemar's Test            Tests relation/independence     Samples must be correlated
                          of 2 categorical variables
                          (matched-pair data)
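
As a sketch of how the χ² statistic is built from observed vs. expected frequencies, the function below analyzes a 2x2 contingency table. It applies no continuity correction, and the p-value uses the fact that a χ² variable with 1 degree of freedom is the square of a standard normal variable. The function name and cell counts are hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Chi-square statistic and p-value (1 df, no continuity correction)."""
    n = a + b + c + d
    observed = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]
    # Expected frequency for each cell = (row total x column total) / N
    expected = [r * k / n for r, k in zip(row_totals, col_totals)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical table: 30/100 exposed and 10/100 nonexposed with disease
chi2, p_value = chi_square_2x2(30, 70, 10, 90)
print(round(chi2, 2), p_value)
```

All expected frequencies in this made-up table are well above 5; with smaller expected frequencies, Fisher's exact test would be the usual choice.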

IX. Survival Analysis: For the Rest of Us!


Survival analysis assesses the risk or probability of an event or outcome of interest over time, in contrast
to other statistical techniques (such as simple proportions) that assess the cumulative risk of an event at a
particular point in time. Using survival analysis we can generate curves (survival curves) that estimate the
probabilities of an outcome/event over time. This is in contrast to data that is presented in terms of proportions or
percentages of individuals for whom the outcome of interest occurs during various time intervals.
Reasons for Use of Survival Analysis
1. Investigators frequently must analyze data before all patients have reached the outcome of interest.
Observations for patients/ individuals in whom the outcome of interest has not yet occurred at the time of data
analysis are referred to as censored observations; these observations are accounted for in survival analysis
2. Patients do not typically enter studies or begin treatment at the same time
3. The risk of an event/outcome of interest is often not constant throughout the study period
Methods of Survival Curve Estimation
Life Table (Actuarial) Method: cumulative probabilities of outcome-free survival are calculated for
successive defined time intervals (the risk of the outcome within an interval, among those still outcome-free
at its start, is the hazard rate)
Kaplan-Meier Product Limit Method: cumulative probabilities of outcome-free survival are
estimated at the time of each outcome occurrence (time since entry into study is not divided into intervals for
analysis)
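
The product-limit idea can be sketched in a few lines: at each time an outcome occurs, the running survival probability is multiplied by the proportion of at-risk patients who remain outcome-free. Censored patients count as at risk until their censoring time. The function name and follow-up data below are hypothetical, and confidence limits are not computed.

```python
def kaplan_meier(observations):
    """Return [(time, survival probability)] at each time an outcome occurs.

    observations: list of (time, event) pairs; event is True if the outcome
    occurred at that time and False if the observation was censored.
    """
    event_times = sorted({time for time, event in observations if event})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation and outcome-free just before t
        at_risk = sum(1 for time, _ in observations if time >= t)
        events = sum(1 for time, event in observations if time == t and event)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up (months): outcomes at 2, 4, and 5; censored at 3 and 6
follow_up = [(2, True), (3, False), (4, True), (5, True), (6, False)]
for time, probability in kaplan_meier(follow_up):
    print(time, round(probability, 3))
```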
Statistical Tests Using Survival Data
Comparison of Survival Curves: log-rank test (frequently used, can estimate the odds ratio for the risk
of developing the outcome of interest), generalized Wilcoxon test
Relation of Survival to Prognostic Factors: Cox proportional-hazards model (may be thought of as an
extension of logistic regression analysis in which the time to when an outcome occurs, rather than simply whether
an outcome occurs, is taken into account). This method is thus used to examine how multiple potential prognostic
factors (such as age, disease severity) may predict the probability of outcome-free survival over time
X. Best to Bother Your Biostatistician
A biostatistician should be consulted to provide expertise on design, data collection/management, and
analysis for most epidemiologic studies, particularly under the following circumstances:
1. More than 2 groups for comparison
2. Correlated data (paired data, inter-eye correlations)
3. Many outcome variables, subgroup analyses
4. Small sample sizes
5. Before data collection!


SELECTED REFERENCES
Textbooks:
1. Portney LG, Watkins MP: Foundations of Clinical Research: Applications to Clinical Practice. Upper
Saddle River, NJ, Prentice Hall, Inc., 2000.
2. Hennekens CH, Buring JE, Mayrent SL: Epidemiology in Medicine. Boston, MA, Little, Brown, and
Co., 1987.
3. Castle WM, North PM: Statistics in Small Doses, ed. 3. New York, NY, Churchill Livingstone, 1995.
4. Glantz SA: Primer of Biostatistics, ed. 4. New York, NY: McGraw-Hill, 1997.
5. Dawson-Saunders B, Trapp RG: Basic and Clinical Biostatistics. East Norwalk, CT, Appleton & Lange,
1990.
6. Streiner DL, Norman GR: Health Measurement Scales: A Practical Guide to their Development and Use.
New York, NY, Oxford University Press, 1995.
Statistical Issues for Ophthalmologic Studies/Articles:
7. Gauderman WJ, Barlow WE: Sample size calculations for ophthalmologic studies, Arch Ophthalmol
110:690-692, 1992.
8. Katz J: Two eyes or one? The data analyst's dilemma, Ophthalmic Surgery 19:585-589, 1988.
9. Bailey IL, Bullimore MA, Raasch TW, Taylor H: Clinical grading and the effects of scaling, Invest
Ophthalmol Vis Sci 32: 422-432, 1991.
10. Brown GW: Errors, Types I and II, Am J Dis Child 137:586-591, 1983.
11. O'Brien PC, Shampo MA: Statistics for clinicians. 1-12. (Series), Mayo Clin Proc 56:47-49, 126-128,
196-197, 274-276, 324-326, 393-394, 452-454, 513-515, 573-575, 639-640, 709-711, 753-754, 1981.
12. O'Brien PC, Shampo MA: Statistical considerations for performing multiple tests in a single experiment.
1-6. (Series), Mayo Clin Proc 63:813-815, 816-820, 918-920, 1043-1045, 1140-1143, 1245-1250, 1988.
