23 Novl CA - Co Studies


Dr. P. P. Doke, M.D., DNB., Ph.D., FIPHA
Professor, Department of Preventive and Social Medicine
MGM Medical College, Kamothe, Navi Mumbai

The case-control study is an analytic epidemiologic research design in which the study population consists of groups who either have a particular health problem or outcome (cases) or do not have it (controls). The investigator looks back in time to measure the exposure of the study subjects. The exposure is then compared between cases and controls to determine whether it could account for the health condition of the cases.

Also called:
Case-Referent study
Case-Compeer study
Retrospective study ?

Observational / Non-experimental
Occasionally exploratory
Explanatory (analytical)
Retrospective
Effect to cause
Both exposure and disease have already occurred
Uses a comparison group

Consider a rare disease, say a cancer such as leukemia

Crude annual incidence = 3.4 per 100,000 (age < 15 years)
Cohort study: a year of observation on a million children is needed to identify 34 cases
A sample of 34 cases may be available in hospitals; it may be sub-divided into 2 or more exposure categories
It is easier to conduct a case-control study
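The arithmetic behind the 34 cases can be checked with a few lines of Python. This is only an illustrative sketch; the function name `expected_cases` is an assumption made here, not something from the slides.

```python
def expected_cases(incidence_per_100k: float, population: int, years: float = 1.0) -> float:
    """Expected new cases = incidence rate x person-years of observation."""
    return incidence_per_100k / 100_000 * population * years

# One year of follow-up on a million children at 3.4 per 100,000 per year
print(round(expected_cases(3.4, 1_000_000)))  # 34
```

A cohort a tenth of that size would be expected to yield only three or four cases in a year, which is why a case-control design is attractive for rare diseases.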

Long induction period between the exposure and clinical onset of disease

Cohort study: waiting years for accrual of cases
Case-control study: compresses time
Case-control studies are hence suitable for chronic diseases (cancer / cardiovascular diseases)

RCT: the methodological standard of excellence
However, the case-control study is not only SIMPLE to perform but sometimes the ONLY approach to solve a problem.
Philosophically, no design is the gold standard.
Understand the strengths and weaknesses of each design.
Select the study design appropriate to your research question.

1. Directionality: outcome to exposure
2. Timing: retrospective for exposure, but case ascertainment can be either retrospective or concurrent
3. Sampling: almost always on outcome, with matching of controls to cases

[Figure: design of a case-control study. CASES (disease) and CONTROLS (no disease) are each classified as exposed or not exposed.]

With a Specific Outcome:

Presence of disease / syndrome
Complications / progression of disease (severe dehydration crisis)
Death (neonatal mortality)
Serum cholesterol / birth weight
Delayed immunization
Early initiation of cigarette smoking
Adverse reactions to drugs / vaccines (SIDS)
Behavior (juvenile delinquency)
Drug resistance (MDR-TB)
Couple as a case (infertility)

Diagnostic Criteria
Risk of disease misclassification
Continuous / discrete outcome variable
Relatively simple and straightforward: children with cleft palates (physical examination)
Sometimes difficult: hypertension
Diagnosis: combination of methods
Rationale / logical

Criteria
Specific
Operational versus rigid
Standard definition (WHO, CDC, etc.)
Reference (growth references: NCHS, CDC, new WHO)

Eligibility Criteria

Inclusion/exclusion criteria
Ca-Co studies should be limited to incident cases:
  Exposures are presumably more recent and therefore more reliably recalled
  Relatively homogeneous group
  Exclusion of prevalent cases minimizes selection bias (Neyman fallacy)

Ex: PID and IUD use
Women who are not sexually active or who have had a tubal ligation are not likely to have recently used any contraceptive method, including IUDs

Conceptual definition
Obesity defined as body fat percentage > 33%

Operational definition
Body Mass Index > 30

Case definition should avoid misclassification

For example: anemia was defined as hemoglobin < 110 g/L as measured by the WHO Colour Scale
The WHO Colour Scale over-estimates hemoglobin, so cases with mild anemia were misclassified
Also, studying mild forms of disease gives a larger case group, but misclassifies cases as non-cases or non-cases as cases, because early diagnosis is generally imprecise

A severe case definition may exclude people who have been cured or who died of the disease before the condition was severe enough to be labelled as a case
Standard/consensus definitions, if available, must be used
Lack of agreement over the definition may introduce variability in estimates of effect
For example, rheumatoid arthritis: Rome criteria, NY criteria, 1987 ACR (formerly ARA) criteria

The issues of severity, diagnostic criteria and subjectivity of criteria all lead to potential problems of misclassification of cases
The researcher can choose between more restrictive and more inclusive definitions
Think in terms of the sensitivity and specificity of the definition and its effect on validity, sample size, precision and power
It is observed that:
  A restrictive definition (less sensitive) leads to lack of precision and power by reducing sample size
  Broad criteria (less specific) produce misclassification, leading to a biased measure of effect
  So, put validity first: specificity over sensitivity (restrictive definition over inclusive definition)
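The effect of a broad (less specific) case definition can be illustrated numerically. The sketch below uses entirely hypothetical counts and assumes that people wrongly admitted as cases have the same exposure prevalence as non-cases; the function names are illustrative only.

```python
def odds_ratio(a, b, c, d):
    """a, c = exposed and unexposed cases; b, d = exposed and unexposed controls."""
    return (a * d) / (b * c)

def broaden_case_group(a, c, n_false_cases, exposure_prevalence_noncases):
    """Add false cases whose exposure mirrors that of non-cases (an assumption)."""
    extra_exposed = n_false_cases * exposure_prevalence_noncases
    return a + extra_exposed, c + (n_false_cases - extra_exposed)

# Hypothetical data: 100 true cases (60 exposed), 100 controls (30 exposed)
a, b, c, d = 60, 30, 40, 70
true_or = odds_ratio(a, b, c, d)

# A broad definition lets in 50 non-cases, 30% of whom are exposed
a_broad, c_broad = broaden_case_group(a, c, 50, 0.30)
diluted_or = odds_ratio(a_broad, b, c_broad, d)

print(round(true_or, 2), round(diluted_or, 2))  # 3.5 2.33
```

Under these assumptions the larger case group gains sample size but the odds ratio is pulled toward 1, which is the trade-off the slide describes.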

Hospitals (multi-centric studies)
Community
Industrial population

The goal is to ensure that all true cases have an equal probability of entering the study and that no false cases enter
Example: conceptual definition of HIV
Factors affecting the decision to test or the access to the test, and the sensitivity (Sn) and specificity (Sp) of the test, will decide who eventually becomes a case under the operational definition
Selection bias ??

Selection bias
Unequal chance of getting into the study

Berkson's bias
Variable rate of hospitalization affecting case selection

Neyman fallacy
Incident case vs. prevalent case

Detection bias
Due to closer medical attention, detection of endometrial cancer was higher in a group using estrogen

1. Representativeness: Ideally, cases should be a random sample of all cases of interest in the source population (e.g. from vital data, registry data). More commonly they are a selection of available cases from a medical care facility (e.g. from hospitals, clinics).

2. Method of Selection: Selection may be from incident or prevalent cases. Incident cases are those derived from ongoing ascertainment of cases over time. Prevalent cases are derived from a cross-sectional survey.

Who is the best control? What universe should controls come from?
If cases are a random sample of cases in the population, then controls should be a random sample of all non-cases in the population, sampled at the same time.

Comparability is more important than representativeness in the selection of controls
The control should be at risk of the disease
The control should resemble the case in all respects except for the presence of disease (and any as yet undiscovered risk factors for disease)

Usually, cases in a case-control study are not a random sample of all cases in the population. If so, the controls must be selected in the same way (and with the same biases) as the cases.

Comparability vs. Representativeness


It follows from the above that a pool of potential controls must be defined. This is a universe of people from whom controls may be selected (study base).

1. The study base
2. Deconfounding
3. Comparable accuracy
4. Efficiency

Source of the cases and the controls should be the same

Similar misclassification errors in cases and controls
Same potential for recall bias in cases and controls

Hospital or clinic controls
Dead controls
Controls with similar diseases
Peer or case-nominated (friend/neighbor) controls
Population controls

Readily available, hence commonly used
Main reasons to use hospital controls are:
  To select controls whose referral pattern is similar to that of the cases
  To obtain a similar quality of examination
  For convenience
May not be representative of the population

Dead controls might be used for dead cases
In some situations, this might lead to the use of surrogate informants
The problem is that the dead control is not representative of the living population
McLaughlin compared dead controls with living controls and found that the dead controls had smoked more cigarettes and consumed more alcohol than the living controls
Appropriateness depends on the exposure being studied

Reasons
To minimize recall bias
To minimize interviewer bias
To examine the specificity of an exposure for a particular type of cancer
For practical but unspecified reasons

Problem ??

The term "neighborhood controls" is used in two ways:
  To refer to community or population controls
  To refer to controls selected from a finite number of close neighbors

A friend or neighbor control is a surrogate for matching on age, education, etc.
A quick way to find controls
Bias is introduced if determinants of friendship are associated with disease or exposure
Friends share many risk behaviors

The search starts from the house of the case, and a door-to-door search is conducted for eligible controls in a standardized pattern

Randomly drawn from the population
Truly representative of the population
Ideal way of selecting controls
Practically, very difficult to carry out
Study base ???

Weigh the pros and cons
Analyze the situation for bias being introduced
If possible, select different sources of controls and compare them with each other
Compare the inferences drawn

Statistical consideration

When the number of subjects available in one group (cases) is limited, an increase in the other group increases the study power
The gain in power continues up to a control-to-case ratio of 4:1
Thereafter, the gain is not substantial but the cost increases
When the power of the study with equal allocation is as high as 0.9 or as low as 0.1, additional controls fail to increase the power appreciably
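The diminishing return beyond four controls per case can be seen from a commonly quoted approximation that is not stated on the slide: with r controls per case, the statistical efficiency relative to an unlimited supply of controls is roughly r / (r + 1) when the true effect is near the null. The sketch below simply tabulates that ratio.

```python
def relative_efficiency(controls_per_case: int) -> float:
    """Approximate efficiency relative to unlimited controls: r / (r + 1)."""
    return controls_per_case / (controls_per_case + 1)

previous = 0.0
for r in range(1, 7):
    eff = relative_efficiency(r)
    # Print the ratio, its efficiency, and the gain over the previous ratio
    print(r, round(eff, 3), round(eff - previous, 3))
    previous = eff
```

Going from one to two controls adds about 0.17 in relative efficiency, while going from four to five adds only about 0.03, which is the usual justification for stopping at 4:1.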

Validity of inferences

Even when there is no statistical need, more than one control may be recruited per case
Enrolling two or more types of controls is a way of checking for biases introduced by the choice of control group
If the measure of effect is similar when comparing cases with each control group, there is probably no bias (though this is not certain)
If the measures of effect differ, then bias is present and the researcher can investigate it

MATCHING
Purpose: to adjust for the effects of relevant confounders
Matching in the design must be accounted for in the analysis
Misconception: the goal is to make the case and control groups similar in all respects, except for disease status
An optimal matching scheme involves only those variables which improve statistical efficiency or eliminate bias from the effect of interest

MATCHING
Which variables are appropriate for matching?
Risk factors from prior work may be identified for matching
Matching by interviewer or hospital may be used to balance out the effects of interviewer and observer errors
It is best to limit matching to basic descriptors (age, sex, socio-economic status, etc.)
Non-modifiable risk factors
Use few matching factors

MATCHING
Overzealous matching may have adverse effects:
  Matching on a strong correlate of the exposure which is not an independent risk factor for the outcome (overmatching) may lead to an underestimate of the OR
  Matching may lead to a false sense of security that a particular variable is adequately controlled

1. Control selection is usually through matching. Matching variables (e.g. age) and matching criteria (e.g. within the same 5-year age group) must be set up in advance.

2. Controls can be individually matched (most common) or frequency matched.
   Individual matching: search for one (or more) controls who meet the required matching criteria. Paired (triplet) matching is when there is one (two) control(s) individually matched to each case.
   Frequency matching: select a population of controls whose overall characteristics match those of the cases, e.g. if 15% of cases are under age 20, 15% of the controls are also.

3. Avoid over-matching; match only on factors KNOWN to be causes of the disease.

4. Obtain POWER by matching MORE THAN ONE CONTROL per case. In general, the number of controls per case should not exceed 4, because there is no substantial further gain of power above that.

5. Obtain generalizability by matching more than one type of control.
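Frequency matching on 5-year age groups, as described in the list above, could be sketched as follows. The data, the helper names (`frequency_match`, `age_band`) and the choice of two controls per case are all hypothetical; this is one simple way to do it, not the method of any particular software package.

```python
import random
from collections import Counter

def age_band(person, width=5):
    """Matching criterion: same 5-year age group."""
    return person["age"] // width

def frequency_match(cases, pool, key, ratio=1, seed=0):
    """Draw controls so each stratum holds `ratio` controls per case in that stratum."""
    rng = random.Random(seed)
    needed = Counter(key(case) for case in cases)
    by_stratum = {}
    for person in pool:
        by_stratum.setdefault(key(person), []).append(person)
    controls = []
    for stratum, n_cases in needed.items():
        candidates = by_stratum.get(stratum, [])
        k = n_cases * ratio
        if len(candidates) < k:
            raise ValueError(f"not enough potential controls in stratum {stratum!r}")
        controls.extend(rng.sample(candidates, k))
    return controls

# Hypothetical cases and pool of potential controls (the study base)
cases = [{"id": i, "age": age} for i, age in enumerate([17, 18, 23, 24, 31, 33, 34, 42])]
pool = [{"id": 100 + i, "age": 15 + i % 30} for i in range(300)]

controls = frequency_match(cases, pool, age_band, ratio=2)
print(len(controls))  # 16
print(Counter(age_band(c) for c in cases))
print(Counter(age_band(c) for c in controls))
```

The age-group distribution of the controls is, by construction, proportional to that of the cases. Individual matching would instead pair each control with a specific case and keep that pairing for the analysis.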

Various software packages are available

Questionnaires
Records
Conversion tables/algorithms

Questionnaire

Quality of exposure reports may be influenced by:
  Question comprehension
  Information retrieval
  Response formulation and recording

Type of respondent
Administration of questionnaire
Salience of exposure
Way in which information is retrieved
Ways in which responses are formulated and recorded

Records
Abstraction of data from records
Quality control measures are important:
  Careful design and testing of the abstraction form
  Training and supervision of abstractors
  A priori definition of terms
  Specification of rules for handling conflicting or missing data

FIRST: Select CASES (with disease) and CONTROLS (without disease)
THEN: Measure exposure

                     CASES    CONTROLS
Were exposed           a         b
Were not exposed       c         d
TOTALS                a+c       b+d

Proportions exposed: cases $\frac{a}{a+c}$, controls $\frac{b}{b+d}$

Odds Ratio = $\frac{a/c}{b/d} = \frac{ad}{bc}$

        Case    Control    Risk
E+       a        b        $\frac{a}{a+b}$
E-       c        d        $\frac{c}{c+d}$

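                        Case
Controls          Exposed    Unexposed
Exposed            Both       Mixed
Unexposed          Mixed      Neither

For one control per case:

                        Case
Control           Exposed    Unexposed
Exposed          (concordant)    s
Unexposed             t      (concordant)

t = pairs in which only the case is exposed, s = pairs in which only the control is exposed

McNemar $\chi^2 = \frac{(t-s)^2}{t+s}$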

                   Stroke    Control    Total
Hypertension         30        10         40
No hypertension      70        90        160
Total               100       100        200

Odds ratio = $\frac{30 \times 90}{10 \times 70} = 3.86$
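The stroke and hypertension example can be reproduced in a few lines. The sketch below also adds an approximate 95% confidence interval (Woolf's logit method) and the ordinary Pearson chi-square; both are standard additions that do not appear in the table itself, and the function names are illustrative.

```python
import math

def odds_ratio(a, b, c, d):
    """a, c = exposed and unexposed cases; b, d = exposed and unexposed controls."""
    return (a * d) / (b * c)

def woolf_ci(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval computed on the log-odds scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

def chi_square(a, b, c, d):
    """Pearson chi-square for a 2x2 table, without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Stroke cases: 30 hypertensive, 70 not; controls: 10 hypertensive, 90 not
a, b, c, d = 30, 10, 70, 90
print(round(odds_ratio(a, b, c, d), 2))  # 3.86
print(round(chi_square(a, b, c, d), 2))  # 12.5
low, high = woolf_ci(a, b, c, d)
print(round(low, 2), round(high, 2))
```

An interval that excludes 1, together with a chi-square well above 3.84 (1 degree of freedom), would indicate an association unlikely to be due to chance alone in an unmatched analysis.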

                                    Case
Control              Hypertension   No hypertension   Total
Hypertension               2               8            10
No hypertension           28              62            90
Total                     30              70           100

McNemar $\chi^2 = \frac{(t-s)^2}{t+s} = \frac{(28-8)^2}{28+8} = 11.11$
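For matched pairs, only the discordant pairs carry information. The standard McNemar statistic is the squared difference of the two discordant counts divided by their sum, and the matched odds ratio is their ratio. A minimal sketch using the 28 and 8 discordant pairs from the table (function names are illustrative; the continuity-corrected variant is an optional extra not shown on the slide):

```python
def mcnemar_chi_square(t, s, continuity=False):
    """t = pairs with only the case exposed, s = pairs with only the control exposed."""
    diff = abs(t - s)
    if continuity:
        diff = max(diff - 1, 0)  # optional continuity correction
    return diff ** 2 / (t + s)

def matched_odds_ratio(t, s):
    """Odds ratio for 1:1 matched pairs: ratio of the discordant counts."""
    return t / s

print(round(mcnemar_chi_square(28, 8), 2))                   # 11.11
print(round(mcnemar_chi_square(28, 8, continuity=True), 2))  # 10.03
print(matched_odds_ratio(28, 8))                             # 3.5
```

The concordant pairs (2 and 62 here) do not enter either calculation.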

Advantages:

1. Only realistic study design for uncovering etiology in rare diseases
2. Important in understanding new diseases
3. Commonly used in outbreak investigations
4. Useful if the induction period is long
5. Relatively inexpensive

Disadvantages:

1. Susceptible to bias if not carefully designed
2. Especially susceptible to exposure misclassification
3. Especially susceptible to recall bias
4. Restricted to a single outcome
5. Incidence rates are not usually calculable
6. Cannot assess effects of matching variables

Doll's 1952 study of smoking and lung cancer: the problem was that the control population (lung disease) was biased in relation to the exposure.
MacMahon's 1981 study of coffee and pancreatic cancer: the problem was that some of the controls may have been biased in relation to the exposure, because diseases related to coffee were excluded from the control series.

Famous Examples and discoveries

1950s: Cigarette smoking and lung cancer
1970s: Diethylstilbestrol and vaginal adenocarcinoma; post-menopausal estrogens and endometrial cancer
1980s: Aspirin and Reye's syndrome; tampon use and toxic shock syndrome; L-tryptophan and eosinophilia-myalgia syndrome; AIDS and sexual practices
1990s: Vaccine effectiveness; diet and cancer

The odds ratio is a good estimate of the relative risk when the disease is rare (prevalence < 20%)
Can be extended to N > 1 controls
Statistical testing is by simple chi-square (unmatched analysis) or by McNemar's chi-square (matched-pairs analysis)
Can be extended to multiple strata (Mantel-Haenszel chi-square)
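For the stratified case, the estimator that usually accompanies the Mantel-Haenszel chi-square is the Mantel-Haenszel pooled odds ratio: the sum of a·d/n over strata divided by the sum of b·c/n. The sketch below uses two hypothetical age strata; the counts are invented for illustration.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio over strata of (a, b, c, d) tables.

    a, c = exposed and unexposed cases; b, d = exposed and unexposed controls.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical strata, e.g. younger and older subjects
strata = [
    (10, 5, 40, 45),  # stratum-specific OR = 2.25
    (20, 5, 30, 45),  # stratum-specific OR = 6.0
]
print(round(mantel_haenszel_or(strata), 2))  # 3.86
```

The pooled value is a weighted average of the stratum-specific odds ratios, so it lies between them.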

Case-control studies should be viewed as efficient sampling schemes of the disease experience of the underlying open or closed cohorts.
The exposure odds ratio derived from case-control studies equals or closely matches the relative risk derived from cohort studies.
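How closely the odds ratio tracks the relative risk depends on how common the disease is. The sketch below assumes a hypothetical cohort in which exposure doubles the risk and compares the two measures at several baseline risks; the numbers are illustrative only.

```python
def relative_risk_and_odds_ratio(risk_exposed, risk_unexposed):
    """Return (RR, OR) for given disease risks in exposed and unexposed groups."""
    def odds(p):
        return p / (1 - p)
    rr = risk_exposed / risk_unexposed
    return rr, odds(risk_exposed) / odds(risk_unexposed)

for baseline in (0.001, 0.01, 0.1, 0.3):
    rr, or_ = relative_risk_and_odds_ratio(2 * baseline, baseline)
    print(baseline, round(rr, 2), round(or_, 2))
```

At a baseline risk of 0.1% the two measures agree to two decimal places, while at 30% the odds ratio (3.5) clearly overstates a relative risk of 2.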

Thank you
