Polygenic Risk Scores Derived From Varying Definitions
IMPORTANCE Genetic studies with broad definitions of depression may not capture genetic
risk specific to major depressive disorder (MDD), raising questions about how depression
should be operationalized in future genetic studies.
OBJECTIVE To use a large, well-phenotyped single study of MDD to investigate how different
definitions of depression used in genetic studies are associated with estimation of MDD and
phenotypes of MDD, using polygenic risk scores (PRSs).
DESIGN, SETTING, AND PARTICIPANTS In this case-control polygenic risk score analysis,
patients meeting diagnostic criteria for a diagnosis of MDD were drawn from the Australian
Genetics of Depression Study, a cross-sectional, population-based study of depression, and
controls and patients with self-reported depression were drawn from QSkin, a
population-based cohort study. Data analyzed herein were collected before September 2018,
and data analysis was conducted from September 10, 2020, to January 27, 2021.
MAIN OUTCOME AND MEASURES Polygenic risk scores generated from genome-wide
association studies using different definitions of depression were evaluated for estimation of
MDD and, within individuals with MDD, for an association with age at onset, adverse
childhood experiences, comorbid psychiatric and somatic disorders, and current physical and
mental health.
RESULTS Participants included 12 106 (71% female; mean age, 42.3 years; range, 18-88 years)
patients meeting criteria for MDD and 12 621 (55% female; mean age, 60.9 years; range,
43-87 years) control participants with no history of psychiatric disorders. The effect size of
the PRS was proportional to the discovery sample size: the PRS from the largest study had the
largest odds ratio for MDD (1.75; 95% CI, 1.73-1.77 per SD of PRS), and the PRS derived from
ICD-10 codes documented in hospitalization records in a population health cohort had the
lowest odds ratio (1.14; 95% CI, 1.12-1.16). When accounting for differences in sample size,
the PRS from a genome-wide association study of patients meeting diagnostic criteria for MDD
and control participants was the best estimator of MDD, but not of self-reported depression,
and had associations with higher odds ratios with adverse childhood experiences and measures
of somatic distress.
CONCLUSIONS AND RELEVANCE These findings suggest that increasing sample sizes,
regardless of the depth of phenotyping, may be most informative for estimating risk of
depression. The next generation of genome-wide association studies should, like the
Australian Genetics of Depression Study, have both large sample sizes and extensive
phenotyping to capture genetic risk factors for MDD not identified by other definitions of
depression.
© 2021 American Medical Association. All rights reserved.
Key Points
Question To what extent does the depth of phenotyping matter in genetic studies of depression?
Findings In this case-control polygenic risk score analysis including 12 106 individuals with major depressive disorder, the major factor in estimating risk was sample size of the discovery genome-wide association studies. Polygenic risk scores derived from studies assessing diagnostic criteria for major depressive disorder had associations with higher odds ratios with somatic symptoms and comorbidities of major depressive disorder.
Meaning Results of this study suggest that to generate potential better genetic estimations of risk for severe depression, larger genome-wide association study sample sizes, regardless of the depth of phenotyping, should be prioritized.

Depression is a common, often recurrent or severe, psychiatric disorder and one of the leading causes of global disability.1 Depression is characterized by significant heterogeneity in timing of onset, symptom profile, course, response to treatment, and both psychiatric and physical comorbidities.2 Approximately 30% to 40% of the total variance in liability to major depressive disorder (MDD) is attributable to additive genetic factors.3 Since 2015, there have been a number of breakthroughs in identifying genetic risk factors for depression.4-8 In 2019, the Psychiatric Genomics Consortium (PGC) identified 102 independent variants associated with depression. In 2021, a meta-analysis including the Million Veterans Project identified 233 associated variants.9 To achieve the extensive sample sizes needed to identify these loci, a large proportion of cases were defined based on (1) responses to a single screening question regarding seeking professional help for depression, worries, or tension; (2) a self-reported diagnosis of depression during a nurse-led interview in the UK Biobank; (3) online assessment in 23andMe; or (4) a diagnosis from electronic health records (collectively referred to as minimal phenotyping). Thus, many of the individuals were either not assessed for or did not meet the criteria for MDD as defined by the DSM-5.10
Cai and colleagues11 found evidence for differences in genetic architecture between depression defined using minimal phenotyping and MDD assessed using a diagnostic questionnaire, including a higher heritability and lack of enrichment of association in genes expressed in the brain for clinically defined depression and nonspecificity of loci identified using minimal phenotyping. Including minimally phenotyped patients and controls thus substantially boosts power to detect genetic loci, but may increase heterogeneity within and across cohorts and so miss clinically important genetic effects specific to MDD.
The proliferation of large, population-based health studies with genomic information and the increasing availability of administrative health data with diagnostic codes for depression might facilitate valuable insights into the cause of depression. However, the extent to which genetic findings from depression defined by minimal phenotyping extend to clinical diagnoses of depression using diagnostic questionnaires or interviews is a key issue that will inform the interpretation and design of future studies.
Herein, we used the Australian Genetics of Depression Study (AGDS), a large online study of the genetic cause of depression,12 to investigate how polygenic risk scores (PRSs) constructed from different definitions of depression and meta-analyses encompassing multiple definitions map to specific features of clinical depression, such as age at onset, severity, reported trauma, and psychiatric and physical comorbidities. The large sample size and breadth of phenotyping make this a unique cohort for dissecting the genetic architecture of depression.

Methods
A schematic overview of the design of the study is shown in eFigure 1 in the Supplement. All protocols and questionnaires were approved by the QIMR Berghofer Medical Research Institute Human Research Ethics Committee. Data analysis for this study was conducted from September 10, 2020, to January 27, 2021. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.

The Australian Genetics of Depression Study
The AGDS is a large ongoing study of the causes of depression and treatment response. The recruitment and sample characteristics of the AGDS have been described in detail elsewhere.5 This present study uses data from the first data freeze in September 2018. Between 2016 and 2018, 20 689 participants (age, 28-58 years; 75% women) provided online consent and enrolled in the study. Participants completed a compulsory module that included the Composite International Diagnostic Interview Short Form13 to assess diagnostic criteria for depression. The compulsory module also assessed psychiatric comorbidities. Before September 2018, a total of 15 792 participants had provided a saliva sample (GeneFix; Isohelix saliva kit).
We evaluated the association between depression PRSs and a number of clinical features of depression in individuals meeting DSM-5 criteria for MDD. These features included early age at onset (defined as reported age at first episode of depression <21 years), reporting more than 2 episodes of depression, childhood trauma (defined as having experienced sexual, physical, or emotional abuse before age 18 years), and a self-reported diagnosis of an anxiety disorder, bipolar disorder, migraine, chronic fatigue, or chronic pain. Furthermore, we investigated the self-reported current measures of psychological distress and somatic symptoms determined using the PSYCH and SOMA subscales, respectively, of the SPHERE-12.14 The sample sizes for each of the phenotypes are shown in eTable 1 in the Supplement.

QSkin Study
The QSkin sun and health study is a prospective cohort study initiated in 2011 primarily to examine skin cancer outcomes. Participants aged 40 to 70 years responded to a mailing to residents of Queensland, Australia, selected at random from the
electoral roll (n = 43 794). A total of 17 218 QSkin participants provided a saliva sample in 2014; answered the lifestyle questionnaire, which included a disease checklist comprising questions about ever having been diagnosed with psychiatric disorders; and provided consent for their data to be used for future research. Participants of European ancestry who reported not having been given a diagnosis of any psychiatric disorder were selected as controls for the case-control analysis. Those who reported a diagnosis of depression were included in the case cohort.

Depression Phenotypes Used to Generate PRSs
We evaluated the association of PRSs from summary statistics derived from 9 different genome-wide association studies (GWASs) of depression (Table).4,6,11,15 First, we used the results of the most recent published analysis of the Psychiatric Genomics Consortium Major Depression Working Group (PGC 2019),6 to our knowledge, the largest published study of summary statistics available, with the Australian samples (listed as the QIMR cohort) removed to ensure there was no chance of sample overlap. The PGC 2019 study is a meta-analysis including clinical cohorts, population registers, data from 23andMe, and broadly defined depression in the UK Biobank. 23andMe participants provided informed consent and participated in the research online, under a protocol approved by the external Association for the Accreditation of Human Research Protection Programs–accredited institutional review board. Second, we used the published results from a GWAS of broad depression in the UK Biobank that includes individuals with depression defined by answering yes to having sought help for nerves, anxiety, tension, or depression or a diagnosis of depression using linked hospital records. Third, we used summary statistics from the cohorts with clinically defined MDD in the PGC 2019 study. These groups are described as the PGC29 cohorts by Wray et al.4 The summary statistics do not include the QIMR cohorts, but for consistency with previous studies, we refer to this discovery sample as PGC29. The remaining 6 phenotypes and their corresponding downsampled results are from the study of Cai et al,11 who conducted GWASs using 6 different definitions of depression in the UK Biobank. These definitions were based on measures including responses to single questions regarding help seeking, depression diagnoses obtained from linked health records, and MDD defined using DSM-5 criteria. Because different definitions produce widely varying numbers of cases and controls, which will affect power, we further evaluated the performance of PRSs derived from the 6 definitions of depression in the UK Biobank when each definition is downsized to give equal numbers of cases and controls between definitions (7500 cases and 42 500 controls) using the summary statistics provided by Cai et al.11 Because the PGC 2019, broadly defined depression in the UK Biobank, and PGC29 studies include depression diagnoses defined in multiple ways rather than a single strict definition, downsampling was not performed.

Polygenic Risk Scores
Details of the genotyping and quality control are provided in the eMethods in the Supplement. SBayesR,16 a bayesian method that assumes that single-nucleotide variant (SNV) effects are drawn from a mixture of four 0-mean normal distributions with different variances, was used to generate the weights for the PRSs. This method rescales the GWAS SNV effects with many SNVs assumed to have an effect size of 0. Full details are provided in the eMethods in the Supplement. The posterior SNV effects estimated by SBayesR were used to generate PRSs for each individual using the score function in PLINK.
Polygenic risk scores were standardized to calculate the effect size per SD unit of PRS. We also used linkage disequilibrium score regression17 to calculate the SNV-based heritability for clinical and self-reported depression and the genetic correlation with depression phenotypes from the UK Biobank.
In addition to evaluating the association with clinical depression in AGDS and self-reported depression in QSkin, we examined the association between depression PRSs and a number of clinical features of depression in individuals meeting MDD criteria in the AGDS. These features included early age at onset (defined as reported age at first episode of depression <21 years), reporting more than 2 episodes of depression, childhood trauma (defined as having experienced sexual, physical, or emotional abuse before age 18 years, assessed using part A of the PTSD Checklist for DSM-518), and a self-reported diagnosis of an anxiety disorder, bipolar disorder, migraine, chronic fatigue, or chronic pain. Furthermore, we investigated the self-reported current measures of psychological distress and somatic symptoms measured using the PSYCH and SOMA subscales of the SPHERE-12.14 The sample sizes for each of the phenotypes are reported in eTable 1 in the Supplement.
Each of the full and downsampled PRSs was regressed against the clinical phenotypes of interest using logistic regression for binary variables and linear regression for continuous variables. Continuous variables were standardized before the regression. All analyses included age at enrollment, sex, and 10 genetic principal components as covariates.

Results
A total of 12 106 (75% female; mean age, 42.3 years; range, 18-88 years) participants of European ancestry who met DSM-5 criteria for MDD were included. A further set of individuals (3083; 68% female) who self-reported a diagnosis of depression but for whom diagnostic criteria were not assessed was drawn from the QSkin study.19 Participants of European ancestry from QSkin who reported not having a diagnosis of any psychiatric disorder were included as controls (12 621; 51% female; mean age, 60.9 years; range, 43-87 years). The SNV-based heritability on the liability scale when comparing individuals with MDD in the AGDS with controls was 0.16 (0.02) and comparing QSkin participants with self-reported depression with controls was 0.12 (0.06) (eTable 2 in the Supplement).
We evaluated the association of each PRS with case status in the AGDS and QSkin. Regardless of whether the target sample included participants assessed for lifetime MDD (Figure 1A; eTable 3 in the Supplement) or a self-report diagnosis of depression (Figure 1C), the larger the sample size of the discovery GWAS, the larger the effect size of the PRS in the target sample: the largest study (PGC 2019) had the largest odds ratio for MDD (1.75; 95% CI, 1.73-1.77 per SD of PRS), and the PRS derived from International Statistical Classification of Diseases, 10th Edition (ICD-10) codes documented in hospitalization records in a population health cohort had the lowest odds ratio (1.14; 95% CI, 1.12-1.16). For all PRSs, the effect size was larger in the individuals with lifetime MDD, indicating that patients meeting clinical criteria in AGDS have a higher mean depression PRS than those who report having a depression diagnosis in the QSkin community sample. Given equal sample sizes, the lifetime MDD PRS had associations with higher ORs with lifetime MDD (OR, 1.20; 95% CI, 1.16-1.24) than the other definitions, such as PsyPsy (OR, 1.12; 95% CI, 1.08-1.15) (Figure 1B; eTable 3 in the Supplement). This association was not found when evaluating self-reported depression in QSkin, in which diagnostic criteria were not assessed (Figure 1D). Consistent with these results, the estimated genetic correlation with lifetime MDD in the UK Biobank when including patients with clinically defined MDD was higher (genetic correlation, 0.92; SE, 0.11), compared with when including self-reported cases (genetic correlation, 0.78; SE, 0.25), although this difference was not statistically significant (eTable 2 in the Supplement). Similarly, despite a larger sample size than the downsampled GWASs, the PRS derived from individuals with clinically defined MDD in PGC29 was not significantly more associated with self-reported depression in QSkin (Figure 1D). To investigate whether screening controls for all psychiatric disorders affected the results, we repeated the analysis using controls who reported only that they had not been diagnosed with depression (n = 13 696). The increased association of the MDD PRS in the individuals meeting MDD criteria remained (eFigure 2 in the Supplement).
We further sought to evaluate whether there are other clinical features of depression that are better captured by the clinically defined PRS. The results are shown in eFigure 3 and eTable 4 in the Supplement. Across all clinical measures examined, the PRSs from the largest PGC meta-analysis had the largest effect size. Likewise, when considering the different definitions of depression in the UK Biobank, the broad definition that encompasses multiple definitions and self-reports of seeing a physician for nerves, anxiety, tension, or worry, which has the largest sample size, generally gives the best estimation. By contrast, there are a number of notable features of the lifetime MDD PRS. First, consistent with it better capturing the genetic risk for depression that is not shared with other major psychiatric disorders, the lifetime MDD PRS was not significantly higher in those reporting a comorbid anxiety disorder (OR, 1.02; 95% CI, 0.98-1.06; P = .31) or comorbid bipolar disorder (OR, 1.01; 95% CI, 0.95-1.09; P = .80). In comparison, PRSs from multiple other definitions, including both the ICD-10 codes from electronic records and self-reports of seeing a physician for nerves, anxiety, tension, or worry in the UK Biobank, were significantly increased in those with comorbidities. Second, when accounting for differences in sample size, the lifetime MDD PRS had associations with higher ORs with reporting childhood trauma (OR, 1.14; 95% CI, 1.09-1.19) vs the PRS with the next highest OR among the other definitions (OR, 1.07; 95% CI, 1.02-1.12). Third, when accounting for sample size, the lifetime MDD and ICD-10 PRSs are better than other definitions at estimating current levels of somatic distress (eFigure 3 and eTable 4 in the Supplement).
Given the high prevalence of somatic symptoms reported by patients with more severe depressive disorders,20 we hypothesized that genetic analyses based on clinical definitions of depression better capture risk of somatic symptoms of depression than do definitions based on a single question or multiple screening questions, particularly when that question focuses on mood or psychological distress alone. We next investigated the association between the PRSs and current levels of mental and physical health measured on a scale from 1 (very poor) to 5 (excellent). Both the lifetime MDD PRS (β = −0.01 [0.009]; P = .29) and PGC29 PRS (β = −0.004 [0.01]; P = .67) showed no evidence of association with current mental health but showed evidence of association with physical health (Lifetime MDD, β = −0.041 [0.009]; P = 6.05 × 10^-6; PGC29, β = −0.023 [0.009]; P = .01). When considering equal sample sizes, the lifetime MDD PRS has the highest effect size with
physical health (β = −0.035 [0.009]; P = 7.6 × 10^-5) than other definitions, with the next largest being the ICD-10–based PRS (β = −0.026 [0.009]; P = .003) (eFigure 4 and eTable 5 in the Supplement).
In addition, we investigated which PRSs are associated with reporting common physical comorbidities of depression. When discovery GWAS sample sizes are equal, the lifetime MDD PRS had associations with higher ORs with migraine (OR, 1.08; 95% CI, 1.04-1.12; P = 4.76 × 10^-5) than the PRS with the next highest OR (GPPsy; OR, 1.02; 95% CI, 0.98-1.07; P = .36). Similarly, the lifetime MDD PRS had associations with higher ORs with chronic fatigue syndrome (OR, 1.13; 95% CI, 1.05-1.22; P = 7.11 × 10^-4); the next most associated PRS was derived from ICD-10 codes (OR, 1.04; 95% CI, 0.97-1.11; P = .34).

Figure 1. Association of Depression Polygenic Risk Scores (PRSs) With Clinically Defined Major Depressive Disorder (MDD) and Self-reported Depression
[Forest plots, panels A-D. Rows: discovery GWAS (ICD-10, SelfRepDep, DepAll, PsyPsy, GPPsy, Lifetime MDD, PGC29, UKB Broad, PGC 2019). x-Axis: OR per SD increase in profile score, 1.0 to 1.8.]
Results from estimating depression in target samples using PRSs from different depression genome-wide association study (GWAS) discovery samples. Full indicates the total sample for each discovery GWAS, and downsampled indicates each discovery GWAS downsampled to 7500 patients and 42 500 controls. For illustrative purposes, the PGC29 PRS effect size is plotted in both the full and downsampled panels, but this GWAS was not downsampled. Estimation of depression in the AGDS full (A) and downsampled (B) cohorts and the QSkin full (C) and downsampled (D) cohorts. The PGC 2019 PRS, which has the largest sample size, was the best estimator of depression in both cases assessed using DSM-5 criteria (A) and those assessed using a single self-report item (C). When sample sizes were equal, the lifetime MDD PRS was a better estimator of case status in those meeting DSM-5 criteria (B) but not in those assessed using minimal phenotyping (D). Circles indicate the odds ratio per SD in profile score with lines showing the 95% CIs. DepAll indicates self-report of seeing a general practitioner for nerves, anxiety, tension, or worry and at least 2 weeks of depression or anhedonia in the UK Biobank (21 777 cases and 58 396 controls); GPPsy, self-report of seeing a general practitioner for nerves, anxiety, tension, or worry in the UK Biobank (113 262 cases and 219 360 controls); ICD-10, International Statistical Classification of Diseases, 10th Edition code for depression from linked electronic health records in UK Biobank (9176 cases, 203 235 controls); Lifetime MDD, patients meeting DSM-5 criteria for MDD in the UK Biobank and controls that screened negative for MDD (16 301 patients and 50 870 controls); PGC29, meta-analysis of cohorts from PGC-MDD study with clinical diagnoses from interviews or from clinicians (14 833 cases and 23 921 controls); PGC 2019, largest GWAS of depression published to date (includes 246 819 clinically defined and minimally phenotyped patients and 561 485 controls); PsyPsy, self-report of seeing a psychiatrist for nerves, anxiety, tension, or worry in the UK Biobank (36 286 patients, 297 126 controls); SelfRepDep, self-report of history of depression in interview with trained nurses in the UK Biobank (19 805 cases, 234 114 controls); and UKB Broad, self-report of seeing a general practitioner or psychiatrist in the UK Biobank (113 769 cases, 208 811 controls).
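The "OR per SD" values reported in these Results and plotted in Figure 1 come from regressing case status on a standardized PRS. The following is a minimal sketch of that calculation, not the authors' pipeline: it assumes a single predictor with no covariates (the study also adjusted for age, sex, and 10 genetic principal components), uses simulated data with an arbitrary true effect, and the function names `standardize` and `fit_logistic` are illustrative.

```python
import math
import random

def standardize(xs):
    """Rescale values to mean 0 and SD 1 (population SD)."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / sd for x in xs]

def fit_logistic(x, y, iters=25):
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson.

    Returns (b0, b1, se_b1). The standard error is taken from the
    information matrix of the final iteration, which is adequate
    once the estimates have converged.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient, intercept
            g1 += (yi - p) * xi     # gradient, slope
            h00 += w                # 2x2 information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information) * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1

# Simulated target sample (assumption: true log OR per SD = 0.5).
random.seed(0)
raw_prs = [random.gauss(10.0, 2.0) for _ in range(4000)]
status = []
for score in raw_prs:
    z = (score - 10.0) / 2.0
    p_case = 1.0 / (1.0 + math.exp(-(-0.2 + 0.5 * z)))
    status.append(1 if random.random() < p_case else 0)

prs_z = standardize(raw_prs)
_, beta, se = fit_logistic(prs_z, status)

or_per_sd = math.exp(beta)
ci_low = math.exp(beta - 1.96 * se)
ci_high = math.exp(beta + 1.96 * se)
print(f"OR per SD of PRS: {or_per_sd:.2f} (95% CI, {ci_low:.2f}-{ci_high:.2f})")
```

Because the score is standardized first, the exponentiated slope is directly comparable across PRSs built from different discovery GWASs, which is what allows the panels of Figure 1 to share one axis.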
The lifetime MDD PRS was also associated with chronic pain (OR, 1.07; 95% CI, 1.01-1.13; P = .02); however, the results from other PRSs were comparable (ICD-10 PRS, OR, 1.06; 95% CI, 1.00-1.12; P = .04) (Figure 2; eTable 6 in the Supplement). This pattern of results suggests that selecting individuals with depression and controls by screening for diagnostic criteria for MDD gives a genetic risk score with associations with higher ORs with physiologic perturbations and phenotypes characterized by somatic symptoms than other definitions of depression. However, the PGC29 PRS, which has only clinically defined cases, was associated only with comorbid migraine (OR, 1.06; 95% CI, 1.02-1.11; P = .005).

Discussion
We evaluated the association of PRSs generated from different discovery samples of depression with depression in individuals meeting clinical criteria and self-reported depression. We found that estimation in the target samples was proportional to the sample size of the discovery GWAS, despite the larger GWASs depending on minimally phenotyped cases. Consistent with the findings of Cai et al,11 we found that when sample sizes of the discovery GWAS were equal, the clinical MDD PRS appeared to be more strongly associated with case status in patients with MDD, but not in patients who self-reported a diagnosis of depression without being assessed using the MDD criteria. This finding supports the conjecture of Cai et al that GWASs including only patients and controls screened for diagnostic criteria capture a genetic component of risk specific to MDD.
Analyses of clinical phenotypes of MDD showed that when sample sizes are equal, the lifetime MDD PRS is also associated with poor physical health, higher rates of somatic symptoms, and having comorbid migraine or chronic fatigue syndrome. A similar pattern was seen for the PRSs generated using ICD-10 codes for depression from electronic health records, although not as stark as for the MDD PRS (eFigure 3 in the Supplement). In contrast, PRSs derived from analyses using minimal phenotyping are not significantly less useful at estimating measures of severity, such as age at onset and number of episodes. The PGC29 PRS also showed evidence of association with somatic symptoms, but with lower effect sizes than the lifetime MDD PRS. Although patients in the PGC29 discovery GWAS met clinical criteria, there were differences in the ascertainment of patients across the cohorts in the PGC studies and, perhaps more importantly, differences in the screening of controls, with some cohorts using unscreened controls, which may have affected the results.
Somatic symptoms are common in patients with MDD and include fatigue, headaches, and back pain,21 and previous studies have found that a large proportion of patients meeting the criteria for depression present initially to primary care clinicians with somatic symptoms.22 Painful somatic symptoms are associated with increased functional impairment20 and poorer outcomes in patients with depression.21,23
Our results have important implications. If genetic information will have utility in estimating who in the population is at risk of having MDD, then increasing sample size of the discovery sample for GWASs of depression, regardless of the depth of phenotyping, should remain a high priority. If we seek to understand more completely the neurobiological underpinnings of more clinical forms of MDD, then as postulated by Cai and colleagues,11 minimal phenotyping will not capture all of the genetic risk for depression. However, even if studies that do not assess diagnostic criteria for MDD capture genetic risk that is nonspecific, the identified genetic risk factors contribute to the severity of depression as measured by earlier age at onset and chronicity and are therefore of major importance to elucidating the cause of depression.
Another key implication concerns investigating gene-by-environment interactions with childhood trauma using PRSs. Although the PRSs from all of the definitions of depression are enriched in individuals reporting trauma (eFigure 3 in the Supplement), the lifetime MDD PRS has the largest effect size. Thus, the phenotype definition in both the discovery and target samples may affect the results of PRS-by-trauma analyses, and screening for diagnostic criteria in patients and controls will be informative for untangling the association between genetic and environmental risks for depression. The increased PRSs in individuals reporting trauma are consistent with the findings of Coleman et al,24 who showed that the SNV-based heritability was higher in patients with MDD who reported childhood trauma in the UK Biobank. The higher OR observed for the MDD PRS may reflect the phenotypic association of trauma with MDD compared with other definitions, as described by Cai et al.11 The phenotypic association of trauma with MDD could induce gene-environment associations influenced by differences in socioeconomic status,25 which would manifest in the discovery GWASs as genetic effects.25,26 Within-family analyses will be valuable for further investigating differences in polygenic risk between patients with and without exposure to trauma.27

Limitations
The study had limitations. Both the UK Biobank discovery sample and the AGDS rely on structured diagnostic questionnaires to assess criteria for MDD. Although these instruments have been found to have good validity, an interview with a trained clinician remains the standard in diagnosing MDD, and the results of this study should be viewed with caution. Likewise, the UK Biobank, AGDS, and QSkin are cohort studies that have not recruited participants in the clinical setting. They therefore may not be representative of the full clinical spectrum of MDD in the population. In addition, participants in the discovery and target studies are mostly of British and Irish ancestries, and these results may not generalize to other ancestral groups both within and outside Europe.

Conclusions
Results of this case-control study suggest that increasing sample sizes by including patients defined in numerous ways is essential to enhancing our understanding of genetic risk for
Figure 2. Association Between Depression Polygenic Risk Scores (PRSs) and Self-reported Diagnosis of Physical Comorbidities
in Individuals With Major Depressive Disorder (MDD)
Discovery GWAS
ICD-10 DepAll GPPsy PGC29 PGC 2019
SelfRepDep PsyPsy Lifetime MDD UKB Broad
Migraine Migraine
Full Downsampled
ICD-10
SelfRepDep
DepAll
PsyPsy
Trait
GPPsy
Lifetime MDD
PGC29
UKB Broad
PGC 2019
–0.1 0 0.1 0.2 –0.1 0 0.1 0.2
β (SE) β (SE)
SelfRepDep
DepAll
PsyPsy
Trait
GPPsy
Lifetime MDD
PGC29
UKB Broad
PGC 2019
–0.1 0 0.1 0.2 –0.1 0 0.1 0.2
β (SE) β (SE)
SelfRepDep
DepAll
PsyPsy
Trait
GPPsy
Lifetime MDD
PGC29
UKB Broad
PGC 2019
–0.1 0 0.1 0.2 –0.1 0 0.1 0.2
β (SE) β (SE)
Results from estimating comorbid physical disorders in patients with MDD in MDD, patients meeting DSM-5 criteria for MDD in the UK Biobank and controls
the Australian Genetics of Depression Study. Full indicates the total sample for that screened negative for MDD (16 301 patients and 50 870 controls);
each discovery genome-wide association study (GWAS). When sample sizes are PGC29, meta-analysis of cohorts from PGC-MDD study with clinical diagnoses
equal, the lifetime MDD PRS was a better estimator of comorbid migraine from interviews or from clinicians (14 833 cases and 23 921 controls);
(A) and chronic fatigue syndrome (B) and had the largest effect size for chronic PGC 2019, largest published GWAS of depression published to date (includes
pain (C). DepAll indicates self-report of seeing a general practitioner for nerves, 246 819 clinically defined and minimally phenotyped patients and 561 485
anxiety, tension, or worry and at least 2 weeks of depression or anhedonia in controls); PsyPsy, self-report of seeing a psychiatrist for nerves, anxiety,
the UK Biobank (21 777 cases and 58 396 controls); GPPsy, self-report of seeing tension, or worry in the UK Biobank (36 286 patients, 297 126 controls);
a general practitioner for nerves, anxiety/tension, worry in the UK Biobank SelfRepDep, self-report of history of depression in interview with trained nurses
(113 262 cases and 219.360 controls); ICD-10, International Statistical in the UK Biobank (19 805 cases, 234 114 controls), and UKB Broad, self-report
Classification of Diseases, 10th Edition code for depression from linked of seeing a general practitioner or psychiatrist in the UK Biobank (113 769 cases,
electronic health records in UK Biobank (9176 cases, 203 235 controls); Lifetime 208 811 controls).
depression and generating more accurate PRSs for use in re- are needed. The AGDS demonstrates that it is feasible to es-
search and clinical settings. However, to see a complete pic- tablish large genetically informative cohorts with in-depth on-
ture of the biological characteristics of depression, large, well- line phenotyping that can provide meaningful insights into the
phenotyped cohorts that are enriched for clinical depression cause of depression.
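As a rough aid to reading the Figure 2 legend, the sketch below computes an effective sample size for each discovery GWAS from the case and control counts listed there. The formula N_eff = 4 / (1/N_cases + 1/N_controls) is a common convention for case-control studies and is an assumption here: the article does not state that it was used, and the procedure behind the Downsampled panels is not reproduced. The function names are hypothetical, and the GPPsy control count is read as 219 360.

```python
# Case and control counts as listed in the Figure 2 legend.
# Assumption: the GPPsy control count is read as 219 360.
DISCOVERY_GWAS = {
    "DepAll": (21_777, 58_396),
    "GPPsy": (113_262, 219_360),
    "ICD-10": (9_176, 203_235),
    "Lifetime MDD": (16_301, 50_870),
    "PGC29": (14_833, 23_921),
    "PGC 2019": (246_819, 561_485),
    "PsyPsy": (36_286, 297_126),
    "SelfRepDep": (19_805, 234_114),
    "UKB Broad": (113_769, 208_811),
}


def effective_sample_size(n_cases, n_controls):
    """Effective N of a case-control study: 4 / (1/cases + 1/controls).

    This is a conventional approximation, not a method taken from the article.
    """
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


def rank_by_effective_n(gwas=DISCOVERY_GWAS):
    """Return (name, effective N) pairs sorted from largest to smallest."""
    rows = [(name, effective_sample_size(*counts)) for name, counts in gwas.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, n_eff in rank_by_effective_n():
        print(f"{name:<13} N_eff ~ {n_eff:,.0f}")
```

Under this convention a study with few cases and many controls, such as ICD-10, has a much smaller effective size than its total count suggests, which is one reason comparisons at equal sample size are informative.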
ARTICLE INFORMATION
Accepted for Publication: June 7, 2021.
Published Online: August 11, 2021. doi:10.1001/jamapsychiatry.2021.1988
Author Affiliations: QIMR Berghofer Medical Research Institute, Brisbane, Australia (Mitchell, Thorp, Campos, Gordon, Whiteman, Olsen, Martin, Medland); School of Biomedical Sciences, Faculty of Health, Queensland University of Technology, Brisbane, Australia (Mitchell, Nyholt); Faculty of Medicine, The University of Queensland, Brisbane, Australia (Thorp, Campos); Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia (Wu, Wray, Byrne); School of Biomedical Sciences, The University of Queensland, Brisbane, Australia (Campos); Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, Australia (Nyholt); Brain and Mind Centre, The University of Sydney, Sydney, New South Wales, Australia (Hickie); Queensland Brain Institute, The University of Queensland, Brisbane, Australia (Wray); Child Health Research Centre, The University of Queensland, Brisbane, Australia (Byrne).
Author Contributions: Dr Byrne had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Mitchell, Thorp, Nyholt, Hickie, Martin, Medland, Wray, Byrne.
Acquisition, analysis, or interpretation of data: Mitchell, Thorp, Wu, Campos, Nyholt, Gordon, Whiteman, Olsen, Martin, Medland, Wray, Byrne.
Drafting of the manuscript: Mitchell, Byrne.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Mitchell, Thorp, Wu, Campos, Gordon, Medland, Wray, Byrne.
Obtained funding: Whiteman, Olsen, Hickie, Martin, Wray, Byrne.
Administrative, technical, or material support: Mitchell, Campos, Nyholt, Whiteman, Olsen, Hickie, Medland, Wray.
Supervision: Martin, Wray.
Conflict of Interest Disclosures: Dr Campos reported receiving grants from The University of Queensland during the conduct of the study. Dr Whiteman reported receiving grants from the National Health and Medical Research Council (NHMRC) of Australia Fellowship for salary and competitive grants to support data collection and analysis during the conduct of the study and speaker's fees from Pierre Fabre for conference presentation outside the submitted work. Dr Olsen reported receiving grants from the NHMRC of Australia during the conduct of the study. Dr Hickie was an inaugural commissioner on Australia's National Mental Health Commission (2012-2018) and is the codirector, Health and Policy at the Brain and Mind Centre (BMC) University of Sydney, Australia. The BMC operates an early-intervention youth services at Camperdown under contract to headspace. Professor Dr Hickie had previously led community-based and pharmaceutical industry-supported (Wyeth, Eli Lilly, Servier, Pfizer, AstraZeneca) projects focused on the identification and better management of anxiety and depression. He was a member of the Medical Advisory Panel for Medibank Private until October 2017, a board member of Psychosis Australia Trust, and a member of Veterans Mental Health Clinical Reference group. He is the chief scientific advisor to and a 5% equity shareholder in InnoWell Pty Ltd, which was formed by the University of Sydney (45% equity) and PwC (Australia; 45% equity) to deliver the $30 million Australian government-funded Project Synergy (2017-2020, a 3-year program for the transformation of mental health services), and to lead transformation of mental health services internationally through the use of innovative technologies. No other disclosures were reported.
Funding/Support: The Australian Genetics of Depression Study was primarily funded by grant 1086683 from the NHMRC of Australia. This work was further supported by NHMRC grants 1145645, 1078901, and 108788, and by National Institutes of Health grant 1R01MH121545-01.
Role of the Funder/Sponsor: The funding organizations had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
Additional Contributions: We are indebted to all of the participants for giving their time to contribute to this study. We thank all the people who helped in the conception, implementation, beta testing, media campaign, and data cleaning. We are grateful to the Psychiatric Genomics Consortium Major Depressive Disorder Working Committee for making summary statistics available for research. We are grateful to all of the principal investigators, researchers, and participants of the cohorts in the Psychiatric Genomics Consortium. We also thank the research participants and employees of 23andMe for making this work possible.
Additional Information: The full GWAS summary statistics for the 23andMe discovery data set will be made available through 23andMe to qualified researchers under an agreement with 23andMe that protects the privacy of the 23andMe participants. More information and application to access the data are available at https://research.23andme.com/collaborate/#dataset-access/.
REFERENCES
1. GBD 2019 Diseases and Injuries Collaborators. Global burden of 369 diseases and injuries in 204 countries and territories, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet. 2020;396(10258):1204-1222. doi:10.1016/S0140-6736(20)30925-9
2. Lynch CJ, Gunning FM, Liston C. Causes and consequences of diagnostic heterogeneity in depression: paths to discovering novel biological depression subtypes. Biol Psychiatry. 2020;88(1):83-94. doi:10.1016/j.biopsych.2020.01.012
3. Sullivan PF, Neale MC, Kendler KS. Genetic epidemiology of major depression: review and meta-analysis. Am J Psychiatry. 2000;157(10):1552-1562. doi:10.1176/appi.ajp.157.10.1552
4. Wray NR, Ripke S, Mattheisen M, et al; eQTLGen; 23andMe; Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium. Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat Genet. 2018;50(5):668-681. doi:10.1038/s41588-018-0090-3
5. CONVERGE consortium. Sparse whole-genome sequencing identifies two loci for major depressive disorder. Nature. 2015;523(7562):588-591. doi:10.1038/nature14659
6. Howard DM, Adams MJ, Clarke TK, et al; 23andMe Research Team; Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium. Genome-wide meta-analysis of depression identifies 102 independent variants and highlights the importance of the prefrontal brain regions. Nat Neurosci. 2019;22(3):343-352. doi:10.1038/s41593-018-0326-7
7. Hyde CL, Nagle MW, Tian C, et al. Identification of 15 genetic loci associated with risk of major depression in individuals of European descent. Nat Genet. 2016;48(9):1031-1036. doi:10.1038/ng.3623
8. Levey DF, Stein MB, Wendt FR, et al. GWAS of depression phenotypes in the Million Veteran Program and meta-analysis in more than 1.2 million participants yields 178 independent risk loci. medRxiv. 2020:2020.2005.2018.20100685. doi:10.1101/2020.05.18.20100685
9. Levey DF, Stein MB, Wendt FR, et al; 23andMe Research Team; Million Veteran Program. Bi-ancestral depression GWAS in the Million Veteran Program and meta-analysis in >1.2 million individuals highlight new therapeutic directions. Nat Neurosci. 2021;24(7):954-963. doi:10.1038/s41593-021-00860-2
10. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 5th ed. American Psychiatric Association; 2013.
11. Cai N, Revez JA, Adams MJ, et al; MDD Working Group of the Psychiatric Genomics Consortium. Minimal phenotyping yields genome-wide association signals of low specificity for major depression. Nat Genet. 2020;52(4):437-447. doi:10.1038/s41588-020-0594-5
12. Byrne EM, Kirk KM, Medland SE, et al. Cohort profile: the Australian Genetics of Depression Study. BMJ Open. 2020;10(5):e032580. doi:10.1136/bmjopen-2019-032580
13. Kessler RC, Andrews G, Mroczek D, Ustun B, Wittchen H-U. The World Health Organization Composite International Diagnostic Interview short-form (CIDI-SF). Int J Methods Psychiatr Res. 1998;7(4):171-185. doi:10.1002/mpr.47
14. Hickie IB, Davenport TA, Hadzi-Pavlovic D, et al. Development of a simple screening tool for common mental disorders in general practice. Med J Aust. 2001;175(S1)(suppl):S10-S17. doi:10.5694/j.1326-5377.2001.tb143784.x
15. Howard DM, Adams MJ, Shirali M, et al; 23andMe Research Team. Genome-wide association study of depression phenotypes in UK Biobank identifies variants in excitatory synaptic pathways. Nat Commun. 2018;9(1):1470. doi:10.1038/s41467-018-03819-3
16. Lloyd-Jones LR, Zeng J, Sidorenko J, et al. Improved polygenic prediction by bayesian multiple regression on summary statistics. Nat Commun. 2019;10(1):5086. doi:10.1038/s41467-019-12653-0
17. Bulik-Sullivan BK, Loh PR, Finucane HK, et al; Schizophrenia Working Group of the Psychiatric Genomics Consortium. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet. 2015;47(3):291-295. doi:10.1038/ng.3211
18. US Dept of Veterans Affairs. PTSD: National Centers for PTSD. Accessed July 12, 2021. www.ptsd.va.gov
19. Olsen CM, Green AC, Neale RE, et al; QSkin Study. Cohort profile: the QSkin Sun and Health Study. Int J Epidemiol. 2012;41(4):929-929i. doi:10.1093/ije/dys107
20. Fritzsche K, Sandholzer H, Brucks U, et al. Psychosocial care by general practitioners - where are the problems? results of a demonstration project on quality management in psychosocial primary care. Int J Psychiatry Med. 1999;29(4):395-409. doi:10.2190/MCGF-CLD4-0FRE-N2UK
21. Vaccarino AL, Sills TL, Evans KR, Kalali AH. Prevalence and association of somatic symptoms in patients with major depressive disorder. J Affect Disord. 2008;110(3):270-276. doi:10.1016/j.jad.2008.01.009
22. Simon GE, VonKorff M, Piccinelli M, Fullerton C, Ormel J. An international study of the relation between somatic symptoms and depression. N Engl J Med. 1999;341(18):1329-1335. doi:10.1056/NEJM199910283411801
23. McIntyre RS, Konarski JZ, Mancini DA, et al. Improving outcomes in depression: a focus on somatic symptoms. J Psychosom Res. 2006;60(3):279-282. doi:10.1016/j.jpsychores.2005.09.010
24. Coleman JRI, Peyrot WJ, Purves KL, et al; on the behalf of Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium. Genome-wide gene-environment analyses of major depressive disorder and reported lifetime traumatic experiences in UK Biobank. Mol Psychiatry. 2020;25(7):1430-1446. doi:10.1038/s41380-019-0546-6
25. Marees AT, Smit DJA, Abdellaoui A, et al. Genetic correlates of socio-economic status influence the pattern of shared heritability across mental health traits. Nat Hum Behav. 2021. doi:10.1038/s41562-021-01053-4
26. Abdellaoui A, Hugh-Jones D, Yengo L, et al. Genetic correlates of social stratification in Great Britain. Nat Hum Behav. 2019;3(12):1332-1342. doi:10.1038/s41562-019-0757-5
27. Howe LJ, Nivard MG, Morris TT, et al. Within-sibship GWAS improve estimates of direct genetic effects. bioRxiv. 2021:2021.2003.2005.433935.