Symptoms and Risk Factors For Long COVID in Non-Hospitalized Adults


Articles

https://doi.org/10.1038/s41591-022-01909-w

Symptoms and risk factors for long COVID in non-hospitalized adults
Anuradhaa Subramanian1, Krishnarajah Nirantharakumar 1,2,3 ✉, Sarah Hughes 1,4,5,6,7, Puja Myles8,
Tim Williams8, Krishna M. Gokhale1, Tom Taverner1, Joht Singh Chandan 1, Kirsty Brown 1,9,
Nikita Simms-Williams1, Anoop D. Shah10, Megha Singh1, Farah Kidy1,11, Kelvin Okoth1, Richard Hotham1,
Nasir Bashir 12, Neil Cockburn1, Siang Ing Lee1, Grace M. Turner1,4,13, Georgios V. Gkoutos 2,3,14,15,16,
Olalekan Lee Aiyegbusi 1,4,5,6,7,15, Christel McMullan1,4,7,13,17, Alastair K. Denniston 2,3,4,6,15,16,
Elizabeth Sapey16,18,19, Janet M. Lord13,15,18,20, David C. Wraith 15,21, Edward Leggett8, Clare Iles8,
Tom Marshall1, Malcolm J. Price1,15, Steven Marwaha22,23, Elin Haf Davies24, Louise J. Jackson 1,
Karen L. Matthews25, Jenny Camaradou25, Melanie Calvert 1,2,3,4,5,6,7,13,15,19 and Shamil Haroon 1

Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) infection is associated with a range of persistent symptoms impacting everyday functioning, known as post-COVID-19 condition or long COVID. We undertook a retrospective matched cohort study using a UK-based primary care database, Clinical Practice Research Datalink Aurum, to determine symptoms that are associated with confirmed SARS-CoV-2 infection beyond 12 weeks in non-hospitalized adults and the risk factors associated with developing persistent symptoms. We selected 486,149 adults with confirmed SARS-CoV-2 infection and 1,944,580 propensity score-matched adults with no recorded evidence of SARS-CoV-2 infection. Outcomes included 115 individual symptoms, as well as long COVID, defined as a composite outcome of 33 symptoms by the World Health Organization clinical case definition. Cox proportional hazards models were used to estimate adjusted hazard ratios (aHRs) for the outcomes. A total of 62 symptoms were significantly associated with SARS-CoV-2 infection after 12 weeks. The largest aHRs were for anosmia (aHR 6.49, 95% CI 5.02–8.39), hair loss (3.99, 3.63–4.39), sneezing (2.77, 1.40–5.50), ejaculation difficulty (2.63, 1.61–4.28) and reduced libido (2.36, 1.61–3.47). Among the cohort of patients infected with SARS-CoV-2, risk factors for long COVID included female sex, belonging to an ethnic minority, socioeconomic deprivation, smoking, obesity and a wide range of comorbidities. The risk of developing long COVID was also found to be increased along a gradient of decreasing age. SARS-CoV-2 infection is associated with a plethora of symptoms that are associated with a range of sociodemographic and clinical risk factors.

Infection with SARS-CoV-2 causes an acute multisystem illness referred to as COVID-19 1. It is recognized that approximately 10% of individuals with COVID-19 develop persistent and often relapsing and remitting symptoms beyond 4 to 12 weeks after infection2. The presence of persistent symptoms in a previously infected individual is commonly referred to by several terms including post-COVID-19 condition, post-acute COVID-19 syndrome, post-acute sequelae of COVID-19 (PASC) and long COVID3–5. The UK National Institute for Health and Care Excellence (NICE) makes a distinction between disease occurring from 4 to 12 weeks after infection (ongoing symptomatic COVID-19) and symptoms persisting beyond 12 weeks (post-acute COVID-19 syndrome)4. The World Health Organization (WHO) defines it as a condition characterized by symptoms impacting everyday life, such as fatigue,

1Institute of Applied Health Research, University of Birmingham, Birmingham, UK. 2Midlands Health Data Research UK, Birmingham, UK. 3DEMAND
Hub, University of Birmingham, Birmingham, UK. 4Centre for Patient-Reported Outcomes Research, Institute of Applied Health Research, University
of Birmingham, Birmingham, UK. 5National Institute for Health and Care Research (NIHR) Applied Research Collaboration (ARC) – West Midlands,
Birmingham, UK. 6Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK. 7NIHR
Birmingham-Oxford Blood and Transplant Research Unit (BTRU) in Precision Transplant and Cellular Therapeutics, University of Birmingham, Birmingham,
UK. 8Clinical Practice Research Datalink, Medicines and Healthcare products Regulatory Agency, London, UK. 9School of Sport, Exercise and Rehabilitation
Sciences, University of Birmingham, Birmingham, UK. 10Institute of Health Informatics, Faculty of Population Health Sciences, University College London,
London, UK. 11Warwick Medical School, University of Warwick, Coventry, UK. 12School of Oral and Dental Sciences, University of Bristol, Bristol, UK. 13NIHR
Surgical Reconstruction and Microbiology Research Centre, University Hospital Birmingham and University of Birmingham, Birmingham, UK. 14Institute
of Cancer and Genomic Sciences, University of Birmingham, Birmingham, UK. 15NIHR Birmingham Biomedical Research Centre, University Hospital
Birmingham and University of Birmingham, Birmingham, UK. 16University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK. 17Centre for
Trauma Science Research, University of Birmingham, Birmingham, UK. 18MRC-Versus Arthritis Centre for Musculoskeletal Ageing Research, Institute of
Inflammation and Ageing, University of Birmingham, Birmingham, UK. 19PIONEER HDR-UK Data Hub in acute care, University of Birmingham, Birmingham,
UK. 20UK SPINE, University of Birmingham, Birmingham, UK. 21Institute of Immunology and Immunotherapy, University of Birmingham, Birmingham, UK.
22Institute for Mental Health, University of Birmingham, Birmingham, UK. 23Birmingham and Solihull Mental Health NHS Foundation Trust, Birmingham,
UK. 24Aparito Ltd, Wrexham, UK. 25Patient and public involvement member, Birmingham, UK. ✉e-mail: [email protected]

1706 Nature Medicine | VOL 28 | August 2022 | 1706–1714 | www.nature.com/naturemedicine


shortness of breath and cognitive dysfunction, which occur after a history of probable or confirmed SARS-CoV-2 infection6. Symptoms usually occur 3 months from the onset of acute COVID-19 symptoms, last for at least 2 months and cannot be explained by an alternative diagnosis.
Long COVID has been associated with a broad range of symptoms and health impacts5,7–9. A previous study showed that symptoms of long COVID, although commonly observed among patients with other viral infections such as influenza, occur more frequently following infection with SARS-CoV-2 10. Several systematic reviews have shown the most prevalent symptoms to be fatigue, shortness of breath, muscle pain, joint pain, headache, cough, chest pain, altered smell, altered taste and diarrhea9,11–13; however, previous studies were often based on self-reported symptoms or lacked a control group, making it difficult to make inferences about whether the reported symptoms were due to SARS-CoV-2 infection, pre-existing comorbidities or societal effects related to the pandemic. Furthermore, many previous studies were conducted in hospitalized cohorts14,15, and population-level data on the potential breadth of symptoms experienced by non-hospitalized individuals with SARS-CoV-2 infection are scarce. Large-scale studies leveraging routinely available healthcare data with closely matched control populations are needed to elucidate which symptoms are independently associated with the long-term effects of COVID-19.
There is also a need to gain a better understanding of the risk factors that contribute toward the development of long COVID, which was highlighted as a research priority on the recently updated NICE guideline on managing the long-term effects of COVID-19 4. Previous studies suggested that higher risk of developing long COVID was observed with a gradient increase in age, female sex, hospital admission during acute COVID-19 (including the need for oxygen therapy), symptom load (including dyspnea at presentation and chest pain), abnormal auscultation findings and the presence of comorbidities such as asthma16–19. Large-scale population-based studies with appropriate control groups are required to assess the long-term symptoms that are specifically attributable to SARS-CoV-2 infection and their association with a wide range of demographic and clinical risk factors in non-hospitalized individuals. Such studies are needed to understand the breadth of symptoms that contribute to long COVID to inform clinical management and help healthcare providers identify population groups at higher risk of reporting persistent symptoms.
Here, we did a large-scale analysis of primary care data from the UK to investigate a comprehensive range of symptoms previously reported to be associated with long COVID by epidemiological studies, patients and clinicians. We aimed to assess their association with confirmed SARS-CoV-2 infection at least 12 weeks after infection in non-hospitalized adults, compared to a propensity score-matched cohort of patients with no recorded evidence of SARS-CoV-2 infection. We also assessed associations between demographic and clinical risk factors, including comorbidities, with the development of long COVID and characterized dominant symptom clusters.

Results
Participants. A total of 486,149 non-hospitalized individuals had a coded record of SARS-CoV-2 infection, and 8,030,224 had no records of either suspected or confirmed COVID-19 during the study period between 31 January 2020 and 15 April 2021. From the pool of patients with no recorded evidence of SARS-CoV-2 infection, 1,944,580 individuals were propensity score-matched to patients infected with SARS-CoV-2. Kernel density plots of the propensity scores of both the cohorts, before and after matching, are presented in Extended Data Fig. 1. The total follow-up time was 0.29 years (interquartile range (IQR) 0.24–0.42) for the cohort of patients infected with SARS-CoV-2 and 0.29 (IQR 0.24–0.41) for the cohort of patients with no recorded evidence of SARS-CoV-2 infection.
The cohorts of patients were well matched in terms of sociodemographic characteristics, smoking status, body mass index (BMI), comorbidities and baseline recording of symptoms, indicated by standardized mean difference (SMD) < 0.1 for all variables (Table 1 and Supplementary Table 1). The mean age was 43.8 years (s.d. 16.9), and 55.3% of participants were female. Of the participants, 64.7% were white, 12.2% were of Asian origin from India, Pakistan, China, Cambodia, Thailand, Vietnam, Malaysia, Sri Lanka, Nepal, Bangladesh, Japan or Taiwan, 4.0% were Black Afro-Caribbean and 16.2% had missing ethnicity data. Overall, 53.8% were overweight or obese (with BMI data missing for 13.0%), and 22.5% were current smokers (with smoking data missing for 4.3%).
The most common comorbidities were depression (22.1%), anxiety (20.3%), asthma (20.1%), eczema (19.5%) and hay fever (18.1%). A full list of comorbidities is provided in Supplementary Table 1. Overall, 56.6% of the patients infected with SARS-CoV-2 had been diagnosed in 2020 and 43.4% in 2021. 4.5% of the patients infected with SARS-CoV-2 and 4.7% of the patients with no recorded evidence of SARS-CoV-2 infection had received at least a single dose of a COVID-19 vaccine before the index date. The most common vaccine before the index date was the BNT162b2 (BioNTech-Pfizer; 2.8%) followed by ChAdOx1 nCoV-19 (Oxford-AstraZeneca; 1.7%).

Symptoms. In the 3–12-month period before the index date, the reporting of symptoms between the patients infected with SARS-CoV-2 and propensity score-matched cohort of patients with no recorded evidence of SARS-CoV-2 infection was similar. Of the 115 symptoms, statistically significant differences between the two groups at baseline were observed only for bowel incontinence and sore throat, after adjustment for age, sex, ethnic group, socioeconomic status, smoking status and BMI using logistic regression (Supplementary Table 2).
At 12 weeks after the index date, a history of SARS-CoV-2 infection was significantly associated with a total of 62 symptoms, after adjustment for age, sex, ethnic group, socioeconomic status, smoking, BMI and baseline symptoms (Supplementary Table 3a). These 62 symptoms spanned 14 of the 15 domains considered (Fig. 1). Of the patients with a minimum of 12 weeks of follow-up, 20,864 out of 384,137 (5.4%) patients infected with SARS-CoV-2 and 65,293 out of 1,501,689 (4.3%) patients with no recorded evidence of SARS-CoV-2 infection reported at least one of the symptoms included in the WHO case definition for long COVID (aHR 1.26, 95% CI 1.25–1.28) (Supplementary Table 3a). Patients infected with SARS-CoV-2 were more likely to report more than one symptom after 12 weeks from the index date compared to patients with no recorded evidence of SARS-CoV-2 infection (one symptom (5.6% versus 4.7%), two symptoms (3.6% versus 2.9%) and three or more symptoms (4.9% versus 4.0%)) (Supplementary Table 3b).
The symptoms with the largest aHRs were anosmia (aHR 6.49, 95% CI 5.02–8.39), hair loss (3.99, 3.63–4.39), sneezing (2.77, 1.40–5.50), ejaculation difficulty (2.63, 1.61–4.28), reduced libido (2.36, 1.61–3.47), shortness of breath at rest (2.20, 1.57–3.08), fatigue (1.92, 1.81–2.03), pleuritic chest pain (1.86, 1.41–2.46), hoarse voice (1.78, 1.44–2.20) and fever (1.75, 1.54–1.98).
The association of SARS-CoV-2 infection with these 62 significantly associated symptoms was even larger at 0–4 weeks and 4–12 weeks, and the size of the aHRs reduced with increasing time from the index date. A full list of the aHRs for all 115 symptoms included in the analysis at 0–4 weeks, 4–12 weeks and beyond 12 weeks is presented in Supplementary Tables 3a, 4 and 5.
In the post hoc subgroup analysis of patients infected during the first and second surges of the pandemic in the UK (31 January 2020 to 31 August 2020 and 1 September 2020 to 15 April 2021, when the dominant variant of concern was B.1.1.7) and their propensity


Table 1 | Baseline characteristics of patients infected with SARS-CoV-2 and propensity-matched comparator cohort of patients with
no recorded evidence of SARS-CoV-2 infection
Columns: characteristic; cohort of patients infected with SARS-CoV-2 (n = 486,149); comparator cohort (n = 1,944,580); standardized difference
Mean age at index date (s.d.) 44.1 (17.0) 43.8 (16.9) 0.015
Sex, n (%)
Female 268,367 (55.2) 1,075,963 (55.3) 0.003
Male 217,782 (44.8) 868,617 (44.7)
Ethnic group, n (%)
White 313,561 (64.5) 1,258,392 (64.7) 0.004
Asiana 59,477 (12.2) 237,133 (12.2)
Black Afro-Caribbean 19,835 (4.1) 78,501 (4.0)
Mixed ethnicity 7,357 (1.5) 29,614 (1.5)
Otherb 6,896 (1.4) 26,966 (1.4)
Missing 79,023 (16.3) 313,974 (16.2)
Socioeconomic status IMD quintile, n (%) 0.003
1 (least deprived) 82,538 (17.0) 331,229 (17.0)
2 86,164 (17.7) 346,054 (17.8)
3 89,470 (18.4) 358,650 (18.4)
4 106,578 (21.9) 426,153 (21.9)
5 (most deprived) 112,656 (23.2) 448,126 (23.0)
Missing 8,743 (1.8) 34,368 (1.8)
BMI (kg m−2), n (%)
<18.5 13,261 (2.7) 52,322 (2.7) 0.001
18.5–25 148,295 (30.5) 590,747 (30.4)
25–30 138,771 (28.5) 558,287 (28.7)
>30 121,943 (25.1) 489,389 (25.2)
Missing 63,879 (13.1) 253,835 (13.1)
Smoking status
Never smoked 177,064 (36.4) 714,045 (36.7) 0.009
Ex-smoker 176,899 (36.4) 710,255 (36.5)
Current smoker 110,848 (22.8) 436,212 (22.4)
Missing 21,338 (4.4) 84,068 (4.3)
Comorbidities
Depression 107,392 (22.1) 428,797 (22.1) 0.001
Anxiety 98,849 (20.3) 395,365 (20.3) 0.000
Asthma 97,509 (20.1) 390,401 (20.1) 0.000
Eczema 94,313 (19.4) 378,604 (19.5) 0.002
Hay fever 87,691 (18.0) 352,090 (18.1) 0.002
Hypertension 73,901 (15.2) 291,389 (15.0) 0.006
Migraine 53,881 (11.1) 215,733 (11.1) 0.000
Osteoarthritis 53,694 (11.0) 211,062 (10.9) 0.006
Fragility fracture 46,608 (9.6) 186,194 (9.6) 0.000
Arrhythmias 34,811 (7.2) 136,280 (7.0) 0.006
Calendar year of index date, n (%)
2020 275,169 (56.6) 1,077,126 (55.4) 0.024
2021 210,980 (43.4) 867,454 (44.6)
COVID-19 vaccine status at index date, n (%)
Vaccine dose 1 21,932 (4.5) 92,355 (4.7) 0.013
Vaccine dose 2 685 (0.1) 5,964 (0.3) 0.035
ChAdOx1-S 8,210 (1.7) 32,183 (1.7) 0.003
BNT162b2 12,792 (2.6) 56,559 (2.9) 0.017
CX-024414 0 (0) 3 (0) 0.002
Socioeconomic status was measured using the Index of Multiple Deprivation (IMD); a standardized difference of less than 0.1 indicates a relatively small imbalance. The cohort of patients with SARS-CoV-2 infection included participants with a positive PCR with reverse transcription (RT–PCR) or antigen test for SARS-CoV-2. The comparator cohort included participants with no records of either confirmed or suspected COVID-19.
a: The Asian category consisted of participants with origin from all over Asia, including India, Pakistan, China, Cambodia, Thailand, Vietnam, Malaysia, Sri Lanka, Nepal, Bangladesh, Japan or Taiwan.
b: The ‘other’ ethnicity category consisted of patients with native American, Middle Eastern or Polynesian origin.
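
The standardized differences reported in Table 1 can be approximated directly from the counts in the table. The following is a minimal sketch, assuming the commonly used pooled-variance formula for a binary variable; the article does not state which formula was applied, and the function name `smd_binary` is purely illustrative.

```python
import math


def smd_binary(events_a: int, n_a: int, events_b: int, n_b: int) -> float:
    """Absolute standardized difference between two proportions.

    Assumed formula (not confirmed by the article):
    |p_a - p_b| / sqrt((p_a(1 - p_a) + p_b(1 - p_b)) / 2)
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled_sd = math.sqrt((p_a * (1 - p_a) + p_b * (1 - p_b)) / 2)
    return abs(p_a - p_b) / pooled_sd


# Counts taken from Table 1 (infected cohort versus comparator cohort).
female_smd = smd_binary(268_367, 486_149, 1_075_963, 1_944_580)
year_2020_smd = smd_binary(275_169, 486_149, 1_077_126, 1_944_580)

print(f"Female sex: {female_smd:.3f}")
print(f"Index date in 2020: {year_2020_smd:.3f}")
```

Under this assumed formula, both values fall well below the 0.1 threshold that the table footnote uses to indicate a relatively small imbalance.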

score-matched patients, the association between SARS-CoV-2 infection and the reported symptoms is more pronounced among those infected during the second wave of the pandemic. For example, SARS-CoV-2 infection during the first surge of the pandemic was associated with only a 28% relative increase in the reporting of cough after 12 weeks from the index date compared to propensity score-matched patients (aHR 1.28, 95% CI 1.21–1.36), whereas infection during the second wave was associated with a 77% relative increase in the reporting of cough compared to corresponding propensity score-matched patients (aHR 1.77, 95% CI 1.60–1.93). Similar trends were also observed for sneezing, rash, itchy skin, fever and allergies (Extended Data Figs. 2–4).

Risk factors for long COVID symptoms. The risk factor analysis included 384,137 individuals infected with SARS-CoV-2 with a minimum of 12 weeks of follow-up. When using the WHO definition of long COVID, several sociodemographic and clinical risk factors were significantly associated with the incidence of long COVID (Table 2 and Supplementary Table 6). Women were at increased risk compared to men (aHR 1.52, 95% CI 1.48–1.56). Older age above 30 years was associated with a higher risk of reporting long COVID symptoms in the univariate analysis; however, after adjusting for baseline covariates, older age was associated with a lower risk, with those aged 30–39 years having a 6% lower risk (0.94, 0.90–0.97) and those aged ≥70 years having a 25% lower risk (0.75, 0.70–0.81) compared to those aged 18–30 years.
There were associations between the risk of reporting long COVID symptoms and certain ethnic minority groups in the multivariable model, with increased risks seen in Black Afro-Caribbean ethnic groups (aHR 1.21, 95% CI 1.10–1.34), mixed ethnicity (1.14, 1.07–1.22) and other minority ethnic groups comprising patients with native American, Middle Eastern or Polynesian origin (1.06, 1.03–1.10), as compared to white ethnic groups. The risk also increased with increasing levels of socioeconomic deprivation, with an 11% increased risk (1.11, 1.07–1.16) in those who were most socioeconomically deprived compared to those least deprived.
Smokers and former smokers were at increased risk of reporting long COVID symptoms (aHR 1.12, 95% CI 1.08–1.15 and 1.08, 1.05–1.11, respectively), compared to those who had never smoked. Baseline BMI in the overweight or obese range was also associated with an increased risk of persistent symptoms, with those who had a BMI of greater than 30 kg m−2 having a 10% relative increase in risk of reporting long COVID symptoms compared to those with a BMI of 18.5–25 kg m−2 (aHR 1.10, 1.07–1.14).
A wide range of comorbidities at baseline were also associated with an increased risk of long COVID symptoms. The comorbidities with the largest associations were COPD (aHR 1.55, 95% CI 1.47–1.64), benign prostatic hyperplasia (1.39, 1.28–1.52), fibromyalgia (1.37, 1.28–1.47), anxiety (1.35, 1.31–1.39), erectile dysfunction (1.33, 1.26–1.41), depression (1.31, 1.27–1.34), migraine (1.26, 1.22–1.30), multiple sclerosis (1.26, 1.03–1.53), celiac disease (1.25, 1.09–1.43) and learning disability (1.24, 1.11–1.40). A full list of the aHRs for the included comorbidities is provided in Supplementary Table 6.
When using our alternative definition of long COVID that consisted of having at least one of the symptoms that were statistically associated with a history of SARS-CoV-2 infection ≥12 weeks after infection, the risk factor patterns were largely still observed (Supplementary Table 7). Females, ethnic minority groups, increasing socioeconomic deprivation, smoking and former smoking, high BMI and a wide range of comorbidities were all associated with an increased risk of reporting symptoms ≥12 weeks after infection. Risk of reporting symptoms was also found to be increased along a gradient of decreasing age.

Symptom clusters among patients with long COVID. A three-class model achieved the optimal fit in a latent class analysis of 50 consolidated symptoms (Supplementary Table 8) among patients with SARS-CoV-2 infection who reported at least one of the 62 symptoms associated with COVID-19 beyond 12 weeks after infection (n = 50,832; Extended Data Fig. 5). Latent class proportions and the probabilities of symptoms conditional to class membership (ρ) are given in Supplementary Table 9a,b. A word cloud of symptom names was generated for the three classes, where the text size of the symptoms is directly proportional to the ρ parameter (Extended Data Fig. 6). Among patients with SARS-CoV-2 infection with persistent symptoms, 80.0% belonged to class 1 (dominated by a broad spectrum of symptoms including pain, fatigue and rash), 5.8% to class 2 (dominated by cough, shortness of breath and phlegm) and 14.2% to class 3 (dominated by depression, anxiety, insomnia and brain fog).
The baseline characteristics of patients within each of the latent classes are presented in Supplementary Table 10. A multinomial logistic regression model was performed for the polytomous class membership outcome among those with SARS-CoV-2 infection (Supplementary Table 11). Patients from all the classes were more likely to be socioeconomically deprived and to be women compared to patients without persistent symptoms. Compared to patients without persistent symptoms, members of latent class 3 (dominated by anxiety, depression, insomnia and brain fog) were more likely to be younger, whereas members of the other latent classes were more likely to be older compared to patients without persistent symptoms. Members of latent classes 2 and 3 were more likely to be white, whereas members of latent class 1 (dominated by a broad spectrum of symptoms including pain, fatigue and rash) were more likely to be of Asian origin or from other ethnic minority groups.

Discussion
Individuals with confirmed SARS-CoV-2 infection were at increased risk of reporting a wide range of symptoms at ≥12 weeks after infection, compared to propensity score-matched patients with no record of suspected or confirmed SARS-CoV-2 infection, after accounting for both sociodemographic and clinical characteristics and the reporting of symptoms before infection. The symptoms most associated with SARS-CoV-2 infection included some that are already recognized in previous studies12, such as anosmia, shortness of breath, chest pain and fever, but also included a range of other symptoms that have previously not been widely reported such as hair loss and sexual dysfunction. Previous SARS-CoV-2 infection was independently associated with the reporting to primary care of 20 of the 33 symptoms included in the WHO case definition and an additional 42 symptoms, beyond 12 weeks from infection. SARS-CoV-2 infection was associated with a 26% relative increase in risk of reporting at least one of the symptoms included in the WHO case definition for long COVID.
Among those with a history of confirmed SARS-CoV-2 infection, several risk factors were associated with reporting symptoms 12 weeks or more after infection. Female sex, a gradient of decreasing age, belonging to a Black, mixed ethnicity or other ethnic minority group, socioeconomic deprivation, smoking, high BMI and the presence of a wide range of comorbidities were associated with increased risk of both symptoms included in the WHO definition of long COVID and symptoms statistically associated with SARS-CoV-2 infection reported 12 weeks or more after infection.
Among those with a confirmed SARS-CoV-2 infection and who reported at least one symptom that was statistically associated with SARS-CoV-2 infection at least 12 weeks after infection, three major clusters of phenotypes of long COVID were observed. These included patients with symptoms dominated by (1) a broad spectrum of symptoms, including pain, fatigue and rash (80.0%); (2) respiratory symptoms, including cough, shortness of breath and phlegm (5.8%); and (3) mental health and cognitive symptoms, including anxiety, depression, insomnia and brain fog (14.2%).


Table 2 | Risk factors associated with the development of long COVID (WHO definition)
Columns: risk factor; total number per stratum (n = 384,137); long COVID symptoms, n (%) (n = 29,869; 7.78%); unadjusted HR (95% CI); adjusted HRa (95% CI)
Sex
Men 171,593 9,090 (5.3) Ref. Ref.
Women 212,544 20,779 (9.8) 1.86 (1.81–1.90) 1.52 (1.48–1.56)
Age (years)
18–29 95,969 6,932 (7.2) Ref. Ref.
30–39 78,302 5,805 (7.4) 1.13 (1.10–1.18) 0.94 (0.90–0.97)
40–49 75,349 5,784 (7.7) 1.14 (1.10–1.18) 0.89 (0.86–0.93)
50–59 73,262 5,485 (7.5) 1.07 (1.04–1.11) 0.80 (0.77–0.83)
60–69 35,932 2,790 (7.8) 1.09 (1.05–1.14) 0.74 (0.70–0.78)
≥70 25,323 3,073 (12.1) 1.39 (1.33–1.45) 0.75 (0.70–0.81)
Ethnicity
White 246,717 20,462 (8.3) Ref. Ref.
Asianb 47,788 3,647 (7.6) 0.90 (0.82–0.99) 0.99 (0.89–1.09)
Black 15,846 1,053 (6.7) 1.01 (0.91–1.11) 1.21 (1.10–1.34)
Mixed 5,976 407 (6.8) 0.98 (0.92–1.04) 1.14 (1.07–1.22)
Otherc 5,438 404 (7.4) 0.94 (0.91–0.97) 1.06 (1.03–1.10)
Missing 62,372 3,896 (6.3) 0.74 (0.71–0.76) 0.92 (0.88–0.95)
BMI (kg m−2)
<18.5 10,312 762 (7.4) 0.93 (0.86–1.00) 0.93 (0.86–1.00)
18.5–25 117,630 8,849 (7.5) Ref. Ref.
25–30 109,707 8,612 (7.9) 1.06 (1.03–1.09) 1.07 (1.04–1.10)
>30 95,799 9,233 (9.6) 1.29 (1.25–1.33) 1.10 (1.07–1.14)
Missing 50,689 2,413 (4.8) 0.63 (0.60–0.65) 0.91 (0.86–0.95)
Smoking status
Non-smoker 141,967 9,671 (6.8) Ref. Ref.
Ex-smoker 139,294 12,407 (8.9) 1.33 (1.29–1.36) 1.08 (1.05–1.11)
Current smoker 85,765 7,072 (8.3) 1.31 (1.27–1.35) 1.12 (1.08–1.15)
Missing 17,111 719 (4.2) 0.61 (0.56–0.65) 0.90 (0.83–0.97)
Socioeconomic status quintile (IMD)
1 (least deprived) 66,564 4,392 (6.6) Ref. Ref.
2 68,657 4,963 (7.2) 1.09 (1.05–1.13) 1.05 (1.00–1.09)
3 70,699 5,486 (7.8) 1.19 (1.14–1.24) 1.10 (1.05–1.14)
4 84,002 6,523 (7.8) 1.20 (1.16–1.25) 1.07 (1.03–1.11)
5 (most deprived) 87,270 7,883 (9.0) 1.33 (1.28–1.38) 1.11 (1.07–1.16)
Missing 6,945 622 (9.0) 1.28 (1.7–1.39) 1.10 (1.01–1.20)
Symptoms recorded before COVID-19 78,880 13,207 (16.7) 2.92 (2.85–2.99) 2.07 (2.02–2.12)
Comorbidities
COPD 8,040 1,741 (21.7) 2.71 (2.58–2.85) 1.55 (1.47–1.64)
BPH 4,961 596 (12.0) 1.39 (1.28–1.51) 1.39 (1.28–1.52)
Fibromyalgia 4,031 900 (22.3) 3.17 (2.97–3.39) 1.37 (1.28–1.47)
Anxiety 77,753 10,481 (13.5) 2.17 (2.12–2.23) 1.35 (1.31–1.39)
Erectile dysfunction 16,678 1,551 (9.3) 1.15 (1.09–1.21) 1.33 (1.26–1.41)
Depression 83,903 11,222 (13.4) 2.22 (2.17–2.27) 1.31 (1.27–1.34)
Migraine 43,043 5,597 (13.0) 1.88 (1.83–1.94) 1.26 (1.22–1.30)
Multiple sclerosis 791 98 (12.4) 1.52 (1.25–1.85) 1.26 (1.03–1.53)
Celiac disease 1,669 207 (12.4) 1.58 (1.38–1.81) 1.25 (1.09–1.43)
Learning disability 3,283 295 (9.0) 1.22 (1.09–1.37) 1.24 (1.11–1.40)
IBS 27,492 3,691 (13.4) 1.84 (1.78–1.91) 1.20 (1.15–1.24)
Endometriosis 5,727 800 (14.0) 1.92 (1.79–2.06) 1.19 (1.11–1.28)
Low Hb 20,039 2,683 (13.4) 1.78 (1.71–1.85) 1.18 (1.13–1.23)
Deafness 3,767 514 (13.6) 1.53 (1.40–1.67) 1.16 (1.06–1.27)
Eating disorder 3,488 504 (14.5) 1.92 (1.75–2.09) 1.16 (1.06–1.27)
Substance misuse 6,449 775 (12.0) 1.69 (1.58–1.82) 1.15 (1.07–1.23)
Back pain 5,483 718 (13.1) 1.76 (1.64–1.90) 1.15 (1.07–1.24)
Asthma 76,946 8,527 (11.1) 1.59 (1.55–1.63) 1.15 (1.12–1.18)
Chronic sinusitis 6,838 873 (12.8) 1.63 (1.52–1.74) 1.14 (1.07–1.22)
PCOS 9,599 1,166 (12.2) 1.73 (1.63–1.84) 1.14 (1.07–1.21)
a: aHRs estimated using a multivariable Cox proportional hazards model, including age, sex, ethnic group, socioeconomic status, index year, vaccination status, symptoms recorded before COVID-19 and comorbidities.
b: The Asian category consisted of participants with origin from all over Asia including India, Pakistan, China, Cambodia, Thailand, Vietnam, Malaysia, Sri Lanka, Nepal, Bangladesh, Japan or Taiwan.
c: The other ethnicity category consisted of patients with native American, Middle Eastern or Polynesian origin.
COPD, chronic obstructive pulmonary disease; BPH, benign prostatic hyperplasia; IBS, irritable bowel syndrome; Hb, hemoglobin; PCOS, polycystic ovary syndrome; Ref., reference.
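
The hazard ratios in Table 2 are described in the text as relative increases or decreases in risk (for example, aHR 1.52 read as a 52% increase and aHR 0.75 as a 25% lower risk). The sketch below shows that arithmetic, together with the crude percentages in the third column recomputed from the counts. The helper names are illustrative and not taken from the article, and the code does not fit a Cox model; it only converts figures already reported.

```python
def pct_change_from_hr(hr: float) -> float:
    """Relative change in hazard versus the reference group, in percent."""
    return round((hr - 1.0) * 100.0, 1)


def crude_pct(events: int, total: int) -> float:
    """Crude percentage of a stratum reporting long COVID symptoms."""
    return round(100.0 * events / total, 1)


# Adjusted hazard ratios reported in Table 2.
women_change = pct_change_from_hr(1.52)    # women versus men
oldest_change = pct_change_from_hr(0.75)   # age >=70 versus 18-29 years

# Crude percentages recomputed from the Table 2 counts.
women_crude = crude_pct(20_779, 212_544)
men_crude = crude_pct(9_090, 171_593)

print(f"Women: {women_change:+.1f}% hazard, crude {women_crude}%")
print(f"Age >=70: {oldest_change:+.1f}% hazard")
print(f"Men (reference): crude {men_crude}%")
```

The gap between the crude and adjusted figures, such as the unadjusted HR of 1.86 for women against the adjusted 1.52, reflects the covariate adjustment described in the table footnote, which this arithmetic does not reproduce.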

A key strength of the study is the large sample size, which included 486,149 adults with a confirmed diagnosis of SARS-CoV-2 infection and 1.9 million propensity score-matched patients with no recorded evidence of SARS-CoV-2 infection. The large sample size provided adequate statistical power to assess differences in the reporting of a wide range of symptoms between the two cohorts and estimation of the association between reporting of symptoms and important sociodemographic and clinical risk factors with a high level of precision. Another key strength of the study is the inclusion of a comparator group that did not have either suspected or confirmed SARS-CoV-2 infection and had been propensity score-matched for sociodemographic factors, previously reported symptoms and over 80 comorbidities. This enabled us to assess the independent association between exposure to SARS-CoV-2 and the reporting of symptoms ≥12 weeks after infection, after accounting for many important confounders. A further strength is the large number of symptoms included in the analysis, which was based on a previous systematic review of the literature11, a scoping review of long COVID questionnaires and an extensive consultation with patients and clinicians20. Symptom code lists were developed rigorously with systematic searches for relevant SNOMED CT codes with extensive clinical input. We also assessed the outcome of long COVID using the WHO case definition as well as a new definition that incorporated symptoms that were statistically associated with a history of SARS-CoV-2 infection.
A key limitation of the study is the use of routinely coded healthcare data. Coded symptom data in primary care records is likely to underrepresent the true symptom burden experienced by individuals with long COVID. This could be due to reduced access to primary care (especially during the first surge of the pandemic), patients not consulting their general practitioner (GP) about symptoms or the reason for the GP consultation being unrelated to COVID-19, thereby leading patients to underreport the full extent [...] infected with SARS-CoV-2 and patients with no recorded evidence of SARS-CoV-2 infection. Conversely, with the evolving awareness of long COVID, it is possible that patients with a history of COVID-19 may have been more likely than those without to access primary care and alert clinicians of their symptoms, which could potentially lead to an inflation of the observed effect sizes. This is potentially supported by the increased aHRs observed for symptoms such as cough, sneezing, fever and allergies among patients who were infected during the second surge of the pandemic, compared to those infected during the first surge, although this could also potentially be attributed to other reasons, such as changes in the dominant variants.
Another limitation of the study is potential misclassification bias. Community testing for SARS-CoV-2 was very limited during the first surge of the pandemic, and many individuals who were not hospitalized with COVID-19 were not tested. Furthermore, antigen test positive results may not be routinely coded within primary care. There is some evidence that as much as 20–30% of SARS-CoV-2 test positive cases may be missing from primary care records22,23. It is therefore possible that some members of our propensity score-matched comparator cohort had been infected with SARS-CoV-2 but had simply not been tested or coded as confirmed COVID-19 within primary care. We attempted to account for this bias by excluding individuals from the comparator cohort if they had a coded diagnosis of suspected COVID-19; however, this is unlikely to be completely sensitive in identifying individuals with unverified SARS-CoV-2 infection from the comparator cohort, which would potentially have the effect of attenuating the observed effect sizes. Similarly, it is possible that some members of our cohort were hospitalized, as we were limited to using SNOMED CT codes for hospitalization within primary care records rather than using linked Hospital Episode Statistics data, of which timely access was unavailable for our study.
and breadth of their symptoms. In addition, much of a patient’s Finally, we were unable to incorporate all aspects of the WHO
clinical history, in terms of the symptoms reported, are recorded clinical case definition for long COVID, such as ‘impact on everyday
as free text, rather than as SNOMED CT codes21. The symptom functioning’ due to the lack of data on these domains within coded
data we used for the study thus cannot be used to make infer- primary care data. Our findings support the results from our pre-
ences about the absolute prevalence of these symptoms; however, vious systematic review and meta-analysis on long COVID symp-
as this underrepresentation would be expected to affect both the toms11. That review found the most prevalent symptoms to be fatigue,
infected and propensity score-matched comparator cohorts equally, shortness of breath, muscle pain, joint pain, headache, cough, chest
the data used in the present analysis can still be used to examine pain, altered sense of smell, altered taste and diarrhea. Our current
relative differences in the reporting of symptoms between patients analysis was not able to assess symptom prevalence but rather the

Nature Medicine | VOL 28 | August 2022 | 1706–1714 | www.nature.com/naturemedicine 1711


Articles NATURE MEDICINE

relative difference in symptoms between a large sample of individuals with and without recorded evidence of SARS-CoV-2 infection at ≥12 weeks after infection. We similarly identified anosmia, shortness of breath, fatigue and chest pain to be symptoms significantly associated with SARS-CoV-2 infection. By contrast, we also identified new symptoms such as hair loss, sneezing, symptoms of sexual dysfunction (difficulties ejaculating and reduced libido), hoarse voice and fever as significantly associated. Also, like our review11, we found that female sex and the presence of a range of comorbidities were associated with an increased risk of developing persistent symptoms; however, it is likely that pre-existing comorbidities may have influenced the likelihood of GP consultations and symptom reporting.

Fig. 1 | Symptoms associated with SARS-CoV-2 ≥12 weeks after infection.

Symptoms                            aHR (95% CI)*

Breathing
  Shortness of breath at rest       2.20 (1.57–3.08)
  Wheezing                          1.42 (1.27–1.59)
  Shortness of breath               1.31 (1.24–1.38)
  Shortness of breath on exertion   1.26 (1.18–1.33)
Pain
  Pleuritic chest pain              1.86 (1.41–2.46)
  Chest pain                        1.42 (1.35–1.50)
  Pain                              1.25 (1.23–1.27)
Circulation
  Palpitations                      1.53 (1.40–1.67)
  Tachycardia                       1.44 (1.21–1.72)
  Limb swelling                     1.25 (1.15–1.36)
Fatigue
  Fatigue                           1.92 (1.81–2.03)
Cognitive health
  Brain fog                         1.37 (1.17–1.59)
Sleep
  Insomnia                          1.34 (1.23–1.46)
Ear, nose and throat
  Anosmia                           6.49 (5.02–8.39)
  Sneezing                          2.77 (1.40–5.50)
  Hoarse voice                      1.78 (1.44–2.20)
  Dysphagia                         1.60 (1.41–1.82)
  Cough                             1.44 (1.37–1.51)
  Nasal congestion                  1.34 (1.20–1.50)
  Phlegm                            1.33 (1.19–1.48)
  Ear pain                          1.16 (1.06–1.27)
Stomach and digestion
  Bowel incontinence                1.58 (1.33–1.88)
  Vomiting                          1.48 (1.31–1.67)
  Nausea                            1.37 (1.20–1.56)
  Weight loss                       1.34 (1.17–1.53)
  Bloating                          1.31 (1.14–1.50)
  Diarrhoea                         1.29 (1.19–1.41)
  Abdominal pain                    1.21 (1.16–1.27)
  Constipation                      1.20 (1.11–1.30)
  Gastritis                         1.20 (1.07–1.36)
Muscles and joints
  Asthenia                          1.45 (1.07–1.96)
  Paresthesia                       1.34 (1.20–1.50)
  Joint pain                        1.23 (1.13–1.35)
Mental health and wellbeing
  Anhedonia                         1.36 (1.16–1.59)
  Anorexia                          1.28 (1.08–1.52)
  Anxiety                           1.12 (1.08–1.16)
  Depression                        1.09 (1.05–1.13)
Hair, skin and nails
  Hair loss                         3.99 (3.63–4.39)
  Itchy skin                        1.34 (1.19–1.51)
  Dry and scaly skin                1.30 (1.10–1.53)
  Rash                              1.24 (1.17–1.32)
  Nail changes                      1.20 (1.05–1.37)
Eyes
  Red eye                           1.34 (1.19–1.51)
  Dry eye                           1.28 (1.10–1.49)
Reproductive health
  Ejaculation difficulty            2.63 (1.61–4.28)
  Reduced libido                    2.36 (1.61–3.47)
  Erectile dysfunction              1.26 (1.10–1.44)
  Vaginal discharge                 1.26 (1.17–1.35)
  Menorrhagia                       1.18 (1.07–1.31)
Other symptoms
  Fever                             1.75 (1.54–1.98)
  Mouth ulcer                       1.60 (1.32–1.95)
  Urinary retention                 1.58 (1.28–1.97)
  Dry mouth                         1.56 (1.18–2.07)
  Hot flushes                       1.52 (1.27–1.84)
  Body ache                         1.46 (1.22–1.75)
  Hemoptysis                        1.39 (1.05–1.85)
  Urinary incontinence              1.37 (1.24–1.52)
  Allergies                         1.30 (1.19–1.42)
  Headache                          1.30 (1.24–1.35)
  Polyuria                          1.27 (1.15–1.40)
  Dizziness                         1.23 (1.13–1.33)
  Vertigo                           1.15 (1.03–1.28)

Forest plot x-axis ticks (aHR): 0.25, 1.00, 4.00, 20.00.
*Adjusted for age, sex, BMI, ethnicity, smoking status, deprivation status and symptom of interest at baseline.
Cohort of patients infected with SARS-CoV-2 (n = 384,137); comparator cohort (n = 1,501,689).

In contrast to our review, the present analysis found that the risk of reporting symptoms at ≥12 weeks after infection increased along a gradient of decreasing age in our cohort. This could partly be due to the adjustment for an extensive range of comorbidities or the differences in the populations studied. Most studies included in our review were based on hospitalized cohorts, whereas our present study excluded hospitalized patients. Older patients with COVID-19 were more likely to be hospitalized than younger patients and, therefore, to be excluded from our study. Older non-hospitalized patients might, therefore, have had mild disease with low symptom burden.

We also found that patients from Black, mixed ethnicity and other minority ethnic backgrounds were at increased risk of persistent symptoms. This contradicts the findings from the analysis of the COVID-19 Infection Survey data, which found a lower prevalence of long COVID among all ethnic minority subgroups compared to those of white ethnicity24; however, the COVID-19 Infection Survey analysis included children, was restricted to those living in private residences and considered self-reported diagnosis of long COVID, defined as unexplained persistence of symptoms 4 weeks after SARS-CoV-2 infection.

An international online cohort study of people with confirmed and suspected long COVID found that respondents reported an average of 56 symptoms across an average of nine organ systems8. A Norwegian prospective study of 312 home-isolated patients found persistent symptoms 6 months after infection25. Both studies were comprehensive analyses of symptom burden but lacked a control group and were therefore unable to make strong inferences about the relative contribution of SARS-CoV-2 infection to these symptoms over and above pre-existing health conditions or psychosocial effects related to the pandemic; however, like these studies, we also found that individuals with a history of confirmed SARS-CoV-2 reported a broad range of symptoms, with a total of 62 symptoms being associated at 12 or more weeks after infection. We were also able to control for potential confounders, including whether the symptoms of interest were reported before infection.

The COVID Symptom Study provided data on self-reported symptoms among participants enrolled on an app16. Among those with symptoms persisting 28 d or longer after infection, key symptoms included fatigue, headache, dyspnea and anosmia, which were all also significantly associated at ≥12 weeks in our cohort. The COVID Symptom Study also found that long COVID was associated with increasing BMI and female sex, which is in keeping with our findings; however, the study also found that the risk of reporting long COVID symptoms increased with age, whereas our study observed the opposite trend after adjustment for a comprehensive range of potential confounders. Although the COVID Symptom Study is community-based, it includes individuals with a history of hospitalized and non-hospitalized COVID-19, so the discrepant age trend may be due to the exclusion from our study of older patients, who are more likely to be hospitalized.

One of the largest population-based surveys on COVID-19 and long COVID is the UK Office for National Statistics COVID Infection Survey26. This survey estimated that as of 7 April 2022, 1.7 million people living in private households in the UK (2.7% of the population) were experiencing symptoms persisting beyond 4 weeks from SARS-CoV-2 infection, with 70% experiencing

symptoms beyond 12 weeks. Fatigue, shortness of breath, anosmia and difficulty concentrating were the main symptoms reported. The prevalence was greatest in females, those from more socioeconomically deprived areas, people working in health and social care and individuals living with health conditions and disabilities. Our analysis showed similar symptoms, including cognitive effects, as well as similar risk factors; however, we were unable to assess the association between occupational status and reporting of symptoms due to a lack of occupational data in UK primary care records.

Whittaker and colleagues undertook an analysis of 456,002 patients with COVID-19 in England using the Clinical Practice Research Datalink (CPRD) Aurum database to determine the rates of GP consultations for post-COVID-19 sequelae27. This analysis included both hospitalized and non-hospitalized patients and two control groups consisting of patients without COVID-19 and those with influenza before the pandemic. Patients with COVID-19 managed in the community were significantly more likely to consult for loss of taste or smell and other symptoms such as joint pain, anxiety, depression, abdominal pain and diarrhea at ≥4 weeks after infection compared to 12 months before infection. They also found that GP consultation rates for symptoms, prescriptions and healthcare use were mostly reduced in those who were managed in the community after the first COVID-19 vaccination dose; however, this study investigated only 23 symptoms based on the NICE 2020 guidelines4 on managing the long-term effects of COVID-19, whereas in our study, we investigated 115 symptoms derived from a systematic assessment of previous studies and discussions with patients with lived experience of long COVID and clinicians11.

We were unable to estimate the effect of vaccination and infection year on long COVID symptoms in our study due to the very short follow-up period among those vaccinated and infected in the year 2021 (median 8 (IQR 4–14) and 12 (7–16) days, respectively) compared to those unvaccinated and infected in the year 2020 (33 (16–77) and 64 (31–90) days, respectively). Furthermore, the majority (81%) of patients vaccinated before infection in our cohort were infected with SARS-CoV-2 within 2 weeks of vaccination, which would be before acquiring immunity from vaccination, thus restricting the validity of our data to assess the effects of vaccination on long COVID.

Further research is needed to estimate the prevalence of persistent symptoms associated with SARS-CoV-2 infection among patients presenting to primary care. Much of the symptom data in primary care records is held in free-text entries rather than as clinically coded data. Natural language processing could be used to leverage these textual data to gain more accurate estimates of the prevalence of these symptoms.

The 50 consolidated symptoms that were found to be associated with SARS-CoV-2 12 weeks after infection in our study were clustered into three phenotypes with varying risk factors. Further research is needed to confirm the identified clusters using prospective and routinely recorded patient-reported symptom data. This analysis would allow for assessment of whether clinical outcomes and the underlying pathophysiology differ between these subgroups and could support the development of targeted therapies for the different phenotypic subgroups. There is also a need to obtain patient-reported data on symptoms and assess the association between symptom burden, quality of life and work capability to ascertain which symptoms have the greatest impact on individuals. Finally, there is a need to understand the natural history of long COVID by assessing symptom burden serially over time in a population-representative cohort with a history of COVID-19 alongside a matched control population.

Infection with SARS-CoV-2 is independently associated with the reporting of 62 symptoms spanning multiple organ systems 12 weeks or longer after infection. A wide range of both sociodemographic and clinical factors are independently associated with the development of persistent symptoms. Additional research is needed to describe the natural history of long COVID and characterize symptom clusters, their pathophysiology and clinical outcomes. Further research is also needed to understand the health and social impacts of these persistent symptoms, to support patients living with long-term sequelae and to develop targeted treatments.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41591-022-01909-w.

Received: 9 February 2022; Accepted: 21 June 2022; Published online: 25 July 2022

References
1. Ramos-Casals, M., Brito-Zerón, P. & Mariette, X. Systemic and organ-specific immune-related manifestations of COVID-19. Nat. Rev. Rheumatol. 17, 315–332 (2021).
2. Office for National Statistics. The prevalence of long COVID symptoms and COVID-19 complications. https://www.ons.gov.uk/news/statementsandletters/theprevalenceoflongcovidsymptomsandcovid19complications (2020).
3. Ladds, E. et al. Persistent symptoms after COVID-19: qualitative study of 114 long COVID patients and draft quality principles for services. BMC Health Serv. Res. 20, 1–13 (2020).
4. NICE. COVID-19 Rapid Guideline: Managing the Long-term Effects of COVID-19. (NICE, 2020).
5. Nalbandian, A. et al. Post-acute COVID-19 syndrome. Nat. Med. 27, 601–615 (2021).
6. WHO. A clinical case definition of post-COVID-19 condition by a Delphi consensus. https://www.who.int/publications/i/item/WHO-2019-nCoV-Post_COVID-19_condition-Clinical_case_definition-2021.1 (2021).
7. Del Rio, C., Collins, L. F. & Malani, P. Long-term health consequences of COVID-19. JAMA 324, 1723–1724 (2020).
8. Davis, H. E. et al. Characterizing long COVID in an international cohort: 7 months of symptoms and their impact. eClinicalMedicine 38, 101019 (2021).
9. Groff, D. et al. Short-term and long-term rates of postacute sequelae of SARS-CoV-2 infection: a systematic review. JAMA Netw. Open 4, e2128568 (2021).
10. Taquet, M. et al. Incidence, co-occurrence, and evolution of long-COVID features: a 6-month retrospective cohort study of 273,618 survivors of COVID-19. PLoS Med. 18, e1003773 (2021).
11. Aiyegbusi, O. L. et al. Symptoms, complications and management of long COVID: a review. J. R. Soc. Med. 114, 428–442 (2021).
12. Lopez-Leon, S. et al. More than 50 long-term effects of COVID-19: a systematic review and meta-analysis. Sci. Rep. 11, 1–12 (2021).
13. Michelen, M. et al. Characterising long COVID: a living systematic review. BMJ Glob. Heal. 6, e005427 (2021).
14. Huang, C. et al. 6-month consequences of COVID-19 in patients discharged from hospital: a cohort study. Lancet 397, 220–232 (2021).
15. Al-Aly, Z., Xie, Y. & Bowe, B. High-dimensional characterization of post-acute sequelae of COVID-19. Nature 594, 259–264 (2021).
16. Sudre, C. H. et al. Attributes and predictors of long COVID. Nat. Med. 27, 626–631 (2021).
17. Jacobs, L. G. et al. Persistence of symptoms and quality of life at 35 days after hospitalization for COVID-19 infection. PLoS ONE https://doi.org/10.1371/journal.pone.0243882 (2020).
18. Carvalho-Schneider, C. et al. Follow-up of adults with noncritical COVID-19 two months after symptom onset. Clin. Microbiol. Infect. 27, 258–263 (2021).
19. Galal, I. et al. Determinants of persistent post-COVID-19 symptoms: value of a novel COVID-19 symptom score. Egypt. J. Bronchol. 15, 1–8 (2021).
20. Hughes, S. E. et al. Development and validation of the symptom burden questionnaire for long covid (SBQ-LC): Rasch analysis. BMJ https://www.bmj.com/content/377/bmj-2022-070230 (2022).
21. Price, S. J. et al. Is omission of free text records a possible source of data loss and bias in clinical practice research datalink studies? A case–control study. BMJ Open 6, e011664 (2016).
22. Wood, A. et al. Linked electronic health records for research on a nationwide cohort of more than 54 million people in England: data resource on behalf of the CVD-COVID-UK consortium. BMJ https://doi.org/10.1136/bmj.n826 (2021).
23. NHS. 6. Results and Findings. https://digital.nhs.uk/data-and-information/publications/statistical/coronavirus-as-recorded-in-primary-care/march-2020-21/results-and-findings (2021).

24. Office for National Statistics. Prevalence of Ongoing Symptoms Following Coronavirus (COVID-19) Infection in the UK. https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/bulletins/prevalenceofongoingsymptomsfollowingcoronaviruscovid19infectionintheuk/4june2021 (2021).
25. Blomberg, B. et al. Long COVID in a prospective cohort of home-isolated patients. Nat. Med. 27, 1607–1613 (2021).
26. Office for National Statistics. Prevalence of Ongoing Symptoms Following Coronavirus (COVID-19) Infection in the UK. https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/bulletins/prevalenceofongoingsymptomsfollowingcoronaviruscovid19infectionintheuk/7april2022 (2022).
27. Whittaker, H. R. et al. GP consultation rates for sequelae after acute COVID-19 in patients managed in the community or hospital in the UK: population-based study. Brit. Med. J. 375, e065834 (2021).

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2022

Methods
Study design and setting. This analysis was undertaken as part of the National Institute for Health and Care Research (NIHR) and UK Research and Innovation (UKRI)-funded Therapies for Long COVID in non-hospitalized individuals (TLC) study28. We conducted a population-based retrospective matched-cohort study between 31 January 2020 and 15 April 2021 using data from the Medicines and Healthcare products Regulatory Agency (MHRA) CPRD Aurum. CPRD Aurum is an anonymized database of primary care medical records of over 7 million actively registered patients in general practices that use the EMIS clinical information system29. It captures data on patient demographics, diagnoses, symptoms, prescriptions, referrals and tests. Structured data on diagnoses, symptoms and referrals are recorded using SNOMED CT coding terminology. Selection of SNOMED CT codes for data extraction was conducted by a team of clinical researchers using an in-house software platform called Code Builder, with systematic searching of existing code lists, reference to the SNOMED CT terminology browser, and clinical knowledge and discussion. Data extraction was performed using the data extraction for epidemiological research (DExtER) tool for automated clinical epidemiological studies30.

Participants. Patients aged 18 years and older with a minimum registration period of 12 months were included in the study. Practices were considered eligible 12 months after they were deemed to be providing research quality data. The cohort of patients with SARS-CoV-2 infection was defined as patients with a coded record of a positive RT–PCR test or antigen test result for SARS-CoV-2 and without a record of hospitalization 14 d before or 42 d after infection (within 28 d of infection with a ±14-d grace period for clinical coding delays) in the primary care record. Their index date was assigned as the date of confirmation of SARS-CoV-2 infection. SNOMED CT codes for defining COVID-19 are listed in Supplementary Table 12a,b. For each patient infected with SARS-CoV-2, a pool of patients without a record of suspected or confirmed COVID-19 was selected from the database. These patients were assigned the same index date as the corresponding patient infected with SARS-CoV-2 to mitigate immortal time bias31.

Propensity score matching. To control for confounding, each patient infected with SARS-CoV-2 was propensity score-matched with up to four patients with no recorded evidence of SARS-CoV-2 infection using a logistic regression model including the covariates listed in the covariates section below and a caliper width of 0.2. The standardized mean difference (SMD) between patients infected with SARS-CoV-2 and patients with no recorded evidence of infection was reported for each variable before and after matching, and a variable with SMD > 0.1 after matching was considered to indicate imbalance in baseline characteristics. Kernel density plots were drawn for the two groups before and after matching to check the distribution of propensity scores.

Outcomes and follow-up. We identified 115 relevant symptoms coded within primary care records (Supplementary Table 13) through a systematic review and meta-analysis of long COVID symptoms11, a scoping search of long COVID clinical assessment questionnaires, qualitative interviews with patients, a clinician survey and refinement of the symptom list using psychometric methods32. These were grouped into 15 domains: (1) breathing, (2) pain, (3) circulation, (4) fatigue, (5) cognitive health, (6) movement, (7) sleep, (8) ear, nose and throat, (9) stomach and digestion, (10) muscles and joints, (11) mental health, (12) hair, skin and nails, (13) eyes, (14) reproductive health and (15) other symptoms. SNOMED CT code lists for the symptoms are published on GitHub (https://github.com/AnuSub/LongCOVID_Symptoms_CodeList).

Our primary outcome definition of long COVID was pre-defined as the presence of at least one symptom included in the WHO case definition at ≥12 weeks after infection (Supplementary Table 13)6. Our secondary outcome definition of long COVID was derived post hoc as the presence of at least one symptom that was statistically associated with SARS-CoV-2 infection at ≥12 weeks after infection within this study (Supplementary Table 13).

The 115 symptoms were consolidated into 50 distinct symptoms to be included as categorical indicator variables for latent class analysis. This was carried out to avoid producing clusters of (1) commonly occurring symptoms that are not associated with COVID-19, (2) symptoms with mutually inclusive SNOMED CT codes (such as pain and chest pain) and (3) symptoms that commonly co-appear (such as nausea and vomiting).

Patients were followed up from the index date until the earliest of the following end points (patient exit date): (1) recording of symptoms of interest within the time interval studied, (2) death, (3) transfer out of practice, (4) end of general practice data and (5) study end date (15 April 2021). The follow-up period was split into three time periods from the index date: (1) the first 4 weeks (‘acute COVID-19’ among the cases), (2) 4–12 weeks (‘ongoing symptomatic COVID-19’) and (3) after 12 weeks (period of ‘post-COVID-19 condition’ or ‘long COVID’), in accordance with the current NICE guidelines on managing the long-term effects of COVID-194.

Covariates. We extracted data on demographic characteristics (age, sex, ethnic group, socioeconomic status and IMD), index week, BMI, smoking status and 87 chronic health conditions (Supplementary Table 1). We extracted data on 115 symptoms recorded in the period between 12 months and 3 months before the index date (at baseline). These variables were used to generate propensity scores for symptom burden to ensure that pre-existing health conditions and symptoms did not differ between the cohorts of patients with and without recorded evidence of SARS-CoV-2 infection.

BMI was categorized as underweight (<18.5 kg m⁻²), healthy weight (18.5–24 kg m⁻²), overweight (25–29 kg m⁻²) and obese (≥30 kg m⁻²). Smoking status was categorized as never smoked, ex-smoker and current smoker. Ethnic group was categorized as either white, Asian (origin from India, Pakistan, China, Cambodia, Thailand, Vietnam, Malaysia, Sri Lanka, Nepal, Bangladesh, Japan or Taiwan), Black Afro-Caribbean, mixed or other ethnic group (native American, Middle Eastern and Polynesian origin). Missing data on ethnic group, socioeconomic status, BMI and smoking status were denoted by a ‘missing’ category within the corresponding variable.

Statistical analysis. Continuous variables were summarized as mean and s.d. and categorical variables as frequencies and percentages. A series of Cox proportional hazards regression models were used to provide aHRs for each of the individual symptoms among patients with SARS-CoV-2 infection compared to patients with no recorded evidence of SARS-CoV-2 infection, separately for the first 4 weeks, 4–12 weeks and ≥12 weeks after the index date, with follow-up initiating from the index date, 4 weeks after the index date and 12 weeks after the index date, respectively. Patients with a minimum follow-up period of 4 and 12 weeks were included in the symptom outcome analyses at 4–12 and ≥12 weeks, respectively. Adjustments were made for age, sex, ethnic group, socioeconomic status, BMI, smoking status and the specified symptom recorded at baseline between 3 and 12 months before the index date. Multiple testing was accounted for by incorporating a Bonferroni correction to adjust the P value thresholds for statistical significance. Symptoms with statistically significant aHRs after Bonferroni correction in the period ≥12 weeks after the index date were presented in a forest plot. A post hoc subgroup analysis was performed in cohorts of patients who were infected before and after 31 August 2020 (first and second surge of the pandemic) and propensity score-matched patients within the same sub-study period.

In a cohort restricted to patients with a positive RT–PCR or antigen test result for SARS-CoV-2 and a minimum of 12 weeks of follow-up, unadjusted and adjusted Cox proportional hazards models were used to assess the association between the risk factors described in the covariates section and the primary (at least one of the symptoms in the WHO case definition for long COVID) and secondary (at least one of the symptoms statistically associated with SARS-CoV-2 infection) outcome definitions of long COVID. The median follow-up period and IQR were reported for patients within each risk factor stratum. Hazard ratios were obtained by taking exponentiated coefficients from the Cox proportional hazards models, and we considered covariates with a P value <0.05 to be statistically significant.

A post hoc latent class analysis was performed on the 50 consolidated symptoms, and the model at the elbow point of fit for the Bayesian information criterion was considered optimal33,34. A multinomial logistic regression model was fitted to identify the demographic features associated with each of the latent long COVID classes compared to patients without long COVID. All analyses were performed in Stata IC v.16 or R v.4.0.4.

Ethical approval. CPRD obtains annual research ethics approval from the UK’s Health Research Authority Research Ethics Committee (East Midlands, Derby; reference no. 05/MRE04/87) to receive and supply patient data for public health research. Therefore, no additional ethics approval is required for observational studies using CPRD Aurum data for public health research, subject to individual research protocols meeting CPRD data governance requirements. The use of CPRD Aurum data for the study was approved by the CPRD Independent Scientific Advisory Committee (reference no. 21_000423).

Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
Access to anonymized patient data from CPRD is subject to a data sharing agreement containing detailed terms and conditions of use following protocol approval from the MHRA Independent Scientific Advisory Committee. This study-specific analyzable dataset is therefore not publicly available but can be requested from the corresponding author, subject to research data governance approvals. Details about Independent Scientific Advisory Committee applications and data costs are available on the CPRD website (cprd.com).

Code availability
Stata and R codes are available at https://github.com/AnuSub/Stata-and-R-codes.

References
28. Haroon, S. et al. Therapies for long COVID in non-hospitalised individuals: from symptoms, patient-reported outcomes and immunology to targeted therapies (the TLC study). BMJ Open 12, e060413 (2022).
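
Illustrative sketch (not the authors' code). The balance criterion described under Propensity score matching, where a variable with SMD > 0.1 after matching is taken to indicate imbalance, can be illustrated with a minimal Python sketch. The Methods do not state which SMD formula was used, so the commonly used definition based on the average of the two group variances is assumed here. The function names (smd_continuous, smd_binary, is_imbalanced) are hypothetical and do not come from the study's Stata or R code.

```python
import math
from statistics import mean, variance


def smd_continuous(exposed, comparator):
    """SMD for a continuous covariate (assumed formula: difference in means
    divided by the square root of the average of the two sample variances)."""
    pooled_sd = math.sqrt((variance(exposed) + variance(comparator)) / 2)
    return (mean(exposed) - mean(comparator)) / pooled_sd


def smd_binary(p_exposed, p_comparator):
    """SMD for a binary covariate, given the proportion with the
    characteristic in each group (same assumed pooling as above)."""
    pooled_sd = math.sqrt(
        (p_exposed * (1 - p_exposed) + p_comparator * (1 - p_comparator)) / 2
    )
    return (p_exposed - p_comparator) / pooled_sd


def is_imbalanced(smd, threshold=0.1):
    """Apply the threshold stated in the Methods: |SMD| > 0.1 after matching
    is read as imbalance in that baseline characteristic."""
    return abs(smd) > threshold
```

Under these assumptions, a covariate present in 60% of one group and 50% of the other would be flagged as imbalanced, whereas 51% versus 50% would not. Both functions divide by the pooled standard deviation, so a covariate with no variation in either group would need separate handling.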

29. Wolf, A. et al. Data resource profile: Clinical Practice Research Datalink (CPRD) Aurum. Int. J. Epidemiol. 48, 1740 (2019).
30. Gokhale, K. M. et al. Data extraction for epidemiological research (DExtER): a novel tool for automated clinical epidemiology studies. Eur. J. Epidemiol. 36, 165–178 (2021).
31. Yadav, K. & Lewis, R. J. Immortal time bias in observational studies. JAMA 325, 686–687 (2021).
32. Hughes, S. E. et al. Development and validation of the symptom burden questionnaire for long COVID (SBQ-LC): Rasch analysis. BMJ https://doi.org/10.1136/bmj-2022-070230 (2022).
33. Nylund, K. L., Asparouhov, T. & Muthén, B. O. Deciding on the number of classes in latent class analysis and growth mixture modeling: a Monte Carlo simulation study. Struct. Equ. Modeling https://doi.org/10.1080/10705510701575396 (2007).
34. Weller, B. E., Bowen, N. K. & Faubert, S. J. Latent class analysis: a guide to best practice. J. Black Psychol. https://doi.org/10.1177/0095798420930932 (2020).

Acknowledgements
We thank the funders of our TLC study (COV-LT-0013), NIHR and UKRI, all the patients on the TLC Lived Experience Advisory Group and N. Mangat for supporting LEAP member recruitment. We also thank A. Walker, K. Jones and Y. Lee for providing administrative support for the study.

Author contributions
S.H., K.N. and M.C. conceived the research question and idea for the study. S.H., K.N., A. Subramanian, P.M. and M.C. agreed the study methods. P.M. and T.W. facilitated data acquisition. K.G. supported data extraction and data management. S.E.H., M.C., S.H., O.L.A., C.M. and G.T. informed the selection of symptoms. T.T. provided statistical advice. A. Subramanian, K.B., N.S.W., M.S., F.K., K.O., R.H., N.B., N.C., S.L., G.T. and S.H. developed SNOMED CT code lists for data extraction. A. Shah provided advice on study design and checked code lists. A. Subramanian performed the statistical analysis with input from S.H. and K.N. A. Subramanian and S.H. drafted the manuscript with input from all co-authors. K.B. supported the use of the RECORD checklist for drafting the manuscript. K.L.M. and J.C. provided patient and public involvement for the study. All co-authors reviewed and approved the final draft of the manuscript.

Competing interests
M.C. is Director of the Birmingham Health Partners Center for Regulatory Science and Innovation and Director of the Center for Patient-Reported Outcomes Research and is an NIHR Senior Investigator. M.C. receives funding from the NIHR Birmingham Biomedical Research Center (BRC), NIHR Surgical Reconstruction and Microbiology Research Center (SRMRC), NIHR Birmingham-Oxford Blood and Transplant Research Unit (BTRU) in Precision Transplant and Cellular Therapeutics and NIHR Applied Research Collaboration (ARC) West Midlands at the University of Birmingham and University Hospitals Birmingham NHS Foundation Trust, Health Data Research UK, Innovate UK (part of UK Research and Innovation), Macmillan Cancer Support, SPINE UK, UKRI, UCB Pharma, Janssen, GSK and Gilead. M.C. has received personal fees from Astellas, Aparito, CIS Oncology, Takeda, Merck, Daiichi Sankyo, Glaukos, GSK and the Patient-Centered Outcomes Research Institute outside the submitted work. S.E.H. receives funding from NIHR ARC, West Midlands and the NIHR BTRU in Precision Transplant and Cellular Therapeutics at the University of Birmingham. S.E.H. declares personal fees from Cochlear and Aparito outside the submitted work. O.L.A. receives funding from the NIHR Birmingham BRC, NIHR ARC, West Midlands, NIHR BTRU in Precision Transplant and Cellular Therapeutics at the University of Birmingham and University Hospitals Birmingham NHS Foundation, Innovate UK, Gilead Sciences, Janssen Pharmaceuticals and Sarcoma UK. O.L.A. declares personal fees from Gilead Sciences, GSK and Merck outside the submitted work. C.M. receives funding from NIHR SRMRC, NIHR BTRU in Precision Transplant and Cellular Therapeutics and Innovate UK and has received personal fees from Aparito outside the submitted work. A.D.S. is supported by a postdoctoral fellowship from THIS Institute, NIHR University College London Hospitals BRC, grants from NIHR and British Heart Foundation Accelerator Award. E.S. has received grants from the Wellcome Trust, MRC, NIHR EME, NIHR HTA, HDR-UK, BLF, EPSRC and Alpha 1 Foundation in the last 36 months. She has received an honorarium for lectures about COVID-19 treatments, which are run by GSK, attended a virtual conference at the European Respiratory Society in 2020 that was funded by AstraZeneca and participated in an advisory board for COPD, which is run by Boehringer Ingelheim. S.M. has received funding from NIHR (RfPB, PGfAR, HTA and EME streams), UKRI, ESRC and the Midlands Engine. He has attended educational events funded by Psychiatric Genetic Testing, Janssen and Lundbeck in the last 5 years. P.M., T.W., C.I. and E.L. are employees of CPRD, the data custodians for CPRD Aurum. CPRD is jointly sponsored by the UK Government’s MHRA and NIHR. As a not-for-profit UK Government body, CPRD seeks to recoup the cost of delivering its research services to academic, industry and government researchers through research user license fees. J.C. receives funding from NIHR on PPI from a study at UCL (NIHR132914) and a study at University Hospitals Bristol (NIHR203304). J.C. is a lay member on the NICE COVID expert panel and a citizen partner on the COVID END Evidence Synthesis Global Horizon Scanning panel. J.C. declares personal fees from MEDABLE, GlaxoSmithKline and Roche Canada outside of submitted work. K.L.M. is a trustee and volunteer at long COVID SOS. K.L.M. is on the long COVID Advisory Board for Dysautonomia International and is employed by NIHR. T.M. receives funding from NIHR ARC, West Midlands. G.V.G. receives funding from the NIHR Birmingham ECMC, NIHR Birmingham SRMRC, Nanocommons H2020-EU (731032), MAESTRIA (grant agreement no. 965286) and the MRC Health Data Research UK (HDRUK/CFC/01). J.M.L. receives funding from the MRC, Versus Arthritis, NIHR, FOREUM, UKSPINE and the Scar Free Foundation and declares personal fees from Bayer. F.K. is supported by an NIHR Doctoral Fellowship award (grant no. 300688). M.J.P. is supported by NIHR BRC. All other co-authors declare no competing interests. The views expressed are those of the investigators, and the funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Additional information
Extended data is available for this paper at https://doi.org/10.1038/s41591-022-01909-w.
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41591-022-01909-w.
Correspondence and requests for materials should be addressed to Krishnarajah Nirantharakumar.
Peer review information Nature Medicine thanks Judith Bruchfeld and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling editor: Jennifer Sargent, in collaboration with the Nature Medicine team.
Reprints and permissions information is available at www.nature.com/reprints.


Extended Data Fig. 1 | Kernel density plot of propensity scores of patients infected with SARS-CoV-2 and comparator cohort of patients with no recorded evidence of SARS-CoV-2 infection, before and after propensity score matching.


Extended Data Fig. 2 | Symptoms associated with SARS-CoV-2 ≥12 weeks post-infection before and after 31 August 2020 (first and second surges of the pandemic in the UK) – Part A.


Extended Data Fig. 3 | Symptoms associated with SARS-CoV-2 ≥12 weeks post-infection before and after 31 August 2020 (first and second surges of the pandemic in the UK) – Part B.


Extended Data Fig. 4 | Symptoms associated with SARS-CoV-2 ≥12 weeks post-infection before and after 31 August 2020 (first and second surges of the pandemic in the UK) – Part C.


Extended Data Fig. 5 | Elbow plot to determine the optimal number of classes.


Extended Data Fig. 6 | Symptom clusters among patients with long COVID from latent class analysis.
