Annrheumdis 2020 218636.full
Sandling, Johanna K.; Pucholt, Pascal; Hultin Rosenberg, Lina; Farias, Fabiana H.G.;
Kozyrev, Sergey V.; Eloranta, Maija Leena; Alexsson, Andrei; Bianchi, Matteo; Padyukov,
Leonid; Bengtsson, Christine; Jonsson, Roland; Omdal, Roald; Lie, Benedicte A.; Massarenti,
Laura; Steffensen, Rudi; Jakobsen, Marianne A.; Lillevang, Søren T.; Lerang, Karoline;
Molberg, Øyvind; Voss, Anne; Troldborg, Anne; Jacobsen, Søren; Syvänen, Ann Christine;
Jönsen, Andreas; Gunnarsson, Iva; Svenungsson, Elisabet; Rantapää-Dahlqvist, Solbritt;
Bengtsson, Anders A.; Sjöwall, Christopher; Leonard, Dag; Lindblad-Toh, Kerstin; Rönnblom,
Lars
Published in:
Annals of the Rheumatic Diseases
DOI:
10.1136/annrheumdis-2020-218636
Publication date:
2021
Document version:
Final published version
Document license:
CC BY
Terms of use
This work is brought to you by the University of Southern Denmark.
Unless otherwise specified it has been shared according to the terms for self-archiving.
If no other license is stated, these terms apply:
Ann Rheum Dis: first published as 10.1136/annrheumdis-2020-218636 on 9 October 2020. Downloaded from http://ard.bmj.com/ on December 7, 2020 by guest. Protected by copyright.
TRANSLATIONAL SCIENCE
behind these abnormalities are both environmental and genetic, and today around 100 SLE susceptibility loci have been identified.3 4 Monogenic forms of SLE exist, but for a majority of patients the environment and the cumulative number of susceptibility alleles will influence the risk of developing the disease.4 5
To date, the contribution of rare genetic variants and the impact of regulatory variants have not been widely explored in SLE. DNA sequencing has the potential to discover novel SLE associated variants not captured by genotyping arrays. Due to the high cost, whole genome sequencing studies (WGS) in SLE have so far mainly focused on families or smaller samples, as have exome sequencing studies (WES).6–9 Today it is feasible to perform targeted sequencing in larger cohorts; however, the number of such studies focusing on SLE is still limited.10 Additionally, association analysis for rare variants discovered through sequencing is hampered by low statistical power. Aggregating variants on the gene level or by molecular pathway information is one approach to increase power and gain biological insight from rare variants.11
The clinical heterogeneity in SLE is likely due to an underlying molecular diversity that could have implications for therapy. In recent years this has started to be addressed, mainly using gene expression, autoantibody profiles and cytokines to identify groups of patients with SLE with distinct molecular disease mechanisms.12–14 Using genetic information to stratify patients would have the advantage of providing stable molecular markers for early classification.
Here, we performed targeted sequencing of regulatory and coding regions in a Swedish SLE case–control cohort. We aimed at elucidating the genetic aetiology of SLE from the immunity pathway level to the single variant level, and stratify patients with SLE into molecular subgroups. Altogether around 9% of all genes in the human genome were analysed based on their role in immune-mediated diseases. Gene regions were extended to include promoters and other potentially regulatory elements based on mammalian conservation.15
METHODS
For full details on methods see online supplemental methods.
Subjects and DNA samples
The Swedish SLE cohorts included patients recruited at five rheumatology clinics and the controls were healthy blood donors and population controls. The quality-controlled dataset comprised 958 patients with SLE and 1026 control individuals. Patients with SLE fulfilled at least four of the classification criteria for SLE as defined by the American College of Rheumatology (ACR).16 17 Clinical characteristics of the patients are available in online supplemental tables S1A and B.
Targeted DNA sequencing analysis
Targeted DNA sequencing was performed in the Swedish SLE case–control cohorts. A SeqCap EZ Choice XL sequence capture panel was designed, libraries were prepared as described elsewhere18 and sequenced on an Illumina HiSeq 2500. An overview of the variant discovery and quality control steps can be found in online supplemental figure S1. Study subjects falling outside of the European subpopulation of the Human Genome Diversity Project (HGDP) reference set were excluded (online supplemental figure S7).19 The quality-controlled dataset contained 287 354 single-nucleotide variants (SNVs) and covered 1832 of the targeted gene regions.
Genetic association analyses
Several variant sets were generated for aggregate association testing: (1) 1832 individual gene variant sets; (2) 35 pathway variant sets based on the Kyoto Encyclopaedia of Genes and Genomes (KEGG)20; (3) five literature review gene sets: the type I interferon pathway,21 interferonopathy genes,22 23 SLE Genome-Wide Association Study (GWAS) genes,3 4 the complement subset of KEGG hsa04610 and genes causing monogenic SLE or lupus-like disease.24 Aggregate association testing was performed using Sequence Kernel Association Optimal Test (SKAT-O) or GenePy.25 26 Single variant association analyses were performed in PLINK. SLE case-only variants were identified by removing all SNVs present in our Swedish control dataset, the SweGen project or the Genome Aggregation Database European non-Finnish controls.27 28
Risk scores and cluster analysis
Cumulative pathway SLE polygenic risk scores (pathway PRSs) were assigned to each individual based on SNVs associated with SLE at nominal significance. For each independent SNV the natural logarithm of the OR for SLE susceptibility was multiplied by the number of minor alleles in each individual. The sum of all products of all genes in each of the 35 KEGG pathways for each patient was defined as the individual pathway PRS. Hierarchical cluster analysis of pathway PRSs was used to identify groups of patients with SLE.
Replication study and meta-analysis
Replication genotyping in individuals from Norway and Denmark was performed using the MassARRAY system. The Swedish SLE case–control study was expanded to include an additional 1000 control individuals.27 The Scandinavian meta-analysis included 1794 patients with SLE and 3241 control individuals.
RESULTS
We performed a DNA sequencing study in SLE to study immunity pathways, an overview of analyses can be found in online supplemental figure S2.
T lymphocyte differentiation and innate immunity pathways are associated with SLE
The sequencing data analysis focused on 1832 genes with relevance for immune-mediated diseases. These genes mainly belong to 35 molecular signalling pathways as defined by the KEGG database (online supplemental table S2).20 Using an aggregate test for all variants in the genes belonging to each pathway, we found that 21 of the tested pathways were associated with SLE (false discovery rate (FDR) <0.05, table 1 and online supplemental table S3). The most significantly associated pathways included T helper cell differentiation pathways, with Th1 and Th2 cell differentiation as the top result (FDR_Th1-2=2.2×10⁻⁹; FDR_Th17=1.5×10⁻⁸), followed by antigen processing and presentation (FDR=3.1×10⁻⁹).
We next explored a sequential elimination strategy to identify independent pathway associations. First, removing all Th1 and Th2 pathway genes in the pathway aggregate association test resulted in the antigen processing and presentation pathway as the top result (FDR=4.8×10⁻⁶). Second, antigen processing and presentation as well as Th1 and Th2 pathway genes were removed, which resulted in Complement and coagulation cascades as the top result (FDR=0.0091). Third, also genes in this pathway were removed, and the janus kinase-signal transducers and activators of transcription (JAK-STAT) pathway became the
2 Sandling JK, et al. Ann Rheum Dis 2020;0:1–9. doi:10.1136/annrheumdis-2020-218636
Systemic lupus erythematosus
Table 1 SLE case–control pathway based aggregate association analysis
Pathway | Genes in pathway | Genes in test | SNVs in test | P value* | FDR†
Th1 and Th2 cell differentiation (hsa04658) | 92 | 78 | 14 362 | 6.3E-11 | 2.2E-09
Antigen processing and presentation (hsa04612) | 77 | 40 | 8017 | 1.8E-10 | 3.1E-09
Hematopoietic cell lineage (hsa04640) | 97 | 71 | 13 013 | 3.8E-10 | 4.5E-09
Th17 cell differentiation (hsa04659) | 107 | 96 | 19 347 | 1.7E-09 | 1.5E-08
Intestinal immune network for IgA production (hsa04672) | 49 | 39 | 7909 | 3.4E-08 | 2.4E-07
Natural killer cell-mediated cytotoxicity (hsa04650) | 131 | 100 | 15 821 | 4.7E-06 | 2.8E-05
TNF signalling pathway (hsa04668) | 112 | 88 | 12 639 | 1.9E-05 | 9.4E-05
JAK-STAT signalling pathway (hsa04630) | 162 | 133 | 18 003 | 7.4E-05 | 0.00032
RIG-I-like receptor signalling pathway (hsa04622) | 70 | 63 | 8459 | 0.00021 | 0.00080
NOD-like receptor signalling pathway (hsa04621) | 178 | 109 | 15 729 | 0.00031 | 0.0011
Complement and coagulation cascades (hsa04610) | 79 | 50 | 7112 | 0.00041 | 0.0013
Toll-like receptor signalling pathway (hsa04620) | 104 | 96 | 12 178 | 0.00080 | 0.0022
Cytokine-cytokine receptor interaction (hsa04060) | 294 | 221 | 26 771 | 0.00083 | 0.0022
C-type lectin receptor signalling pathway (hsa04625) | 104 | 75 | 12 986 | 0.0020 | 0.0050
IL-17 signalling pathway (hsa04657) | 93 | 68 | 9358 | 0.0043 | 0.0100
Fc epsilon RI signalling pathway (hsa04664) | 68 | 51 | 8514 | 0.0052 | 0.011
Viral protein interaction with cytokine and receptor (hsa04061) | 100 | 75 | 8435 | 0.0062 | 0.013
NF-kappa B signalling pathway (hsa04064) | 102 | 88 | 14 349 | 0.0078 | 0.015
Osteoclast differentiation (hsa04380) | 128 | 101 | 18 602 | 0.013 | 0.023
T cell receptor signalling pathway (hsa04660) | 103 | 85 | 14 268 | 0.014 | 0.025
Cytosolic DNA-sensing pathway (hsa04623) | 63 | 40 | 4993 | 0.015 | 0.025
Pathways with FDR <0.05 in the association analysis including all genes are presented.
*SKAT-O SLE case–control association p value.
†SKAT-O SLE case–control association FDR.
FDR, false discovery rate; IL-17, interleukin 17; NF, nuclear factor; NOD, nucleotide-binding oligomerisation domain; RIG, retinoic acid-inducible gene; SKAT-O, sequence kernel association optimal test; SLE, systemic lupus erythematosus; SNV, single-nucleotide variant; TNF, tumour necrosis factor.
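The text does not state which FDR procedure was applied to the SKAT-O p values. As a minimal sketch, assuming a Benjamini-Hochberg adjustment over the 35 tested pathways, the function below approximately reproduces the FDR column for the top rows of Table 1 from their rounded p values. The function name and the `n_tests` parameter are illustrative choices, not part of the study's pipeline.

```python
def benjamini_hochberg(p_values, n_tests=None):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    n_tests lets the caller state the total number of tests when only the
    smallest p-values are supplied (an assumption of this sketch).
    """
    m = n_tests if n_tests is not None else len(p_values)
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    adjusted = [0.0] * len(p_values)
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(len(order), 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Rounded p-values of the four top pathways in Table 1.
top_p_values = [6.3e-11, 1.8e-10, 3.8e-10, 1.7e-9]
top_fdr = benjamini_hochberg(top_p_values, n_tests=35)
print(["%.1e" % value for value in top_fdr])
```

Because the published p values are rounded, small differences from the tabulated FDR values are expected. Supplying a subset together with `n_tests` is only valid when the subset holds the smallest p values of the full test set.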
top result (FDR=0.014). Lastly, when removing genes in all these four pathways no significant pathways remained. Thus, our data point to two main routes with genetic evidence of association to SLE: T cell differentiation and innate immunity.
To identify the genes that underlie the association signals in the T-cell differentiation, antigen processing and presentation, Complement and coagulation and JAK-STAT pathways, gene-based association testing was performed (figure 1). The top association for the JAK-STAT pathway originated from the IFN kappa (IFNK) gene region. SLE-associated genes in the T cell differentiation and antigen processing and presentation pathways were dominated by genes in the HLA region, and for the complement and coagulation cascade pathway, complement genes located in the HLA region were highly significantly associated with SLE.
Pathway PRS define subsets of patients with SLE
Having identified pathways with genetic association with SLE, we hypothesised that different patients with SLE could have distinct pathways affected. We constructed pathway PRS for each individual and each of the pathways, by combining the burden of common SLE associated alleles from our sequencing data. Individuals with a pathway PRS higher than that observed for the 97.5th percentile of control individuals were classified as positive for that pathway (online supplemental figure S3). The largest proportion of positive SLE patients was observed for the Cytokine-cytokine receptor interaction pathway (41%, figure 2A, and online supplemental table S4). For the Th1 and Th2 cell differentiation, antigen processing and presentation, Complement and coagulation cascades and JAK-STAT signalling pathways 18%, 16%, 21% and 29% of patients with SLE were positive, respectively. On average each SLE patient tested positive for the pathway PRS for seven pathways (figure 2B). As we had previously observed that a high SLE genetic risk score was associated with organ damage in SLE, we investigated whether this could be observed for specific pathways.5 We found that the SLE International Collaborating Clinics Damage Index was significantly higher in the SLE patients positive for the T cell or B cell receptor signalling pathways (figure 3A,B). No other pathways were associated with clinical manifestations of SLE or survival.
We then performed a hierarchical cluster analysis on the pathway PRSs in SLE, to identify groups of patients with similar molecular aetiology. Four clusters of patients were identified (figure 4). The pathway with the most significant difference in PRS between clusters was the antigen processing and presentation pathway, followed by Th17 cell differentiation (online supplemental figure S4). Next, we investigated whether the molecular stratification of patients with SLE also mirrored differences in clinical presentation between groups. We found that the presence of autoantibodies against Sjögren’s syndrome-related antigens SSA and/or SSB was more common among patients in clusters 3 and 4 (figure 3C). We did not observe any significant difference in other clinical features, including survival, between the four patient clusters.
Common variants contribute risk at monogenic risk loci in SLE
We then focused our analysis on gene-sets with prior evidence for involvement in SLE, but which were not defined in KEGG, to investigate the impact of both rare and common variants for these groups of genes. We found that interferon system, interferonopathy, SLE GWAS, complement system and monogenic SLE and lupus-like disease genes in aggregate were associated with SLE when analysing variants of all minor allele frequencies (MAF) (table 2 and online supplemental table S5). Only the
Figure 1 Results of SLE case–control gene-based association analyses. P values for association plotted against chromosomal location, where
each point represents a gene region. The line indicates a false discovery rate of 5%. The y-axis has been cut at p=1×10⁻¹⁵. Genes belonging to
the T-cell differentiation (Th1 and Th2), antigen processing and presentation, complement and coagulation or JAK-STAT signalling pathways are
highlighted, and their most significant genes or gene regions are indicated by name. IFNK, interferon kappa; IL21, interleukin 21; SLE, systemic lupus
erythematosus.
monogenic SLE and lupus-like disease gene-set was significantly associated with SLE when separately analysing the rarer variant (MAF <0.01) contribution (table 2). There was a clear common variant (MAF >0.05) contribution to associations for the interferonopathy, SLE GWAS, complement system and monogenic SLE and lupus-like disease gene-sets (table 2).
Potentially novel SLE risk loci
Next, we asked whether we could detect novel SLE risk loci, regardless of pathway or gene-set membership. Two potentially novel gene regions passed a Bonferroni corrected threshold in the gene-based SLE case–control association analyses: PABPC4 (p=4.3×10⁻⁸) and IFNK (p=1.2×10⁻⁵, online supplemental figure 5A, tables S6 and S7). In single variant association analyses, we observed SNV associations at three potentially novel SLE risk loci, CAPN13, MOB3B/IFNK and HAL, at a suggestive significance threshold (p<1×10⁻⁴, online supplemental figure 5B–E, table S8). As the association signals at CAPN13, MOB3B/IFNK and HAL had not been reported in SLE GWAS in other ancestries, we attempted to replicate these findings in additional Scandinavian SLE cases and controls (online supplemental table S1A). However, we did not find additional support for a role of SNVs at these novel loci in SLE (online supplemental table S9).
Patients with SLE carry unique coding variants
We next investigated whether there was an increased rare coding mutational burden for patients with SLE at the 1832 genes. We observed that all individuals carried rare non-synonymous
Figure 2 Pathway SLE polygenic risk scores. (A) Illustrates pathway Polygenic Risk Scores (PRS) for the Cytokine–cytokine receptor interaction
pathway. P values represent differences in PRS between patients with SLE (SLE) and healthy control individuals (HC). The dashed line indicates the
PRS 97.5 percentile in control individuals. (B) The number of pathways each individual patient with SLE tested positive for using the pathway PRS. On
average patients were positive for 7.2 pathways. SLE, systemic lupus erythematosus.
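The pathway PRS described in the methods (the natural logarithm of the OR multiplied by the number of minor alleles, summed over the SNVs of a pathway) and the 97.5th percentile rule for calling a patient positive can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the SNV identifiers, odds ratios and control scores are made up, the selection of independent, nominally significant SNVs is taken as already done, and linear interpolation is an assumed percentile convention.

```python
import math


def pathway_prs(minor_allele_counts, odds_ratios):
    """Sum ln(OR) * minor-allele count over the SNVs assigned to one pathway."""
    return sum(
        math.log(odds_ratios[snv]) * count
        for snv, count in minor_allele_counts.items()
    )


def percentile(values, q):
    """Percentile (q in 0..100) using linear interpolation (assumed convention)."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def is_pathway_positive(score, control_scores, q=97.5):
    """True when the score exceeds the q-th percentile of the control scores."""
    return score > percentile(control_scores, q)


# Hypothetical odds ratios, genotypes and control scores, for illustration only.
example_odds_ratios = {"snv_a": 1.5, "snv_b": 0.8, "snv_c": 2.0}
example_patient = {"snv_a": 2, "snv_b": 1, "snv_c": 0}
example_controls = [i / 100 for i in range(101)]

patient_score = pathway_prs(example_patient, example_odds_ratios)
print(round(patient_score, 4))
print(is_pathway_positive(patient_score, example_controls))
```

Protective alleles (OR below 1) contribute negative terms, so a pathway score can decrease as well as increase with the number of minor alleles carried.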
Figure 3 Pathway SLE polygenic risk scores grouping and clustering. (A, B) The Systemic Lupus Erythematosus International Collaborating Clinics
(SLICC) damage index for patients with SLE positive and negative for the T cell receptor and B cell receptor signalling pathways. P values represent
differences in Damage Index between pathway positive and negative patients; uncorrected p values are presented (Bonferroni corrected threshold
p=0.00143). (C) Prevalence of Sjögren’s syndrome (SSA and/or SSB) autoantibodies in SLE patients in the four clusters. The p value represents the difference in
SSA/SSB autoantibody status between clusters of SLE patients; the uncorrected p value is presented (Bonferroni corrected threshold p=0.002).
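The two Bonferroni thresholds quoted in this caption are consistent with a significance level of 0.05 divided by 35 and by 25 tests, respectively. The caption does not state the alpha or the number of tests, so both are inferences; 35 matches the number of KEGG pathways analysed. A short worked check:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Assumed alpha of 0.05; the test counts are inferred from the quoted thresholds.
print(round(bonferroni_threshold(0.05, 35), 5))
print(round(bonferroni_threshold(0.05, 25), 5))
```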
variants, with an average number of around 32 variants per individual for both patients with SLE and control individuals (online supplemental figure S6). None of the patients with SLE were homozygous carriers of rare non-synonymous alleles in genes for monogenic SLE and lupus-like diseases (online supplemental table S10). Next, we hypothesised that protein coding variants observed exclusively in patients with SLE could be causal candidates. A total of 1475 case-only nonsynonymous variants were identified in the 958 patients with SLE (online supplemental table S11). These were variants that were observed in at least one patient with SLE, but not in control individuals of similar ancestry.27 28 The most frequent of these SNVs was found in the MUC5B gene which encodes mucin 5B, the major gel-forming mucin in mucus (table 3). Five patients with SLE carried the same deleterious MUC5B missense mutation (rs773068050, p.Thr2724Pro). MUC5B gene variants have previously been associated with interstitial lung disease (ILD), a condition affecting around 3% of Swedish patients with SLE.29–31 However, there was no evidence of ILD in these five patients, but two of them had suffered from pleuritis (online supplemental table S12). In conclusion, we did not find evidence for SLE patients carrying a generally increased burden of rare coding variants at these genes. However, our analysis identified a number of coding variants observed exclusively in patients with SLE. This catalogue of variants could serve as a resource for future studies investigating the role of case-only SNVs in SLE.
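The case-only definition used here (variants seen in at least one patient but absent from the Swedish controls, SweGen and the gnomAD non-Finnish European controls) amounts to a set difference. A minimal sketch, assuming each variant is keyed by a (chromosome, position, reference allele, alternate allele) tuple; the key format and all example variants below are hypothetical:

```python
def case_only_variants(case_variants, *reference_sets):
    """Return the case variants that are absent from every reference set."""
    seen_elsewhere = set().union(*reference_sets)
    return set(case_variants) - seen_elsewhere


# Made-up variant keys, for illustration only.
cases = {("11", 1000, "A", "C"), ("1", 2000, "G", "T"), ("6", 3000, "C", "T")}
study_controls = {("1", 2000, "G", "T")}
swegen = {("6", 3000, "C", "T")}
gnomad_nfe = set()

print(case_only_variants(cases, study_controls, swegen, gnomad_nfe))
```

Matching on the full tuple means a different alternate allele at the same position counts as a distinct variant, which is one reasonable reading of "not in control individuals" but is not confirmed by the text.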
Figure 4 Clustering of patients with SLE based on pathway Polygenic Risk Scores (PRS). Heat map with pathways on the x-axis (KEGG IDs) and
individuals on the y-axis based on normalised PRS. Hierarchical cluster analysis was performed based on the PRS per pathway for each individual. The
colour bar on the left indicates the four main clusters of individuals identified. KEGG, Kyoto Encyclopaedia of Genes and Genomes; SLE, systemic lupus
erythematosus.
Table 2 Gene-set analyses of SLE-associated genes and involved pathways
Set name | Genes tested | No of SNVs (all/common/rare) | FDR (all) | FDR (common) | FDR (rare)
Interferon (ref 21) | 33 | 4204/849/2866 | 0.0018 | 0.66 | 0.65
Interferonopathy (ref 22,23) | 11 | 2034/463/1271 | 0.0028 | 4.1E-07 | 0.24
SLE GWAS (ref 3,4) | 88 | 18790/5326/11465 | 1.5E-12 | 2.0E-15 | 0.18
Complement* | 32 | 4712/1094/3086 | 0.00071 | 2.8E-07 | 0.20
Monogenic SLE (ref 24) | 24 | 3745/930/2371 | 2.9E-07 | 2.9E-11 | 0.020
All: including all MAFs; Common: MAF >0.05; Rare: MAF <0.01.
*The complement part of KEGG pathway hsa04610.
FDR, false discovery rate; GWAS, genome-wide association study; KEGG, Kyoto Encyclopaedia of Genes and Genomes; MAF, minor allele frequency; SLE, systemic lupus erythematosus; SNV, single-nucleotide variant.
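The MAF bins in the footnote do not cover the whole frequency range: variants with a MAF between 0.01 and 0.05 fall in neither the common nor the rare bin, which is why the "all" count exceeds the sum of the other two in every row. A small sketch of the binning, where the label "intermediate" is a hypothetical name for the unlabelled range:

```python
def maf_bin(maf):
    """Assign a minor allele frequency to the bins used in Table 2."""
    if maf > 0.05:
        return "common"
    if maf < 0.01:
        return "rare"
    return "intermediate"  # hypothetical label; Table 2 does not name this range


# Interferon gene set row of Table 2: all, common and rare SNV counts.
all_snvs, common_snvs, rare_snvs = 4204, 849, 2866
print(all_snvs - common_snvs - rare_snvs)  # SNVs in neither bin
print(maf_bin(0.2), maf_bin(0.03), maf_bin(0.005))
```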
genes also contribute to risk for SLE. In addition to interferonopathy genes, we also observed an aggregate genetic association for monogenic SLE and lupus-like disease genes with both a rare and a common variant contribution. This supports the hypothesis of a shared genetic basis and consequently disease mechanisms between monogenic and complex forms of disease, where also common non-coding variants can affect the regulation of Mendelian disease genes resulting in clinically similar traits.39
We have previously demonstrated that an SLE genetic risk score was associated with disease severity in SLE.5 We here generated a pathway-centred SLE PRS and found that there was a large variation in the number of affected pathways among the patients, which underscores the heterogeneity of SLE. We observed higher SLE damage indexes in patients with SLE positive for the B or T cell receptor signalling pathways, thus, pathways in the adaptive immune system seem important for the long-term severity of the disease. This is in accordance with previous findings that SLE disease activity correlates with abnormal B lymphocyte activity and T cell abnormalities, as well as the connection between disease activity and accumulation of organ damage.40 41
We attempted to cluster patients into subsets with shared genetic pathway profiles, which suggested four subgroups of patients with SLE. Beside the SSA/SSB antibody profile, these clusters were not connected to clinical disease manifestations such as nephritis or survival. This observation may indicate that the PRS reflects part of the central autoimmune process, which is not translated into specific organ manifestations. Whether the PRS in individual patients with SLE, or the different clusters, contribute to treatment response is an interesting possibility, but could not be assessed in this study. This is one limitation of our study, together with the fact that our conclusions apply specifically to this set of candidate genes.
WGS or WES studies will be required to fully elucidate the role of rare variants and pathways in SLE. As previously shown by us and others, WGS and WES in selected patients can provide information on ultrarare and de novo SNVs in SLE.6 7 42 However, larger sample sizes than those reported to date will be required to paint a complete picture of the genetic aetiology of SLE. We did not find support in additional Scandinavian cohorts for a role in SLE for the novel loci identified in the Swedish cohorts. Possible explanations include overestimated effect sizes in the discovery cohort, differences in genetic background within Scandinavia, or differences in clinical manifestations or characterisation of patients. Lastly, our study identified a large number of case-only coding variants. Variants uniquely identified in patients could be causal candidates in SLE, but their statistical significance is difficult to evaluate.
In summary, we have suggested a novel strategy to genetically stratify patients with SLE according to involved molecular pathways. T cell pathways displayed the strongest association, which highlights the importance of the adaptive immune system in the disease. The strong connection to the JAK-STAT pathway, including the IFN system, is perhaps not surprising given the promising clinical trials of JAK and type I interferon receptor inhibition as treatments for SLE.38 43 44 However, not all patients in these studies respond to treatment, and dissecting affected molecular pathways in responders and non-responders could increase the understanding of treatment outcome. This approach has not been tested clinically, but the future of precision medicine for SLE lies in identifying robust methods to perform molecular stratification of patients.
Author affiliations
1 Department of Medical Sciences, Rheumatology, Uppsala University, Uppsala, Sweden
2 Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
3 Department of Psychiatry, Washington University, St. Louis, Missouri, USA
4 Division of Rheumatology, Department of Medicine, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden
5 Department of Public Health and Clinical Medicine/Rheumatology, Umeå University, Umeå, Sweden
6 Broegelmann Research Laboratory, Department of Clinical Science, University of Bergen, Bergen, Norway
7 Clinical Immunology unit, Department of Internal Medicine, Stavanger University Hospital, Stavanger, Norway
8 Department of Medical Genetics, University of Oslo, Oslo, Norway
9 Institute for Inflammation Research, Center for Rheumatology and Spine Diseases, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark
10 Department of Clinical Immunology, Aalborg University, Aalborg, Denmark
11 Department of Clinical Immunology, Odense University Hospital, Odense, Denmark
12 Department of Rheumatology, Oslo University Hospital, Oslo, Norway
13 Institute of Clinical Medicine, University of Oslo, Oslo, Norway
14 Department of Rheumatology, Odense University Hospital, Odense, Denmark
15 Department of Rheumatology, Aarhus University Hospital, Aarhus, Denmark
16 Institute of Clinical Medicine, Aarhus University, Aarhus, Denmark
17 Center for Rheumatology and Spine Diseases, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark
18 Institute of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
19 Department of Medical Sciences, Molecular Medicine and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
20 Department of Clinical Sciences Lund, Rheumatology, Lund University, Skane University Hospital, Lund, Sweden
21 Department of Biomedical and Clinical Sciences, Division of Inflammation and Infection, Linköping University, Linköping, Sweden
22 Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
Acknowledgements DNA sequencing and genotyping was performed at the SNP&SEQ Technology Platform in Uppsala. The facility is part of the National Genomics Infrastructure (NGI) Sweden and Science for Life Laboratory. Computations were performed on resources provided by the Swedish National Infrastructure for Computing (SNIC) through Uppsala Multidisciplinary Centre for Advanced Computational Science (UPPMAX) under projects SNIC SENS 2017142 and 2017107. The SweGen genotype data were generated by Science for Life Laboratory. The authors would like to thank the Genome Aggregation Database (gnomAD) and the groups that provided exome and genome variant data to this resource. A full list of contributing groups can be found at the gnomAD website. The authors wish to thank the Uppsala Bioresource Karolina Tandre and Västerbotten biobank for providing DNA samples on control individuals, and Cane Amcoff for excellent technical assistance.
Collaborators The DISSECT consortium: Johanna K. Sandling (Department of Medical Sciences, Rheumatology, Uppsala University, Uppsala, Sweden), Pascal Pucholt (Department of Medical Sciences, Rheumatology, Uppsala University, Sweden), Lina Hultin Rosenberg (Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden), Fabiana H.G. Farias (Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden, and Department of Psychiatry, Washington University, St. Louis, MO, USA), Sergey V. Kozyrev (Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden), Maija-Leena Eloranta (Department of Medical Sciences, Rheumatology, Uppsala University, Uppsala, Sweden), Andrei Alexsson (Department of Medical Sciences, Rheumatology, Uppsala University, Uppsala, Sweden), Matteo Bianchi (Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden), Leonid Padyukov (Division of Rheumatology, Department of Medicine, Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden), Christine Bengtsson (Department of Public Health and Clinical Medicine/Rheumatology, Umeå University, Umeå, Sweden), Roland Jonsson (Broegelmann Research Laboratory, Department of Clinical Science, University of Bergen, Bergen, Norway), Roald Omdal (Clinical Immunology unit, Department of Internal Medicine, Stavanger University Hospital, Stavanger, Norway and Broegelmann Research Laboratory, Department of Clinical Science, University of Bergen, Bergen, Norway), Øyvind Molberg (Department of Rheumatology, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Oslo, Norway), Ann-Christine Syvänen (Department of Medical Sciences, Molecular Medicine and Science for Life Laboratory, Uppsala University, Uppsala, Sweden), Andreas Jönsen (Lund University, Skane University Hospital, Department of Clinical Sciences Lund, Rheumatology, Lund, Sweden), Iva Gunnarsson (Division of Rheumatology, Department of Medicine Solna, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden), Elisabet Svenungsson (Division of Rheumatology, Department of Medicine Solna, Karolinska Institutet, Karolinska University Hospital, Stockholm, Sweden), Solbritt Rantapää-Dahlqvist (Department of Public Health and Clinical Medicine/Rheumatology, Umeå University, Umeå, Sweden), Anders A. Bengtsson (Lund University, Skane University Hospital,
Department of Clinical Sciences Lund, Rheumatology, Lund, Sweden), Christopher
Sjöwall (Department of Biomedical and Clinical Sciences, Division of Inflammation
and Infection, Linköping University, Linköping, Sweden), Dag Leonard (Department
of Medical Sciences, Rheumatology, Uppsala University, Uppsala, Sweden), Kerstin
Lindblad-Toh (Science for Life Laboratory, Department of Medical Biochemistry
and Microbiology, Uppsala University, Uppsala, Sweden and Broad Institute of
MIT and Harvard, Cambridge, MA, USA), Lars Rönnblom (Department of Medical
Sciences, Rheumatology, Uppsala University, Uppsala, Sweden), Jonas Carlsson
Almlöf (Department of Medical Sciences, Molecular Medicine and Science for Life
Laboratory, Uppsala University, Uppsala, Sweden), Johanna Dahlqvist (Science
for Life Laboratory, Department of Medical Sciences and Department of Medical
Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden), Daniel
Eriksson (Department of Medicine (Solna), Karolinska Institutet, and Department of
Endocrinology, Metabolism and Diabetes Karolinska University Hospital, Stockholm,
Sweden), Niklas Hagberg (Department of Medical Sciences, Rheumatology, Uppsala
University, Uppsala, Sweden), Ingrid E. Lundberg (Division of Rheumatology,
Department of Medicine and Center for Molecular Medicine, Karolinska Institutet,
Stockholm, Sweden), Argyri Mathioudaki (Science for Life Laboratory, Department
of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden),
Jennifer Meadows (Science for Life Laboratory, Department of Medical Biochemistry
and Microbiology, Uppsala University, Uppsala, Sweden), Jessika Nordin (Science
for Life Laboratory, Department of Medical Biochemistry and Microbiology,
Uppsala University, Uppsala, Sweden), Gunnel Nordmark (Department of Medical
Sciences, Rheumatology, Uppsala University, Uppsala, Sweden), Marie Wahren-Herlenius
(Department of Medicine, Division of Rheumatology, Karolinska Institutet,
Karolinska University Hospital, Stockholm, Sweden and Broegelmann Research
Laboratory, Department of Clinical Science, University of Bergen, Norway), Sule
Yavuz (Department of Medical Sciences, Rheumatology, Uppsala University, Uppsala,
Sweden). The ImmunoArray development consortium: Kerstin Lindblad-Toh (Science
for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala
University, Uppsala, Sweden and Broad Institute of MIT and Harvard, Cambridge,
MA, USA), Gerli Rosengren Pielberg (Science for Life Laboratory, Department of
Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden), Anna
Lobell (Office for Medicine and Pharmacy, Uppsala University, Uppsala, Sweden),
Åsa Karlsson (Science for Life Laboratory, Department of Medical Biochemistry
and Microbiology, Uppsala University, Uppsala, Sweden), Eva Murén (Science for
Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala
University, Uppsala, Sweden), Göran Andersson (Department of Animal Breeding and
Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden), Kerstin M.
Ahlgren (Department of Surgical Sciences, Uppsala University, Uppsala, Sweden),
Lars Rönnblom (Department of Medical Sciences, Rheumatology, Uppsala University,
Uppsala, Sweden), Maija-Leena Eloranta (Department of Medical Sciences,
Rheumatology, Uppsala University, Uppsala, Sweden), Nils Landegren (Department
of Medicine (Solna), Center for Molecular Medicine, Karolinska Institutet, Stockholm,
Sweden and Science for Life Laboratory, Department of Medical Sciences, Uppsala
University, Uppsala, Sweden), Olle Kämpe (Department of Medicine (Solna), Center
for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden, Department of
Endocrinology, Metabolism and Diabetes Karolinska University Hospital, Stockholm,
Sweden, Science for Life Laboratory, Department of Medical Sciences, Uppsala
University, Uppsala, Sweden and KG Jebsen Center for autoimmune diseases,
University of Bergen, Norway), Peter Söderkvist (Division of Cell Biology, Department
of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden).
Contributors KLT and LR conceived and designed the experiments. CB, KL, AV,
AMT, SJ, AJ, IG, ES, SR-D, AAB, CS and DL characterised the patient samples. M-LE,
LP, LM, RS, MAJ, RJ, RO, BAL and STL provided samples and data. The ImmunoArray
development consortium members designed the targeted sequencing panel. FHGF,
SVK, ÅK, and EM performed the experiments. A-CS managed the sequencing
platform. JKS, LHR, PP, AA and MB analysed the data and the DISSECT consortium
members provided intellectual input and/or developed analysis pipelines. JKS and
LR wrote the manuscript. All authors read, provided critical review and accepted the
final version of the manuscript.
Funding This study was supported by an AstraZeneca-Science for Life Laboratory
Research Collaboration grant (DISSECT). This study was also supported by grants
from the Swedish Research Council for Medicine and Health (Dnr 2018–02399,
2018–02535 and a Distinguished professor award to KL-T), the Swedish
Rheumatism Association, King Gustav V’s 80-year Foundation, the Swedish-Heart-Lung
foundation, a Wallenberg Scholar Award (to KL-T), and the Swedish Society of
Medicine and the Ingegerd Johansson donation. The SNP&SEQ Platform is supported
by Science for Life Laboratory, the Swedish Research Council (VR-RFI), Uppsala
University and the Knut and Alice Wallenberg Foundation.
Disclaimer The funders had no role in the design of the study or collection,
analysis, or interpretation of data or in writing the manuscript.
Competing interests An AstraZeneca research collaboration grant to LR
supported the study. LHR is now an employee of Olink Proteomics.
Patient consent for publication Not required.
Ethics approval The study was approved by the regional ethics board in Uppsala
(Dnr 2015/450 and 2016/155).
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement The datasets generated and/or analysed during
the current study are not publicly available due to them containing information that
could compromise research participant privacy and consent, but are available from
the corresponding authors LR (ORCID 0000-0001-9403-6503) and JKS (ORCID
0000-0003-1382-2321) on reasonable request and on a collaborative basis.
Supplemental material This content has been supplied by the author(s). It
has not been vetted by BMJ Publishing Group Limited (BMJ) and may not
have been peer-reviewed. Any opinions or recommendations discussed are
solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all
liability and responsibility arising from any reliance placed on the content.
Where the content includes any translated material, BMJ does not warrant the
accuracy and reliability of the translations (including but not limited to local
regulations, clinical guidelines, terminology, drug names and drug dosages), and
is not responsible for any error and/or omissions arising from translation and
adaptation or otherwise.
Open access This is an open access article distributed in accordance with the
Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits
others to copy, redistribute, remix, transform and build upon this work for any
purpose, provided the original work is properly cited, a link to the licence is given,
and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.
ORCID iDs
Johanna K Sandling http://orcid.org/0000-0003-1382-2321
Pascal Pucholt http://orcid.org/0000-0003-3342-1373
Elisabet Svenungsson http://orcid.org/0000-0003-3396-3244
Christopher Sjöwall http://orcid.org/0000-0003-0900-2048
Lars Rönnblom http://orcid.org/0000-0001-9403-6503
REFERENCES
1 Bengtsson AA, Rönnblom L. Systemic lupus erythematosus: still a challenge for physicians. J Intern Med 2017;281:52–64.
2 Crow MK, Ronnblom L. Type I interferons in host defence and inflammatory diseases. Lupus Sci Med 2019;6:e000336.
3 Chen L, Morris DL, Vyse TJ. Genetic advances in systemic lupus erythematosus: an update. Curr Opin Rheumatol 2017;29:423–33.
4 Langefeld CD, Ainsworth HC, Cunninghame Graham DS, et al. Transancestral mapping and genetic load in systemic lupus erythematosus. Nat Commun 2017;8:16021.
5 Reid S, Alexsson A, Frodlund M, et al. High genetic risk score is associated with early disease onset, damage accrual and decreased survival in systemic lupus erythematosus. Ann Rheum Dis 2020;79:363–9.
6 Pullabhatla V, Roberts AL, Lewis MJ, et al. De novo mutations implicate novel genes in systemic lupus erythematosus. Hum Mol Genet 2018;27:421–9.
7 Almlöf JC, Nystedt S, Leonard D, et al. Whole-genome sequencing identifies complex contributions to genetic risk by variants in genes causing monogenic systemic lupus erythematosus. Hum Genet 2019;138:141–50.
8 Jiang SH, Athanasopoulos V, Ellyard JI, et al. Functional rare and low frequency variants in BLK and BANK1 contribute to human lupus. Nat Commun 2019;10:2201.
9 Wang Y, Chen S, Chen J, et al. Germline genetic patterns underlying familial rheumatoid arthritis, systemic lupus erythematosus and primary Sjögren’s syndrome highlight T cell-initiated autoimmunity. Ann Rheum Dis 2020;79:268–75.
10 Raj P, Rai E, Song R, et al. Regulatory polymorphisms modulate the expression of HLA class II molecules and promote autoimmunity. Elife 2016;5. doi:10.7554/eLife.12089. [Epub ahead of print: 15 Feb 2016].
11 Richardson TG, Timpson NJ, Campbell C, et al. A pathway-centric approach to rare variant association analysis. Eur J Hum Genet 2016;25:123–9.
12 Banchereau R, Hong S, Cantarel B, et al. Personalized Immunomonitoring uncovers molecular networks that stratify lupus patients. Cell 2016;165:551–65.
13 El-Sherbiny YM, Psarras A, Md Yusof MY, et al. A novel two-score system for interferon status segregates autoimmune diseases and correlates with clinical features. Sci Rep 2018;8:5793.
14 Guthridge JM, Lu R, Tran LT-H, et al. Adults with systemic lupus exhibit distinct molecular phenotypes in a cross-sectional study. EClinicalMedicine 2020;20:100291.
15 Lindblad-Toh K, Garber M, Zuk O, et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature 2011;478:476–82.
16 Tan EM, Cohen AS, Fries JF, et al. The 1982 revised criteria for the classification of systemic lupus erythematosus. Arthritis Rheum 1982;25:1271–7.
17 Hochberg MC. Updating the American College of rheumatology revised criteria for the classification of systemic lupus erythematosus. Arthritis Rheum 1997;40:1725.
18 Eriksson D, Bianchi M, Landegren N, et al. Extended exome sequencing identifies BACH2 as a novel major risk locus for Addison’s disease. J Intern Med 2016;280:595–608.
19 Li JZ, Absher DM, Tang H, et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science 2008;319:1100–4.
20 Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 2000;28:27–30.
21 Rönnblom L, Leonard D. Interferon pathway in SLE: one key to unlocking the mystery of the disease. Lupus Sci Med 2019;6:e000270.
22 Rodero MP, Crow YJ. Type I interferon-mediated monogenic autoinflammation: the type I interferonopathies, a conceptual overview. J Exp Med 2016;213:2527–38.
23 Davidson S, Steiner A, Harapas CR, et al. An update on autoinflammatory diseases: Interferonopathies. Curr Rheumatol Rep 2018;20:38.
24 Tsokos GC, Lo MS, Costa Reis P, et al. New insights into the immunopathogenesis of systemic lupus erythematosus. Nat Rev Rheumatol 2016;12:716–30.
25 Lee S, Emond MJ, Bamshad MJ, et al. Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. Am J Hum Genet 2012;91:224–37.
26 Mossotto E, Ashton JJ, O’Gorman L, et al. GenePy - a score for estimating gene pathogenicity in individuals using next-generation sequencing data. BMC Bioinformatics 2019;20:254.
27 Ameur A, Dahlberg J, Olason P, et al. SweGen: a whole-genome data resource of genetic variability in a cross-section of the Swedish population. Eur J Hum Genet 2017;25:1253–60.
28 Karczewski KJ, Francioli LC, Tiao G, et al. Variation across 141456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes. bioRxiv 2019:531210.
29 Frodlund M, Reid S, Wetterö J, et al. The majority of Swedish systemic lupus erythematosus patients are still affected by irreversible organ impairment: factors related to damage accrual in two regional cohorts. Lupus 2019;28:1261–72.
30 Hunninghake GM, Hatabu H, Okajima Y, et al. MUC5B promoter polymorphism and interstitial lung abnormalities. N Engl J Med 2013;368:2192–200.
31 Namba N, Kawasaki A, Sada K-E, et al. Association of MUC5B promoter polymorphism with interstitial lung disease in myeloperoxidase-antineutrophil cytoplasmic antibody-associated vasculitis. Ann Rheum Dis 2019;78:1144–6.
32 Crispin JC, Hedrich CM, Suárez-Fueyo A, et al. SLE-Associated defects promote altered T cell function. Crit Rev Immunol 2017;37:39–58.
33 Krebs CF, Schmidt T, Riedel J-H, et al. T helper type 17 cells in immune-mediated glomerular disease. Nat Rev Nephrol 2017;13:647–59.
34 Koga T, Ichinose K, Kawakami A, et al. The role of IL-17 in systemic lupus erythematosus and its potential as a therapeutic target. Expert Rev Clin Immunol 2019;15:629–37.
35 He J, Zhang R, Shao M, et al. Efficacy and safety of low-dose IL-2 in the treatment of systemic lupus erythematosus: a randomised, double-blind, placebo-controlled trial. Ann Rheum Dis 2020;79:141–9.
36 Kamitaki N, Sekar A, Handsaker RE, et al. Complement genes contribute sex-biased vulnerability in diverse disorders. Nature 2020;582:577–81.
37 Villarino AV, Kanno Y, O’Shea JJ. Mechanisms and consequences of Jak-STAT signaling in the immune system. Nat Immunol 2017;18:374–84.
38 Alunno A, Padjen I, Fanouriakis A, et al. Pathogenic and therapeutic relevance of JAK/STAT signaling in systemic lupus erythematosus: integration of distinct inflammatory pathways and the prospect of their inhibition with an oral agent. Cells 2019;8. doi:10.3390/cells8080898. [Epub ahead of print: 15 Aug 2019].
39 Freund MK, Burch KS, Shi H, et al. Phenotype-Specific enrichment of Mendelian disorder genes near GWAS regions across 62 complex traits. Am J Hum Genet 2018;103:535–52.
40 Li W, Deng C, Yang H, et al. The regulatory T cell in active systemic lupus erythematosus patients: a systemic review and meta-analysis. Front Immunol 2019;10:159.
41 Nossent J, Cikes N, Kiss E, et al. Current causes of death in systemic lupus erythematosus in Europe, 2000–2004: relation to disease activity and damage accrual. Lupus 2007;16:309–17.
42 Almlöf JC, Nystedt S, Mechtidou A, et al. Contributions of de novo variants to systemic lupus erythematosus. Eur J Hum Genet 2020. doi:10.1038/s41431-020-0698-5. [Epub ahead of print: 28 Jul 2020].
43 Wallace DJ, Furie RA, Tanaka Y, et al. Baricitinib for systemic lupus erythematosus: a double-blind, randomised, placebo-controlled, phase 2 trial. Lancet 2018;392:222–31.
44 Morand EF, Furie R, Tanaka Y, et al. Trial of Anifrolumab in active systemic lupus erythematosus. N Engl J Med 2020;382:211–21.
Contents:
The Swedish SLE cohorts included 1,167 SLE patients recruited at the Rheumatology clinics at the
Uppsala, Karolinska (Solna), Umeå, Lund and Linköping University Hospitals. Blood samples and
clinical information originated from time of diagnosis, study inclusion or follow-up visits, and clinical
information was compiled at the end of follow-up. An extended follow-up was performed specifically
for death as outcome. The controls were healthy blood donors or population controls from Uppsala
Bioresource and Västerbotten biobank in Sweden (n=1,101). Genomic DNA extracted from blood
samples was available for genetic analysis. When multiple DNA samples were available for the same
individual, the sample for sequencing was selected based on DNA amount and quality. The quality-
controlled dataset used in subsequent analyses contained 958 SLE patients and 1,026 control
individuals. All 958 SLE patients fulfilled at least four of the classification criteria for SLE as defined by
the American College of Rheumatology (ACR).(1, 2) Renal biopsies were classified according to the
WHO or the ISN/RPS 2003 classification systems.(3) Clinical characteristics of the patients are
available in online supplementary Tables S1a and S1b. All subjects provided informed consent to
participate in the study, and the study was approved by the regional ethics board in Uppsala (Dnr
Targeted DNA sequencing was performed in the Swedish SLE case-control cohorts. The design of the
sequence capture panel and the library preparation has been described elsewhere.(4) In brief, a
custom SeqCap EZ Choice XL library (Roche NimbleGen, Basel, Switzerland) was designed to target
1,853 genes, selected based on their known or suspected roles in immunological or autoimmune
diseases in humans or model organisms.(4) The genomic intervals for all alternative transcripts were
retrieved from NCBI36/hg18. Besides the coding exons, 5′ and 3′ UTRs, potential promoter regions
(±2 kb from transcription start sites) and splice sites (±20 bp of intronic sequences adjacent to exons)
were also included, as well as regions of mammalian conservation within 100kb up- and downstream
of the genes.(5) In total, the designed probes covered 32.3 Mbp. Sequencing libraries were prepared
by ultrasonication of up to 1 μg of high molecular weight DNA into around 400 bp fragments (Covaris
E220, Woburn, MA, USA), that were then barcoded (NEXTflex-96 DNA barcode adapters, Bioo
Scientific, Austin, TX, USA). Samples were pooled in batches of eight, hybridized (Roche NimbleGen)
and sequenced with 100-bp paired-end reads using Illumina HiSeq 2500 version 3 or 4 chemistry
(Illumina Inc, San Diego, CA, USA). An average sequencing depth of 35× per sample was achieved.
A pipeline based on GATK “best practices” was used for variant discovery.(6) Raw reads were
mapped to the hg19 human reference genome using the Burrows-Wheeler aligner 0.7.12 (7) and
duplicate reads marked by Picard 1.92. GATK 3.3.0 was applied for realignment around indels, base
quality score recalibration, SNP and indel discovery and joint genotyping. Prior to genotyping,
alignment quality was evaluated with Samtools flagstat (7) and the Picard tool CalculateHsMetrics, and
samples with mean target coverage below 10× were excluded. From this point on, only bi-allelic
single nucleotide variants (SNVs) were considered. SNV quality scores were recalibrated using GATK
3.3.0 VariantRecalibrator and filtered at tranche level 99.0. Using VCFtools,(8) genotype calls with
depth less than 8 reads and genotype Phred quality score less than 20 were excluded.(9)
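The genotype-level filter described above can be sketched as follows. This is a minimal illustration, not the VCFtools invocation used in the study; it assumes that a call is set to missing when it fails either threshold, and the function name and the example calls are hypothetical.

```python
from typing import Optional

MIN_DEPTH = 8  # minimum read depth per genotype call (threshold from the text)
MIN_GQ = 20    # minimum genotype Phred quality score (threshold from the text)

def filter_genotype(genotype: str, depth: int, gq: int) -> Optional[str]:
    """Return the genotype if it passes both thresholds, otherwise None (missing).

    Assumption: failing either the depth or the quality threshold excludes the call.
    """
    if depth < MIN_DEPTH or gq < MIN_GQ:
        return None
    return genotype

# Hypothetical genotype calls as (genotype, depth, genotype quality).
calls = [("0/1", 12, 45), ("1/1", 6, 60), ("0/0", 30, 15), ("0/1", 8, 20)]
filtered = [filter_genotype(gt, dp, gq) for gt, dp, gq in calls]
```

Calls sitting exactly on a threshold (depth 8, quality 20) are retained, since the text excludes only values below the thresholds.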
Study population genetic structure was analysed by the LASER software using default parameters and
the Human Genome Diversity Project (HGDP) as reference population (online supplementary Figure
S7a).(10, 11) Population outliers were defined using the following criteria: 1) study subjects falling
more than five standard deviations outside of the mean of the European sub-population of the HGDP
reference set were excluded, 2) mean and standard deviation were calculated for the remaining
study subjects and any additional subjects falling more than five standard deviations outside of the
study mean were excluded, 3) step 2 was repeated until no additional subjects were excluded.
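The three-step outlier procedure above can be sketched in plain Python. The sketch assumes that each subject is represented by a tuple of ancestry coordinates (as produced by a tool such as LASER) and that a subject is an outlier when any single coordinate lies more than five standard deviations from the mean; the source does not state whether the distance was evaluated per coordinate or jointly, and all function names are hypothetical.

```python
from statistics import mean, stdev

def column_stats(points):
    """Per-coordinate mean and sample standard deviation."""
    columns = list(zip(*points))
    return [mean(c) for c in columns], [stdev(c) for c in columns]

def inside(point, centre, spread, n_sd):
    """True if every coordinate lies within n_sd standard deviations of the centre."""
    return all(abs(x - c) <= n_sd * s for x, c, s in zip(point, centre, spread))

def remove_population_outliers(study, reference, n_sd=5.0):
    # Step 1: compare each study subject with the reference sub-population.
    ref_mean, ref_sd = column_stats(reference)
    kept = [p for p in study if inside(p, ref_mean, ref_sd, n_sd)]
    # Steps 2 and 3: recompute the study mean and SD, drop outliers, repeat until stable.
    while len(kept) >= 2:
        study_mean, study_sd = column_stats(kept)
        remaining = [p for p in kept if inside(p, study_mean, study_sd, n_sd)]
        if len(remaining) == len(kept):
            break
        kept = remaining
    return kept

# Hypothetical two-dimensional coordinates for illustration only.
reference = [(0.0, 0.0), (1.0, 1.0), (-1.0, -1.0), (0.5, -0.5), (-0.5, 0.5)]
study = [(0.1, 0.2), (0.3, -0.1), (-0.2, 0.4), (10.0, 0.0)]
kept = remove_population_outliers(study, reference)
```

In this toy input the subject at (10.0, 0.0) is removed in step 1 and the iteration in steps 2 and 3 stops immediately, because no remaining subject is extreme relative to the study mean.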
Relatedness among study subjects was determined using the KING software, applying default
thresholds for duplicate and first degree relationships.(12) Extreme sample outliers were identified
based on several quality control (QC) measures, as suggested by Do et al.(13) These QC parameters
included rate of missing data, heterozygosity ratio, transition-transversion ratio and singleton counts.
Further, samples were excluded if they exhibited discordance between reported sex and that
inferred from sequence data or if they exhibited discordance between genotypes inferred from
sequence data and a genotype dataset from a previous study.(14) Lastly, it was required that samples
A number of filters were applied to exclude low quality variants. Heterozygous calls were included
only if their allelic balance across all samples was between 0.2 and 0.8. Positions deviating from
monomorphic sites. Finally, a minimum of 90% variant call rate was required. The remaining variant
positions were investigated for differential missingness between cases and controls using PLINK (15),
and significantly different variants were excluded (P <0.05 Bonferroni corrected). An overview of the
QC steps can be found in online supplementary Figure S1. The quality-controlled dataset used in
subsequent analyses contained 958 Swedish SLE patients, 1,026 control individuals, 287,354 SNVs
and covered 1,832 of the targeted gene regions. The average individual call rate was 98.2% and the
average variant call rate 98.2%. Genotypes from targeted sequencing were validated using an
independent genotype array dataset (Illumina ImmunoChip) on an overlapping set of 1,693 Swedish
individuals and 8,483 SNVs after QC.(14) SNV genotype concordance was on average 99.8% (online
Variant annotation was performed using SnpEff v4.2.(16) Non-synonymous variants were defined as
SNVs annotated as missense or nonsense variants. Non-coding SNVs were defined as SNVs annotated
sites, but not as missense or nonsense SNVs. The extended HLA region spanning a region of 7.9 Mbp
Evolutionarily constrained positions were defined as having a Genomic Evolutionary Rate Profiling
(GERP) rejected substitutions (RS) score >2.(18) In analyses of rare SNVs, variants with MAFs <0.01
were included, and for common SNVs variants with MAFs ≥0.05 were included.
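The frequency and constraint definitions above translate into simple predicates. The sketch below uses the thresholds from the text; the label "intermediate" for variants with MAF between 0.01 and 0.05 is an assumption made for illustration, as the text does not name that range.

```python
RARE_MAF = 0.01     # rare SNVs: MAF < 0.01 (from the text)
COMMON_MAF = 0.05   # common SNVs: MAF >= 0.05 (from the text)
GERP_RS_MIN = 2.0   # constrained positions: GERP RS score > 2 (from the text)

def frequency_class(maf: float) -> str:
    """Classify an SNV by minor allele frequency."""
    if maf < RARE_MAF:
        return "rare"
    if maf >= COMMON_MAF:
        return "common"
    return "intermediate"  # hypothetical label; not used in the source

def is_constrained(gerp_rs: float) -> bool:
    """True if the position counts as evolutionarily constrained."""
    return gerp_rs > GERP_RS_MIN

classes = [frequency_class(m) for m in (0.002, 0.01, 0.03, 0.05, 0.25)]
```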
Principal components for population stratification were generated in EIGENSOFT (19) after excluding
long-range linkage disequilibrium (LD) regions (20), SNVs with MAF<0.05 and SNVs in LD r²>0.2
(online supplementary Figure S7b-d). Single variant association analyses for variants with MAF≥0.01
were performed in PLINK using a logistic regression model, in which the three first principal
components were added as regression covariates. Two levels of significance were applied, an
experiment-wide P-value threshold of 1.8×10⁻⁶ (P<0.05 Bonferroni corrected, limiting LD to r²<0.2,
which resulted in 27,195 variants used for multiple testing correction) and a suggestive threshold of
P<1×10⁻⁴. LD was measured by r² calculated in PLINK. Manhattan and QQ plots were generated in R
using the package qqman.(21, 22) Regional plots of associations were generated using R. Conditional
analysis, using the top SNP from the previous model as covariate, was performed until there were no
residual association signals below the suggestive threshold (P<1×10⁻⁴). Differences in variant load
between SLE patients and controls were assessed by the Mann-Whitney U test. SLE case-only
variants were identified by removing all SNVs present in our control dataset of 1,026 individuals, in
the SweGen project 1,000 individuals version September 4th, 2017 generated by Science for Life
v2.1.1.(24)
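The experiment-wide threshold quoted earlier in this paragraph follows directly from the Bonferroni correction over the 27,195 LD-pruned variants. The short sketch below reproduces the arithmetic and applies the two significance levels; the function name and category labels are hypothetical.

```python
ALPHA = 0.05
N_INDEPENDENT_VARIANTS = 27195   # variants remaining after limiting LD, from the text
SUGGESTIVE_THRESHOLD = 1e-4      # suggestive threshold, from the text

# Bonferroni correction: divide the nominal alpha by the number of tests.
experiment_wide_threshold = ALPHA / N_INDEPENDENT_VARIANTS

def significance_level(p_value: float) -> str:
    """Assign a P-value to one of the two significance levels used in the text."""
    if p_value < experiment_wide_threshold:
        return "experiment-wide"
    if p_value < SUGGESTIVE_THRESHOLD:
        return "suggestive"
    return "not significant"

rounded = f"{experiment_wide_threshold:.1e}"
```

Rounded to two significant figures, the computed value matches the reported threshold of 1.8×10⁻⁶.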
Variant-sets were generated for aggregate association testing using three different strategies. 1)
Gene variant-sets for gene-based association testing: The RefSeq annotation of the hg19 human
genome assembly was used to assign genomic positions to each target gene.(25) For each gene, the
aggregate space was defined as the interval from the minimal transcript start site to the maximal
transcript end site across all of its transcripts. The spaces were then extended by 100 kb on each end to
include regulatory regions, except in analyses focusing on rare coding variants only. Variants falling
within the same aggregate gene space were assembled into a gene variant set. 2) Pathway variant-
sets for pathway-based association testing: Pathway-wide aggregate spaces were generated by
utilising information from the Kyoto Encyclopedia of Genes and Genomes (KEGG) on membership of
genes in pathways.(26) Pathway spaces were defined as the union of gene spaces of genes annotated
to be part of each pathway. Association testing was performed only for pathways that were
represented by at least five genes in the sequencing data, and where at least 50% of the genes in the
pathway were targeted. Additionally, the Human Diseases class of pathways was excluded. This
resulted in 35 KEGG pathway variant-sets for association testing. 3) Literature review gene sets for
gene set-based association testing: the type I interferon pathway (27), interferonopathy genes (28,
29), gene variant sets for SLE GWAS genes (14, 30), the complement subset of the Complement and
coagulation cascades pathway (KEGG hsa04610) and genes causing monogenic SLE or lupus-like
Aggregate association testing was performed separately for each variant-set using SKAT-O with the
employed a weighted linear kernel using the default weights as calculated internally by the beta
distribution with parameters a=1 and b=25, giving higher weight to rare variants. To ensure
reproducible outcomes we set the random number seed value in R to 1,337 before running SKAT-O.
For P-value calculation we used the “hybrid” approach that selects the optimal method based on the
total minor allele count (MAC), the number of individuals with minor alleles (m), and the degree of
case-control imbalance. This corrects for conservative type I errors when using a small sample size.
The false discovery rate (FDR) was controlled separately for the pathway, gene-set and gene-based
SKAT-O aggregate association analyses.
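To make the effect of the beta-distribution weights concrete, the sketch below evaluates the Beta(1, 25) density at a few minor allele frequencies. It assumes that each variant's weight is the Beta(a, b) density evaluated at its MAF, which is how the SKAT default is commonly described; it is an illustration in Python, not a call into the SKAT R package.

```python
import math

def beta_density(x: float, a: float, b: float) -> float:
    """Probability density of the Beta(a, b) distribution at x, for 0 <= x <= 1."""
    normaliser = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return normaliser * x ** (a - 1) * (1 - x) ** (b - 1)

def variant_weight(maf: float, a: float = 1.0, b: float = 25.0) -> float:
    """Weight of a variant with the given MAF (assumed: Beta density at the MAF)."""
    return beta_density(maf, a, b)

# With a=1 and b=25 the density reduces to 25 * (1 - MAF) ** 24,
# so the weight falls steeply as the variant becomes more common.
weights = {maf: variant_weight(maf) for maf in (0.001, 0.01, 0.05, 0.2)}
```

Under these parameters a variant with MAF 0.001 receives a weight close to 25, whereas a variant with MAF 0.2 receives a weight close to 0.1, which is the sense in which rare variants are up-weighted.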
Gene-based aggregate association testing including variant deleteriousness metrics was performed
with GenePy.(33) Region annotation and the gene space was dictated by Annovar.(34) GenePy was
run with default parameters, using as reference allele frequencies those in the non-Finnish European
gnomAD v2.1.1 125,748 exomes dataset. The gene annotation was based on RefSeq (RefSeq gene
body + 1Kb upstream and 1Kb downstream). The gene score P-value was obtained by comparing the
distribution of gene scores in cases vs controls using a Mann-Whitney U test. Genes were considered
statistically significant if their P-value was below the permutation P-value. The permutation threshold
was the P-value corresponding to the 5% right tail of the distribution of the lowest P-values obtained
by shuffling the phenotypes (disease status) 1,000 times and running a Mann-Whitney U test. Results
using the REVEL (prediction of the pathogenicity of rare missense variants) and CADD v1.3 (63
annotations, including conservation metrics, functional genomic data, transcript information and
Cumulative pathway SLE polygenic risk scores (pathway PRSs) were assigned to each individual based
on SNVs associated with SLE at nominal significance (P<0.05) in the SLE case-control single variant
association study. The PLINK function “clump” was used to remove SNVs in high LD (r²>0.2) within 250
kbp, retaining only the variants with the strongest phenotype association. In total, 1,296
SLE-associated SNVs were retained. Then, for each SNV, the natural logarithm of the OR for SLE
susceptibility was multiplied by the number of minor alleles carried by each individual. For each
individual, the sum of these products over all SNVs in the genes of a given KEGG pathway was defined
as the individual pathway PRS, giving one score for each of the 35 KEGG pathways.
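The score definition above amounts to a weighted allele count per pathway. A minimal sketch, assuming genotypes are coded as minor allele counts (0, 1 or 2) and that SNV-to-pathway membership is already known; the SNV identifiers, odds ratios and function name are hypothetical.

```python
import math

def pathway_prs(minor_allele_counts, odds_ratios, pathway_snvs):
    """Sum of ln(OR) * minor allele count over the SNVs assigned to one pathway.

    minor_allele_counts: dict mapping SNV id to 0, 1 or 2 for one individual.
    odds_ratios: dict mapping SNV id to the OR from the single variant analysis.
    pathway_snvs: iterable of SNV ids located in genes of the pathway.
    """
    return sum(
        math.log(odds_ratios[snv]) * minor_allele_counts.get(snv, 0)
        for snv in pathway_snvs
        if snv in odds_ratios
    )

# Hypothetical data for one individual and one pathway.
odds_ratios = {"snv_a": 2.0, "snv_b": 0.5, "snv_c": 1.3}
individual = {"snv_a": 2, "snv_b": 1, "snv_c": 0}
score = pathway_prs(individual, odds_ratios, ["snv_a", "snv_b", "snv_c"])
```

Risk-increasing minor alleles (OR above 1) add to the score and protective minor alleles (OR below 1) subtract from it, so in this example the score is 2·ln 2 + ln 0.5 = ln 2.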
Hierarchical cluster analysis with complete linkage on the Euclidean distance between scaled
individual-level pathway PRSs was used to identify clusters of SLE patients. The NbClust R package was
used to determine the optimal number of clusters by majority voting, and four clusters were
determined to be optimal.(37) A heatmap of scaled values of pathway-specific PRSs was plotted using
the R package ComplexHeatmap.(38) A chi-squared test was used to determine whether the clusters differed in
composition for case/control status or dichotomous subphenotypes in SLE patients, while a Mann-
Whitney U test was used to determine whether quantitative traits differed between the SLE patients in both
clusters, or whether the pathway PRS values differed between SLE cases and healthy controls. The Kruskal–
Wallis one-way analysis of variance was used to determine whether pathway PRS values differed between
clusters.
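The clustering step can be illustrated with a small pure-Python implementation of agglomerative clustering with complete linkage on Euclidean distances between column-scaled scores. This is a naive sketch for illustration only: the study used R, the number of clusters is passed in rather than chosen by NbClust-style majority voting, and the input values are hypothetical.

```python
import math
from statistics import mean, stdev

def scale_columns(rows):
    """Centre each column to mean 0 and scale it to sample standard deviation 1."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def complete_linkage(rows, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    The distance between two clusters is the largest Euclidean distance
    between any pair of members (complete linkage).
    """
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(math.dist(rows[a], rows[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters

# Hypothetical pathway PRS values: six individuals, two pathways.
prs = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9], [10.0, 0.0], [10.1, 0.3]]
clusters = complete_linkage(scale_columns(prs), n_clusters=3)
```

Scaling before clustering keeps pathways with a large PRS variance from dominating the Euclidean distance.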
The replication study included Norwegian and Danish SLE cohorts recruited at the Oslo University
Hospital, Rigshospitalet in Copenhagen, Odense University Hospital and Aarhus University Hospital.
Only SLE patients fulfilling the ACR SLE classification criteria and of self-reported European ancestry
were included in association analyses. Norwegian and Danish control individuals from the University
Hospitals in Stavanger, Bergen, Odense, Aalborg and Rigshospitalet in Copenhagen were also
included. All subjects provided informed consent to participate in the study, and the study was
approved by the regional ethics boards. 20 SNVs representing association signals at three loci
(CAPN13, IFNK/MOB3B, HAL) or their proxies (LD r2≥0.99) were either genotyped or
Genotyping was performed using the iPLEX chemistry on a MassARRAY system (Agena Bioscience,
San Diego, CA, USA). QC included a minimum per sample call rate of 90% and a per variant call rate of
90%. Variants showing differential missingness between cases and controls (P<0.01) or deviation from
Hardy-Weinberg equilibrium in controls (P<0.01) were excluded. In total, 836 Norwegian and Danish SLE
patients and 782 Danish healthy control individuals passed QC. Quality-controlled genotype data for 143
Norwegian healthy control individuals were extracted from targeted sequencing data.(39) Replication
variants not called in the sequencing data were imputed with the Sanger Imputation Service using the
Haplotype Reference Consortium r1.1 reference dataset and the “pre-phase with EAGLE2 and impute”
pipeline.(40) Imputed genotype calls with genotype probabilities below 0.9 were set to missing and
SNVs with a MAF below 0.01, a Hardy-Weinberg equilibrium P<0.0001, a call rate below 95% or an
imputation probability score below 0.8 were removed, as were individuals with a call rate below
95%.
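The post-imputation filters above can be sketched as follows, using the thresholds from the text. The sketch assumes that each imputed genotype is given as three probabilities (for 0, 1 and 2 copies of the alternative allele) and that the Hardy-Weinberg P-value and the imputation probability score are computed elsewhere; all names are hypothetical.

```python
def hard_call(probabilities, min_probability=0.9):
    """Return the most likely genotype (0, 1 or 2), or None if it is below the threshold."""
    best = max(range(3), key=lambda g: probabilities[g])
    return best if probabilities[best] >= min_probability else None

def call_rate(calls):
    """Fraction of non-missing genotype calls."""
    return sum(c is not None for c in calls) / len(calls)

def minor_allele_frequency(calls):
    """MAF computed from the non-missing hard calls."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    allele_frequency = sum(observed) / (2 * len(observed))
    return min(allele_frequency, 1 - allele_frequency)

def variant_passes_qc(calls, hwe_p, imputation_score):
    """Apply the variant-level thresholds listed in the text."""
    return (minor_allele_frequency(calls) >= 0.01
            and hwe_p >= 1e-4
            and call_rate(calls) >= 0.95
            and imputation_score >= 0.8)

confident = hard_call((0.95, 0.04, 0.01))
uncertain = hard_call((0.50, 0.45, 0.05))
example_calls = [0] * 90 + [1] * 10
```

The individual-level call rate filter (below 95% removed) would be applied analogously, with `call_rate` evaluated across variants for each individual.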
124 Norwegian control individuals had been genotyped on the Illumina 550 K BeadChip and 298
individuals on the Affymetrix Genome-Wide Human SNP Array 6.0. Genomic positions of variants on the
hg19 genome assembly were assigned based on the rs IDs, using dbSNP version 152 for the Illumina data
and the annotation file for the Affymetrix data. Prior to imputation the datasets were filtered for 95% call
rate both on the variant and individual level, a minimum MAF of 0.05 and a HWE P>1×10⁻⁴. Variants
were strand flipped to match the reference allele and variants that could not be resolved were
removed. The resulting datasets were imputed and filtered in the same way as the sequencing-based
dataset described above. After QC the replication dataset included 15 SNVs, 836 SLE patients and
The Swedish SLE case-control study was expanded to include genotypes from an additional 1,000
control individuals from the SweGen project version September 4th, 2017 generated by Science for
Life Laboratory.(23) Genotypes for proxy variants that were part of the replication study genotyping,
but which were not directly called in the targeted sequencing data, were imputed and quality-
controlled as above. Single variant association analyses were performed separately for the expanded
Swedish, the Norwegian and the Danish case-control studies in PLINK using a logistic regression
model. Meta-analysis of the three association studies results was performed in PLINK assuming a
random effects model. The meta-analysis included a total of 1,794 Scandinavian SLE patients and
References
1. Tan EM, Cohen AS, Fries JF, et al. The 1982 revised criteria for the classification of
systemic lupus erythematosus. Arthritis and rheumatism. 1982;25(11):1271-7.
2. Hochberg MC. Updating the American College of Rheumatology revised criteria for the
classification of systemic lupus erythematosus. Arthritis and rheumatism. 1997;40(9):1725.
3. Weening JJ, D'Agati VD, Schwartz MM, et al. The classification of glomerulonephritis in
systemic lupus erythematosus revisited. J Am Soc Nephrol. 2004;15(2):241-50.
4. Eriksson D, Bianchi M, Landegren N, et al. Extended exome sequencing identifies
BACH2 as a novel major risk locus for Addison's disease. J Intern Med. 2016;280(6):595-608.
5. Lindblad-Toh K, Garber M, Zuk O, et al. A high-resolution map of human evolutionary
constraint using 29 mammals. Nature. 2011;478(7370):476-82.
6. DePristo MA, Banks E, Poplin R, et al. A framework for variation discovery and
genotyping using next-generation DNA sequencing data. Nature genetics. 2011;43(5):491-8.
7. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler
transform. Bioinformatics. 2009;25(14):1754-60.
8. Danecek P, Auton A, Abecasis G, et al. The variant call format and VCFtools.
Bioinformatics. 2011;27(15):2156-8.
9. Carson AR, Smith EN, Matsui H, et al. Effective filtering strategies to improve data
quality from population-based whole exome sequencing studies. BMC Bioinformatics. 2014;15:125.
10. Wang C, Zhan X, Bragg-Gresham J, et al. Ancestry estimation and control of population
stratification for sequence-based association studies. Nature genetics. 2014;46(4):409-15.
11. Wang C, Zhan X, Liang L, et al. Improved ancestry estimation for both genotyping and
sequencing data using projection procrustes analysis and genotype imputation. Am J Hum Genet.
2015;96(6):926-37.
12. Manichaikul A, Mychaleckyj JC, Rich SS, et al. Robust relationship inference in genome-
wide association studies. Bioinformatics. 2010;26(22):2867-73.
13. Do R, Stitziel NO, Won HH, et al. Exome sequencing identifies rare LDLR and APOA5
alleles conferring risk for myocardial infarction. Nature. 2015;518(7537):102-6.
14. Langefeld CD, Ainsworth HC, Cunninghame Graham DS, et al. Transancestral mapping
and genetic load in systemic lupus erythematosus. Nature communications. 2017;8:16021.
15. Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association
and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559-75.
16. Cingolani P, Platts A, Wang le L, et al. A program for annotating and predicting the
effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster
strain w1118; iso-2; iso-3. Fly. 2012;6(2):80-92.
17. Horton R, Wilming L, Rand V, et al. Gene map of the extended human MHC. Nature
reviews Genetics. 2004;5(12):889-99.
18. Cooper GM, Stone EA, Asimenos G, et al. Distribution and intensity of constraint in
mammalian genomic sequence. Genome Res. 2005;15(7):901-13.
19. Price AL, Patterson NJ, Plenge RM, et al. Principal components analysis corrects for
stratification in genome-wide association studies. Nature genetics. 2006;38(8):904-9.
20. Price AL, Weale ME, Patterson N, et al. Long-range LD can confound genome scans in
admixed populations. Am J Hum Genet. 2008;83(1):132-5; author reply 5-9.
21. R Core Team. R: A language and environment for statistical computing. R Foundation
for Statistical Computing, Vienna, Austria. 2013.
22. Turner SD. qqman: an R package for visualizing GWAS results using Q-Q and
manhattan plots. J Open Source Software. 2018;3(25):731.
Supplementary Table S1b. Clinical information on lupus nephritis in the Swedish SLE cohort.
Sweden (number of patients)
Renal disorder 338
Renal biopsy data 257
Biopsy confirmed nephritis* 245
Class I-II 33
Class III-IV 153
Class V 41
Other 18
Dialysis or transplantation (ESRD) 35
*Biopsies were classified according to the WHO or the ISN/RPS 2003 classification systems.
ESRD: End-Stage Renal Disease
Supplementary Table S2. The 35 analysed KEGG pathways and their genes. Each pathway contained at least five genes, and at least half of its genes had to be included in the targeted sequencing effort. Only sequenced
genes are listed.
Antigen processing and presentation (hsa04612) B2M, CALR, CANX, CD4, CD74, CD8A, CD8B, CIITA, CREB1, HLA-A, HLA-B, HLA-C, HLA-DOA, HLA-DOB, HLA-DPA1, HLA-DPB1, HLA-DQA1, HLA-DQA2,
HLA-DQB1, HLA-DRA, HLA-DRB1, HLA-DRB5, HLA-E, HSPA6, HSPA8, IFNG, KIR2DL1, KIR2DL3, KLRC1, KLRC2, KLRC3, KLRD1, PDIA3, RFX5, RFXANK, RFXAP, TAP1,
TAP2, TAPBP, TNF
Apoptosis - multiple species (hsa04215) APAF1, BAK1, BAX, BCL2, BCL2L1, BCL2L11, BID, BIRC2, BIRC3, BIRC6, CASP3, CASP8, CASP9, CYCS, FADD, MAPK10, MAPK8, MAPK9, NGFR, TNFRSF1A
Apoptosis (hsa04210) AIFM1, AKT1, APAF1, ATM, BAD, BAK1, BAX, BCL2, BCL2A1, BCL2L1, BCL2L11, BID, BIRC2, BIRC3, CAPN1, CAPN2, CASP10, CASP2, CASP3, CASP8, CASP9,
CFLAR, CHUK, CSF2RB, CTSH, CTSK, CYCS, DAXX, DDIT3, DFFA, EIF2S1, ENDOG, FADD, FAS, FASLG, FOS, GADD45B, GADD45G, GZMB, HRAS, IKBKB, IKBKG,
IL3, ITPR1, JUN, KRAS, MAP2K1, MAP2K2, MAP3K14, MAP3K5, MAPK1, MAPK10, MAPK3, MAPK8, MAPK9, NFKB1, NFKBIA, NGF, NRAS, NTRK1, PARP1,
PIK3CA, PIK3CB, PIK3CD, PRF1, RAF1, RELA, RIPK1, TNF, TNFRSF1A, TNFSF10, TP53, TRADD, TRAF1, TRAF2
B cell receptor signaling pathway (hsa04662) AKT1, BCL10, BLNK, BTK, CARD11, CD19, CD22, CD79A, CD79B, CD81, CHUK, CR2, DAPP1, FCGR2B, FOS, GRB2, GSK3B, HRAS, IFITM1, IKBKB, IKBKG, INPP5D,
JUN, KRAS, LILRB1, LILRB4, LYN, MALT1, MAP2K1, MAP2K2, MAPK1, MAPK3, NFATC1, NFATC2, NFATC3, NFKB1, NFKBIA, NFKBIB, NRAS, PIK3CA, PIK3CB,
PIK3CD, PLCG2, PPP3CA, PPP3CB, PPP3CC, PPP3R1, PPP3R2, PRKCB, PTPN6, RAC1, RAC2, RAC3, RAF1, RASGRP3, RELA, SOS1, SYK, VAV1, VAV2
Chemokine signaling pathway (hsa04062) ADCY2, ADCY3, AKT1, BAD, BRAF, CCL1, CCL11, CCL13, CCL19, CCL2, CCL21, CCL24, CCL25, CCL26, CCL3, CCL3L3, CCL4, CCL5, CCL7, CCR1, CCR10, CCR2,
CCR3, CCR4, CCR5, CCR6, CCR7, CCR8, CCR9, CDC42, CHUK, CRK, CRKL, CX3CL1, CX3CR1, CXCL1, CXCL10, CXCL11, CXCL12, CXCL13, CXCL14, CXCL16, CXCL2,
CXCL3, CXCL5, CXCL8, CXCL9, CXCR1, CXCR2, CXCR3, CXCR4, CXCR5, CXCR6, ELMO1, FGR, FOXO3, GRB2, GSK3B, HCK, HRAS, IKBKB, IKBKG, ITK, JAK2, JAK3,
KRAS, LYN, MAP2K1, MAPK1, MAPK3, NCF1, NFKB1, NFKBIA, NFKBIB, NRAS, PAK1, PF4, PIK3CA, PIK3CB, PIK3CD, PIK3CG, PLCB2, PPBP, PREX1, PRKCB,
PRKCD, PRKCZ, PTK2, PTK2B, PXN, RAC1, RAC2, RAC3, RAF1, RELA, RHOA, ROCK1, ROCK2, SHC1, SOS1, SRC, STAT1, STAT2, STAT3, STAT5B, VAV1, VAV2,
WAS, WASL, XCR1
Complement and coagulation cascades (hsa04610) A2M, C1QA, C1QB, C1QC, C1R, C1S, C2, C3, C3AR1, C4A, C4B, C4BPA, C5, C5AR1, C6, C7, C9, CD46, CD55, CD59, CFB, CFD, CFH, CFI, CR1, CR2, F11, F12, F2,
F2R, F8, FGA, FGB, FGG, ITGAM, ITGAX, ITGB2, KLKB1, MASP1, MASP2, MBL2, PLAT, PLAU, PLG, PROS1, SERPINA1, SERPIND1, SERPINE1, SERPING1, VWF
C-type lectin receptor signaling pathway (hsa04625) AKT1, ARHGEF12, BCL10, BCL3, CARD9, CASP1, CASP8, CBLB, CD209, CHUK, CLEC7A, CYLD, FCER1G, HRAS, IKBKB, IKBKE, IKBKG, IL10, IL12A, IL12B, IL1B, IL2,
IL23A, IL6, IRF1, IRF9, ITPR1, JUN, KRAS, LSP1, MALT1, MAP3K14, MAPK1, MAPK10, MAPK11, MAPK12, MAPK13, MAPK14, MAPK3, MAPK8, MAPK9,
MAPKAPK2, MDM2, NFATC1, NFATC2, NFATC3, NFATC4, NFKB1, NFKB2, NFKBIA, NLRP3, NRAS, PAK1, PIK3CA, PIK3CB, PIK3CD, PLCG2, PPP3CA, PPP3CB,
PPP3CC, PPP3R1, PPP3R2, PRKCD, PTGS2, PTPN11, PYCARD, RAF1, RELA, RELB, RHOA, SRC, STAT1, STAT2, SYK, TNF
Cytokine-cytokine receptor interaction (hsa04060) ACKR3, ACVR1, ACVR1B, ACVR2A, ACVR2B, AMH, AMHR2, BMPR1A, BMPR1B, BMPR2, CCL1, CCL11, CCL13, CCL19, CCL2, CCL21, CCL24, CCL25, CCL26, CCL3,
CCL3L3, CCL4, CCL5, CCL7, CCR1, CCR10, CCR2, CCR3, CCR4, CCR5, CCR6, CCR7, CCR8, CCR9, CD27, CD4, CD40, CD40LG, CD70, CLCF1, CNTF, CNTFR, CSF1,
CSF1R, CSF2, CSF2RB, CSF3, CSF3R, CTF1, CX3CL1, CX3CR1, CXCL1, CXCL10, CXCL11, CXCL12, CXCL13, CXCL14, CXCL16, CXCL2, CXCL3, CXCL5, CXCL8, CXCL9,
CXCR1, CXCR2, CXCR3, CXCR4, CXCR5, CXCR6, EDA, EDA2R, EDAR, EPO, EPOR, FAS, FASLG, GH1, GHR, IFNA1, IFNA10, IFNA13, IFNA14, IFNA16, IFNA17,
IFNA2, IFNA21, IFNA4, IFNA5, IFNA6, IFNA7, IFNA8, IFNAR1, IFNAR2, IFNB1, IFNE, IFNG, IFNGR1, IFNGR2, IFNK, IFNL1, IFNLR1, IFNW1, IL10, IL10RA, IL10RB,
IL11, IL11RA, IL12A, IL12B, IL12RB1, IL12RB2, IL13, IL13RA1, IL13RA2, IL15, IL15RA, IL17A, IL17B, IL17F, IL17RA, IL17RB, IL17RC, IL18, IL18R1, IL18RAP, IL19,
IL1A, IL1B, IL1R1, IL1R2, IL1RAP, IL1RL1, IL1RL2, IL1RN, IL2, IL20, IL20RA, IL20RB, IL21, IL21R, IL22, IL22RA1, IL23A, IL23R, IL24, IL25, IL26, IL27, IL27RA,
IL2RA, IL2RB, IL2RG, IL3, IL32, IL4, IL4R, IL5, IL5RA, IL6, IL6R, IL6ST, IL7, IL7R, IL9, LEP, LEPR, LIF, LIFR, LTA, LTB, LTBR, MPL, NGF, NGFR, OSM, OSMR, PF4,
PPBP, PRL, PRLR, RELT, TGFB1, TGFB2, TGFB3, TGFBR1, TGFBR2, TNF, TNFRSF11A, TNFRSF11B, TNFRSF12A, TNFRSF13B, TNFRSF13C, TNFRSF14, TNFRSF17,
TNFRSF18, TNFRSF19, TNFRSF1A, TNFRSF1B, TNFRSF21, TNFRSF25, TNFRSF4, TNFRSF6B, TNFRSF8, TNFRSF9, TNFSF10, TNFSF11, TNFSF12, TNFSF13,
TNFSF13B, TNFSF14, TNFSF15, TNFSF18, TNFSF4, TNFSF8, TNFSF9, TSLP, XCR1
Cytosolic DNA-sensing pathway (hsa04623) ADAR, AIM2, CASP1, CCL4, CCL5, CHUK, CXCL10, DDX58, IFNA1, IFNA10, IFNA13, IFNA14, IFNA16, IFNA17, IFNA2, IFNA21, IFNA4, IFNA5, IFNA6, IFNA7,
IFNA8, IFNB1, IKBKB, IKBKE, IKBKG, IL18, IL1B, IL6, IRF3, IRF7, MAVS, NFKB1, NFKBIA, NFKBIB, PYCARD, RELA, RIPK1, TBK1, TREX1, ZBP1
ErbB signaling pathway (hsa04012) ABL1, AKT1, BAD, BRAF, CBL, CBLB, CDKN1B, CRK, CRKL, EGF, EGFR, EIF4EBP1, ELK1, ERBB2, ERBB3, GAB1, GRB2, GSK3B, HBEGF, HRAS, JUN, KRAS,
MAP2K1, MAP2K2, MAP2K4, MAP2K7, MAPK1, MAPK10, MAPK3, MAPK8, MAPK9, MTOR, MYC, NCK1, NRAS, NRG1, PAK1, PAK2, PIK3CA, PIK3CB, PIK3CD,
PLCG1, PLCG2, PRKCA, PRKCB, PTK2, RAF1, RPS6KB1, SHC1, SOS1, SRC, STAT5A, STAT5B, TGFA
Fc epsilon RI signaling pathway (hsa04664) AKT1, BTK, CSF2, FCER1A, FCER1G, FYN, GAB2, GRB2, HRAS, IL13, IL3, IL4, IL5, INPP5D, KRAS, LAT, LCP2, LYN, MAP2K1, MAP2K2, MAP2K3, MAP2K4,
MAP2K6, MAP2K7, MAPK1, MAPK10, MAPK11, MAPK12, MAPK13, MAPK14, MAPK3, MAPK8, MAPK9, MS4A2, NRAS, PIK3CA, PIK3CB, PIK3CD, PLCG1,
PLCG2, PRKCA, RAC1, RAC2, RAC3, RAF1, SOS1, SYK, TNF, VAV1, VAV2
Fc gamma R-mediated phagocytosis (hsa04666) AKT1, AMPH, ARF6, ASAP1, ASAP2, CDC42, CFL1, CRK, CRKL, DNM2, FCGR1A, FCGR2A, FCGR2B, FCGR3B, GAB2, GSN, HCK, INPP5D, LAT, LYN, MAP2K1,
MAPK1, MAPK3, NCF1, PAK1, PIK3CA, PIK3CB, PIK3CD, PIP5K1A, PIP5K1C, PLA2G6, PLCG1, PLCG2, PRKCA, PRKCB, PRKCD, PRKCE, PTPRC, RAC1, RAC2, RAF1,
RPS6KB1, SYK, VASP, VAV1, VAV2, WAS, WASL
FoxO signaling pathway (hsa04068) AKT1, ATM, BCL2L11, BCL6, BRAF, CAT, CCND1, CDK2, CDKN1B, CDKN2B, CHUK, CREBBP, EGF, EGFR, EP300, FASLG, FOXG1, FOXO1, FOXO3, GADD45B,
GADD45G, GRB2, HOMER2, HOMER3, HRAS, IGF1, IKBKB, IL10, IL6, IL7R, INS, INSR, IRS1, KRAS, MAP2K1, MAP2K2, MAPK1, MAPK10, MAPK11, MAPK12,
MAPK13, MAPK14, MAPK3, MAPK8, MAPK9, MDM2, NRAS, PIK3CA, PIK3CB, PIK3CD, PTEN, RAF1, RAG1, RAG2, SMAD3, SMAD4, SOD2, SOS1, STAT3,
TGFB1, TGFB2, TGFB3, TGFBR1, TGFBR2, TNFSF10
Hematopoietic cell lineage (hsa04640) CD14, CD19, CD1A, CD1B, CD1C, CD1D, CD22, CD36, CD38, CD3D, CD3E, CD4, CD44, CD5, CD55, CD59, CD8A, CD8B, CR1, CR2, CSF1, CSF1R, CSF2, CSF3,
CSF3R, DNTT, EPO, EPOR, FCER2, FCGR1A, FLT3, FLT3LG, GYPA, HLA-DOA, HLA-DOB, HLA-DPA1, HLA-DPB1, HLA-DQA1, HLA-DQA2, HLA-DQB1, HLA-DRA,
HLA-DRB1, HLA-DRB5, IL11, IL11RA, IL1A, IL1B, IL1R1, IL1R2, IL2RA, IL3, IL4, IL4R, IL5, IL5RA, IL6, IL6R, IL7, IL7R, ITGA2, ITGA3, ITGA4, ITGA5, ITGAM, KIT,
KITLG, TFRC, TNF
HIF-1 signaling pathway (hsa04066) AKT1, ANGPT1, ANGPT2, ARNT, BCL2, CDKN1B, CREBBP, CYBB, EGF, EGFR, EIF4E, EIF4EBP1, ENO1, ENO2, ENO3, EP300, EPO, ERBB2, FLT1, HIF1A, HMOX1,
IFNG, IFNGR1, IFNGR2, IGF1, IL6, IL6R, INS, INSR, LTBR, MAP2K1, MAP2K2, MAPK1, MAPK3, MTOR, NFKB1, NOS2, PDHB, PDK1, PFKL, PFKM, PGK1, PIK3CA,
PIK3CB, PIK3CD, PLCG1, PLCG2, PRKCA, PRKCB, RELA, RPS6KB1, SERPINE1, SLC2A1, STAT3, TEK, TF, TFRC, TIMP1, TLR4, VEGFA
IL-17 signaling pathway (hsa04657) CASP3, CASP8, CCL11, CCL2, CCL7, CEBPB, CHUK, CSF2, CSF3, CXCL1, CXCL10, CXCL2, CXCL3, CXCL5, CXCL8, DEFB4A, FADD, FOS, GSK3B, IFNG, IKBKB, IKBKE,
IKBKG, IL13, IL17A, IL17B, IL17F, IL17RA, IL17RB, IL17RC, IL1B, IL25, IL4, IL5, IL6, JUN, JUND, MAP3K7, MAPK1, MAPK10, MAPK11, MAPK12, MAPK13,
MAPK14, MAPK3, MAPK7, MAPK8, MAPK9, MMP1, MMP13, MMP3, MMP9, MUC5B, NFKB1, NFKBIA, PTGS2, RELA, TAB2, TAB3, TBK1, TNF, TNFAIP3,
TRADD, TRAF2, TRAF3, TRAF3IP2, TRAF5, TRAF6
Intestinal immune network for IgA production (hsa04672) AICDA, CCL25, CCR10, CCR9, CD28, CD40, CD40LG, CD80, CD86, CXCL12, CXCR4, HLA-DOA, HLA-DOB, HLA-DPA1, HLA-DPB1, HLA-DQA1, HLA-DQA2,
HLA-DQB1, HLA-DRA, HLA-DRB1, HLA-DRB5, ICOS, ICOSLG, IL10, IL15, IL15RA, IL2, IL4, IL5, IL6, ITGA4, LTBR, MAP3K14, TGFB1, TNFRSF13B, TNFRSF13C,
TNFRSF17, TNFSF13, TNFSF13B
JAK-STAT signaling pathway (hsa04630) AKT1, BCL2, BCL2L1, CCND1, CISH, CNTF, CNTFR, CREBBP, CSF2, CSF2RB, CSF3, CSF3R, CTF1, EGF, EGFR, EP300, EPO, EPOR, GH1, GHR, GRB2, HRAS, IFNA1,
IFNA10, IFNA13, IFNA14, IFNA16, IFNA17, IFNA2, IFNA21, IFNA4, IFNA5, IFNA6, IFNA7, IFNA8, IFNAR1, IFNAR2, IFNB1, IFNE, IFNG, IFNGR1, IFNGR2, IFNK,
IFNL1, IFNLR1, IFNW1, IL10, IL10RA, IL10RB, IL11, IL11RA, IL12A, IL12B, IL12RB1, IL12RB2, IL13, IL13RA1, IL13RA2, IL15, IL15RA, IL19, IL2, IL20, IL20RA,
IL20RB, IL21, IL21R, IL22, IL22RA1, IL22RA2, IL23A, IL23R, IL24, IL27RA, IL2RA, IL2RB, IL2RG, IL3, IL4, IL4R, IL5, IL5RA, IL6, IL6R, IL6ST, IL7, IL7R, IL9, IRF9,
JAK1, JAK2, JAK3, LEP, LEPR, LIF, LIFR, MPL, MTOR, MYC, OSM, OSMR, PDGFRA, PDGFRB, PIAS1, PIK3CA, PIK3CB, PIK3CD, PIM1, PRL, PRLR, PTPN11, PTPN2,
PTPN6, RAF1, SOCS1, SOCS2, SOCS3, SOCS6, SOS1, STAM, STAT1, STAT2, STAT3, STAT4, STAT5A, STAT5B, STAT6, TSLP, TYK2
Leukocyte transendothelial migration (hsa04670) ACTN1, ACTN4, AFDN, CDC42, CDH5, CLDN1, CLDN19, CLDN5, CTNNA1, CTNNB1, CTNND1, CXCL12, CXCR4, CYBA, CYBB, EZR, F11R, ICAM1, ITGA4, ITGAL,
ITGAM, ITGB1, ITGB2, ITK, JAM2, JAM3, MAPK11, MAPK12, MAPK13, MAPK14, MMP2, MMP9, NCF1, NCF2, NCF4, PECAM1, PIK3CA, PIK3CB, PIK3CD,
PLCG1, PLCG2, PRKCA, PRKCB, PTK2, PTK2B, PTPN11, PXN, RAC1, RAC2, RASSF5, RHOA, RHOH, ROCK1, ROCK2, SIPA1, THY1, VASP, VAV1, VAV2, VCAM1,
VCL
MAPK signaling pathway (hsa04010) AKT1, ANGPT1, ANGPT2, ATF2, BRAF, CACNA1H, CACNA1S, CACNA2D2, CASP3, CD14, CDC42, CHUK, CRK, CRKL, CSF1, CSF1R, DAXX, DDIT3, DUSP1, ECSIT,
EGF, EGFR, ELK1, ERBB2, ERBB3, FAS, FASLG, FGF10, FGF7, FLNB, FLT1, FLT3, FLT3LG, FLT4, FOS, GADD45B, GADD45G, GNA12, GRB2, HGF, HRAS, HSPA6,
HSPA8, IGF1, IKBKB, IKBKG, IL1A, IL1B, IL1R1, IL1RAP, INS, INSR, IRAK1, IRAK4, JUN, JUND, KDR, KIT, KITLG, KRAS, LAMTOR3, MAP2K1, MAP2K2, MAP2K3,
MAP2K4, MAP2K5, MAP2K6, MAP2K7, MAP3K1, MAP3K14, MAP3K2, MAP3K3, MAP3K4, MAP3K5, MAP3K6, MAP3K7, MAP3K8, MAP4K1, MAP4K3,
MAPK1, MAPK10, MAPK11, MAPK12, MAPK13, MAPK14, MAPK3, MAPK7, MAPK8, MAPK9, MAPKAPK2, MAPKAPK5, MAX, MET, MYC, MYD88, NFATC1,
NFATC3, NFKB1, NFKB2, NGF, NGFR, NR4A1, NRAS, NTF3, NTRK1, PAK1, PAK2, PDGFRA, PDGFRB, PPP3CA, PPP3CB, PPP3CC, PPP3R1, PPP3R2, PRKCA,
PRKCB, PTPN7, RAC1, RAC2, RAC3, RAF1, RASA1, RASGRP1, RASGRP3, RELA, RELB, RPS6KA1, RPS6KA4, SOS1, SRF, TAB1, TAB2, TEK, TGFA, TGFB1, TGFB2,
TGFB3, TGFBR1, TGFBR2, TNF, TNFRSF1A, TP53, TRADD, TRAF2, TRAF6, VEGFA, VEGFB, VEGFC, VEGFD
Natural killer cell mediated cytotoxicity (hsa04650) BID, BRAF, CASP3, CD244, CD247, CD48, CSF2, FAS, FASLG, FCER1G, FCGR3B, FYN, GRB2, GZMB, HCST, HLA-A, HLA-B, HLA-C, HLA-E, HRAS, ICAM1, IFNA1,
IFNA10, IFNA13, IFNA14, IFNA16, IFNA17, IFNA2, IFNA21, IFNA4, IFNA5, IFNA6, IFNA7, IFNA8, IFNAR1, IFNAR2, IFNB1, IFNG, IFNGR1, IFNGR2, ITGAL, ITGB2,
KIR2DL1, KIR2DL3, KLRC1, KLRC2, KLRC3, KLRD1, KLRK1, KRAS, LAT, LCK, LCP2, MAP2K1, MAP2K2, MAPK1, MAPK3, MICA, MICB, NCR1, NCR2, NCR3,
NFATC1, NFATC2, NRAS, PAK1, PIK3CA, PIK3CB, PIK3CD, PLCG1, PLCG2, PPP3CA, PPP3CB, PPP3CC, PPP3R1, PPP3R2, PRF1, PRKCA, PRKCB, PTK2B, PTPN11,
PTPN6, RAC1, RAC2, RAC3, RAET1L, RAF1, SH2D1A, SH2D1B, SH3BP2, SHC1, SOS1, SYK, TNF, TNFSF10, TYROBP, ULBP3, VAV1, VAV2, ZAP70
Neurotrophin signaling pathway (hsa04722) ABL1, AKT1, BAD, BAX, BCL2, BRAF, CDC42, CRK, CRKL, FASLG, FOXO3, GAB1, GRB2, GSK3B, HRAS, IKBKB, IRAK1, IRAK2, IRAK3, IRAK4, IRS1, JUN, KRAS,
MAP2K1, MAP2K2, MAP2K5, MAP2K7, MAP3K1, MAP3K3, MAP3K5, MAPK1, MAPK10, MAPK11, MAPK12, MAPK13, MAPK14, MAPK3, MAPK7, MAPK8,
MAPK9, MAPKAPK2, NFKB1, NFKBIA, NFKBIB, NGF, NGFR, NRAS, NTF3, NTRK1, PIK3CA, PIK3CB, PIK3CD, PLCG1, PLCG2, PRKCD, PTPN11, RAC1, RAF1,
RAPGEF1, RELA, RHOA, RIPK2, RPS6KA1, SH2B1, SH2B3, SHC1, SOS1, TP53, TRAF6
NF-kappa B signaling pathway (hsa04064) ATM, BCL10, BCL2, BCL2A1, BCL2L1, BIRC2, BIRC3, BLNK, BTK, CARD10, CARD11, CCL13, CCL19, CCL21, CCL4, CD14, CD40, CD40LG, CFLAR, CHUK, CXCL1,
CXCL12, CXCL2, CXCL3, CXCL8, CYLD, DDX58, EDA, EDA2R, EDAR, GADD45B, ICAM1, IKBKB, IKBKG, IL1B, IL1R1, IRAK1, IRAK4, LAT, LBP, LCK, LTA, LTB, LTBR,
LY96, LYN, MALT1, MAP3K14, MAP3K7, MYD88, NFKB1, NFKB2, NFKBIA, PARP1, PLAU, PLCG1, PLCG2, PRKCB, PRKCQ, PTGS2, RELA, RELB, RIPK1, SYK, TAB1,
TAB2, TAB3, TICAM1, TICAM2, TIRAP, TLR4, TNF, TNFAIP3, TNFRSF11A, TNFRSF13C, TNFRSF1A, TNFSF11, TNFSF13B, TNFSF14, TRADD, TRAF1, TRAF2,
TRAF3, TRAF5, TRAF6, TRIM25, VCAM1, ZAP70
NOD-like receptor signaling pathway (hsa04621) AIM2, ANTXR2, ATG16L1, BCL2, BCL2L1, BIRC2, BIRC3, CARD9, CASP1, CASP5, CASP8, CASR, CCL2, CCL5, CHUK, CXCL1, CXCL2, CXCL3, CXCL8, CYBA, CYBB,
DEFA1, DEFB4A, DNM1L, FADD, IFI16, IFNA1, IFNA10, IFNA13, IFNA14, IFNA16, IFNA17, IFNA2, IFNA21, IFNA4, IFNA5, IFNA6, IFNA7, IFNA8, IFNAR1,
IFNAR2, IFNB1, IKBKB, IKBKE, IKBKG, IL18, IL1B, IL6, IRAK4, IRF3, IRF7, IRF9, ITPR1, JAK1, JUN, MAP3K7, MAPK1, MAPK10, MAPK11, MAPK12, MAPK13,
MAPK14, MAPK3, MAPK8, MAPK9, MAVS, MEFV, MYD88, NAIP, NFKB1, NFKBIA, NFKBIB, NLRC4, NLRP1, NLRP12, NLRP3, NLRP7, NLRX1, NOD1, NOD2,
OAS1, OAS2, PLCB2, PRKCD, PSTPIP1, PYCARD, RELA, RHOA, RIPK1, RIPK2, RNASEL, STAT1, STAT2, TAB1, TAB2, TAB3, TANK, TBK1, TICAM1, TLR4, TNF,
TNFAIP3, TRAF2, TRAF3, TRAF5, TRAF6, TRPM7, TXN, TYK2
Osteoclast differentiation (hsa04380) ACP5, AKT1, BLNK, BTK, CHUK, CREB1, CSF1, CSF1R, CTSK, CYBA, CYLD, FCGR1A, FCGR2A, FCGR2B, FCGR3B, FOS, FOSL2, FYN, GAB2, GRB2, IFNAR1, IFNAR2,
IFNB1, IFNG, IFNGR1, IFNGR2, IKBKB, IKBKG, IL1A, IL1B, IL1R1, IRF9, JAK1, JUN, JUNB, JUND, LCK, LCP2, LILRB1, LILRB4, MAP2K1, MAP2K6, MAP2K7,
MAP3K14, MAP3K7, MAPK1, MAPK10, MAPK11, MAPK12, MAPK13, MAPK14, MAPK3, MAPK8, MAPK9, MITF, NCF1, NCF2, NCF4, NFATC1, NFATC2, NFKB1,
NFKB2, NFKBIA, NOX1, OSCAR, PIK3CA, PIK3CB, PIK3CD, PLCG2, PPARG, PPP3CA, PPP3CB, PPP3CC, PPP3R1, PPP3R2, RAC1, RELA, RELB, SIRPA, SOCS1,
SOCS3, SPI1, STAT1, STAT2, SYK, TAB1, TAB2, TGFB1, TGFB2, TGFBR1, TGFBR2, TNF, TNFRSF11A, TNFRSF11B, TNFRSF1A, TNFSF11, TRAF2, TRAF6, TREM2,
TYK2, TYROBP
Prolactin signaling pathway (hsa04917) AKT1, CCND1, CGA, CISH, CSN2, CYP17A1, ESR1, ESR2, FOS, FOXO3, GRB2, GSK3B, HRAS, INS, IRF1, JAK2, KRAS, MAP2K1, MAP2K2, MAPK1, MAPK10,
MAPK11, MAPK12, MAPK13, MAPK14, MAPK3, MAPK8, MAPK9, NFKB1, NRAS, PIK3CA, PIK3CB, PIK3CD, PRL, PRLR, RAF1, RELA, SHC1, SOCS1, SOCS2,
SOCS3, SOCS6, SOS1, SRC, STAT1, STAT3, STAT5A, STAT5B, TH, TNFRSF11A, TNFSF11
RIG-I-like receptor signaling pathway (hsa04622) AZI2, CASP10, CASP8, CHUK, CXCL10, CXCL8, CYLD, DDX58, DHX58, FADD, IFIH1, IFNA1, IFNA10, IFNA13, IFNA14, IFNA16, IFNA17, IFNA2, IFNA21, IFNA4,
IFNA5, IFNA6, IFNA7, IFNA8, IFNB1, IFNE, IFNK, IFNW1, IKBKB, IKBKE, IKBKG, IL12A, IL12B, IRF3, IRF7, ISG15, MAP3K1, MAP3K7, MAPK10, MAPK11,
MAPK12, MAPK13, MAPK14, MAPK8, MAPK9, MAVS, NFKB1, NFKBIA, NFKBIB, NLRX1, PIN1, RELA, RIPK1, SIKE1, TANK, TBK1, TBKBP1, TNF, TRADD, TRAF2,
TRAF3, TRAF6, TRIM25
T cell receptor signaling pathway (hsa04660) AKT1, BCL10, CARD11, CBLB, CD247, CD28, CD3D, CD3E, CD4, CD40LG, CD8A, CD8B, CDC42, CDK4, CHUK, CSF2, CTLA4, FOS, FYN, GRAP2, GRB2, GSK3B,
HRAS, ICOS, IFNG, IKBKB, IKBKG, IL10, IL2, IL4, IL5, ITK, JUN, KRAS, LAT, LCK, LCP2, MALT1, MAP2K1, MAP2K2, MAP2K7, MAP3K14, MAP3K7, MAP3K8,
MAPK1, MAPK10, MAPK11, MAPK12, MAPK13, MAPK14, MAPK3, MAPK8, MAPK9, NCK1, NFATC1, NFATC2, NFATC3, NFKB1, NFKBIA, NFKBIB, NRAS, PAK1,
PAK2, PIK3CA, PIK3CB, PIK3CD, PLCG1, PPP3CA, PPP3CB, PPP3CC, PPP3R1, PPP3R2, PRKCQ, PTPN6, PTPRC, RAF1, RASGRP1, RELA, RHOA, SOS1, TNF, VAV1,
VAV2, ZAP70
Th1 and Th2 cell differentiation (hsa04658) CD247, CD3D, CD3E, CD4, CHUK, FOS, GATA3, HLA-DOA, HLA-DOB, HLA-DPA1, HLA-DPB1, HLA-DQA1, HLA-DQA2, HLA-DQB1, HLA-DRA, HLA-DRB1,
HLA-DRB5, IFNG, IFNGR1, IFNGR2, IKBKB, IKBKG, IL12A, IL12B, IL12RB1, IL12RB2, IL13, IL2, IL2RA, IL2RB, IL2RG, IL4, IL4R, IL5, JAK1, JAK2, JAK3, JUN, LAT, LCK,
MAF, MAML1, MAPK1, MAPK10, MAPK11, MAPK12, MAPK13, MAPK14, MAPK3, MAPK8, MAPK9, NFATC1, NFATC2, NFATC3, NFKB1, NFKBIA, NFKBIB,
NOTCH1, NOTCH2, NOTCH3, PLCG1, PPP3CA, PPP3CB, PPP3CC, PPP3R1, PPP3R2, PRKCQ, RBPJ, RELA, RUNX3, STAT1, STAT4, STAT5A, STAT5B, STAT6,
TBX21, TYK2, ZAP70
Th17 cell differentiation (hsa04659) AHR, CD247, CD3D, CD3E, CD4, CHUK, FOS, FOXP3, GATA3, HIF1A, HLA-DOA, HLA-DOB, HLA-DPA1, HLA-DPB1, HLA-DQA1, HLA-DQA2, HLA-DQB1, HLA-DRA,
HLA-DRB1, HLA-DRB5, IFNG, IFNGR1, IFNGR2, IKBKB, IKBKG, IL12RB1, IL17A, IL17F, IL1B, IL1R1, IL1RAP, IL2, IL21, IL21R, IL22, IL23A, IL23R, IL27RA, IL2RA,
IL2RB, IL2RG, IL4, IL4R, IL6, IL6R, IL6ST, IRF4, JAK1, JAK2, JAK3, JUN, LAT, LCK, MAPK1, MAPK10, MAPK11, MAPK12, MAPK13, MAPK14, MAPK3, MAPK8,
MAPK9, MTOR, NFATC1, NFATC2, NFATC3, NFKB1, NFKBIA, NFKBIB, PLCG1, PPP3CA, PPP3CB, PPP3CC, PPP3R1, PPP3R2, PRKCQ, RARA, RELA, RORA, RORC,
RUNX1, RXRA, SMAD2, SMAD4, STAT1, STAT3, STAT5A, STAT5B, STAT6, TBX21, TGFB1, TGFBR1, TGFBR2, TYK2, ZAP70
TNF signaling pathway (hsa04668) AKT1, ATF2, BAG4, BCL3, BIRC2, BIRC3, CASP10, CASP3, CASP8, CCL2, CCL5, CEBPB, CFLAR, CHUK, CREB1, CSF1, CSF2, CX3CL1, CXCL1, CXCL10, CXCL2, CXCL3,
CXCL5, DNM1L, FADD, FAS, FOS, ICAM1, IFNB1, IKBKB, IKBKG, IL15, IL18R1, IL1B, IL6, IRF1, JUN, JUNB, LIF, LTA, MAP2K1, MAP2K3, MAP2K4, MAP2K6,
MAP2K7, MAP3K14, MAP3K5, MAP3K7, MAP3K8, MAPK1, MAPK10, MAPK11, MAPK12, MAPK13, MAPK14, MAPK3, MAPK8, MAPK9, MMP14, MMP3,
MMP9, NFKB1, NFKBIA, NOD2, PIK3CA, PIK3CB, PIK3CD, PTGS2, RELA, RIPK1, RPS6KA4, SELE, SOCS3, TAB1, TAB2, TAB3, TNF, TNFAIP3, TNFRSF1A,
TNFRSF1B, TRADD, TRAF1, TRAF2, TRAF3, TRAF5, VCAM1, VEGFC, VEGFD
Toll-like receptor signaling pathway (hsa04620) AKT1, CASP8, CCL3, CCL3L3, CCL4, CCL5, CD14, CD40, CD80, CD86, CHUK, CTSK, CXCL10, CXCL11, CXCL8, CXCL9, FADD, FOS, IFNA1, IFNA10, IFNA13, IFNA14,
IFNA16, IFNA17, IFNA2, IFNA21, IFNA4, IFNA5, IFNA6, IFNA7, IFNA8, IFNAR1, IFNAR2, IFNB1, IKBKB, IKBKE, IKBKG, IL12A, IL12B, IL1B, IL6, IRAK1, IRAK4,
IRF3, IRF5, IRF7, JUN, LBP, LY96, MAP2K1, MAP2K2, MAP2K3, MAP2K4, MAP2K6, MAP2K7, MAP3K7, MAP3K8, MAPK1, MAPK10, MAPK11, MAPK12,
MAPK13, MAPK14, MAPK3, MAPK8, MAPK9, MYD88, NFKB1, NFKBIA, PIK3CA, PIK3CB, PIK3CD, RAC1, RELA, RIPK1, SPP1, STAT1, TAB1, TAB2, TBK1,
TICAM1, TICAM2, TIRAP, TLR1, TLR2, TLR3, TLR4, TLR5, TLR6, TLR7, TLR8, TLR9, TNF, TOLLIP, TRAF3, TRAF6
VEGF signaling pathway (hsa04370) AKT1, BAD, CASP9, CDC42, HRAS, KDR, KRAS, MAP2K1, MAP2K2, MAPK1, MAPK11, MAPK12, MAPK13, MAPK14, MAPK3, MAPKAPK2, NFATC2, NRAS,
PIK3CA, PIK3CB, PIK3CD, PLCG1, PLCG2, PPP3CA, PPP3CB, PPP3CC, PPP3R1, PPP3R2, PRKCA, PRKCB, PTGS2, PTK2, PXN, RAC1, RAC2, RAC3, RAF1, SRC,
VEGFA
Viral protein interaction with cytokine and cytokine receptor (hsa04061) ACKR3, CCL1, CCL11, CCL13, CCL19, CCL2, CCL21, CCL24, CCL25, CCL26, CCL3, CCL3L3, CCL4, CCL5, CCL7, CCR1, CCR10, CCR2, CCR3, CCR4, CCR5, CCR6, CCR7,
CCR8, CCR9, CSF1, CSF1R, CX3CL1, CX3CR1, CXCL1, CXCL10, CXCL11, CXCL12, CXCL13, CXCL14, CXCL2, CXCL3, CXCL5, CXCL8, CXCL9, CXCR1, CXCR2, CXCR3,
CXCR4, CXCR5, IL10, IL10RA, IL10RB, IL18, IL18R1, IL18RAP, IL19, IL2, IL20, IL20RA, IL20RB, IL22RA1, IL24, IL2RA, IL2RB, IL2RG, IL6, IL6R, IL6ST, LTA, LTBR,
PF4, PPBP, TNF, TNFRSF14, TNFRSF1A, TNFRSF1B, TNFSF10, TNFSF14, XCR1
Supplementary Table S3. SLE case-control pathway-based aggregate association analysis results (SKAT-O).
Columns: Pathway (KEGG ID); Genes in pathway; Genes on array; Ratio genes on array/genes in pathway; SNVs in test; P-value; FDR
Th1 and Th2 cell differentiation (hsa04658) 92 78 0.85 14362 6.3E-11 2.2E-09
Antigen processing and presentation (hsa04612) 77 40 0.52 8017 1.8E-10 3.1E-09
Hematopoietic cell lineage (hsa04640) 97 71 0.73 13013 3.8E-10 4.5E-09
Th17 cell differentiation (hsa04659) 107 96 0.90 19347 1.7E-09 1.5E-08
Intestinal immune network for IgA production (hsa04672) 49 39 0.80 7909 3.4E-08 2.4E-07
Natural killer cell mediated cytotoxicity (hsa04650) 131 100 0.76 15821 4.7E-06 2.8E-05
TNF signaling pathway (hsa04668) 112 88 0.79 12639 1.9E-05 9.4E-05
JAK-STAT signaling pathway (hsa04630) 162 133 0.82 18003 7.4E-05 0.00032
RIG-I-like receptor signaling pathway (hsa04622) 70 63 0.90 8459 0.00021 0.00080
NOD-like receptor signaling pathway (hsa04621) 178 109 0.61 15729 0.00031 0.0011
Complement and coagulation cascades (hsa04610) 79 50 0.63 7112 0.00041 0.0013
Toll-like receptor signaling pathway (hsa04620) 104 96 0.92 12178 0.00080 0.0022
Cytokine-cytokine receptor interaction (hsa04060) 294 221 0.75 26771 0.00083 0.0022
C-type lectin receptor signaling pathway (hsa04625) 104 75 0.72 12986 0.0020 0.0050
IL-17 signaling pathway (hsa04657) 93 68 0.73 9358 0.0043 0.0100
Fc epsilon RI signaling pathway (hsa04664) 68 51 0.75 8514 0.0052 0.011
Viral protein interaction with cytokine and cytokine receptor (hsa04061) 100 75 0.75 8435 0.0062 0.013
NF-kappa B signaling pathway (hsa04064) 102 88 0.86 14349 0.0078 0.015
Osteoclast differentiation (hsa04380) 128 101 0.79 18602 0.013 0.023
T cell receptor signaling pathway (hsa04660) 103 85 0.83 14268 0.014 0.025
Cytosolic DNA-sensing pathway (hsa04623) 63 40 0.63 4993 0.015 0.025
B cell receptor signaling pathway (hsa04662) 82 61 0.74 11330 0.037 0.059
HIF-1 signaling pathway (hsa04066) 109 60 0.55 11748 0.043 0.065
Fc gamma R-mediated phagocytosis (hsa04666) 94 48 0.51 9608 0.055 0.080
Apoptosis (hsa04210) 136 78 0.57 12115 0.066 0.092
Prolactin signaling pathway (hsa04917) 70 51 0.73 7599 0.15 0.20
MAPK signaling pathway (hsa04010) 295 151 0.51 27384 0.16 0.20
Chemokine signaling pathway (hsa04062) 190 111 0.58 15084 0.16 0.20
Neurotrophin signaling pathway (hsa04722) 119 70 0.59 12388 0.18 0.22
VEGF signaling pathway (hsa04370) 59 39 0.66 7440 0.19 0.22
ErbB signaling pathway (hsa04012) 85 54 0.64 11201 0.28 0.31
FoxO signaling pathway (hsa04068) 132 67 0.51 10643 0.29 0.31
Adherens junction (hsa04520) 72 42 0.58 9633 0.29 0.31
Apoptosis - multiple species (hsa04215) 33 20 0.61 3722 0.31 0.32
Leukocyte transendothelial migration (hsa04670) 112 64 0.57 11444 0.35 0.35
FDR: False Discovery Rate
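The FDR column above appears to be consistent with a Benjamini-Hochberg adjustment across the 35 pathway tests, although the adjustment procedure is not named in this table; that reading is an assumption. A minimal sketch under that assumption, using the rounded P-values transcribed from the table (the function name `benjamini_hochberg` is illustrative only):

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Rounded P-values copied from Supplementary Table S3, top to bottom.
pathway_p = [
    6.3e-11, 1.8e-10, 3.8e-10, 1.7e-09, 3.4e-08, 4.7e-06, 1.9e-05,
    7.4e-05, 0.00021, 0.00031, 0.00041, 0.00080, 0.00083, 0.0020,
    0.0043, 0.0052, 0.0062, 0.0078, 0.013, 0.014, 0.015, 0.037,
    0.043, 0.055, 0.066, 0.15, 0.16, 0.16, 0.18, 0.19, 0.28, 0.29,
    0.29, 0.31, 0.35,
]

pathway_fdr = benjamini_hochberg(pathway_p)
for p, q in zip(pathway_p[:5], pathway_fdr[:5]):
    print(f"P={p:.1e}  adjusted={q:.1e}")
```

Because the inputs are already rounded to two significant figures, the adjusted values can differ from the tabulated FDR in the last digit.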
Supplementary Table S4. Pathway SLE polygenic risk scores, SLE patients positive for each pathway. The
threshold for the pathway PRS was set at the 97.5th percentile in control individuals.
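The thresholding rule in this caption can be sketched as follows. This is an illustration only: the scores are made-up numbers, the percentile is computed by linear interpolation (one of several common definitions), and treating "positive" as strictly above the threshold is an assumption; the names `percentile` and `flag_high_prs` are hypothetical.

```python
def percentile(values, q):
    """Percentile (q in 0..100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def flag_high_prs(control_scores, patient_scores, q=97.5):
    """Set the threshold in controls, then flag patients scoring above it."""
    threshold = percentile(control_scores, q)
    return threshold, [score > threshold for score in patient_scores]


# Hypothetical pathway PRS values, not data from the study.
control_scores = [float(i) for i in range(201)]
patient_scores = [120.0, 195.0, 196.0, 250.0]

threshold, positive = flag_high_prs(control_scores, patient_scores)
print(threshold, positive)
```

Setting the cut-off in controls only means that, by construction, about 2.5% of control individuals would be called positive for any given pathway.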
Supplementary Table S5. Analysed gene-sets and their genes. Only sequenced genes are listed.
Complement: C1QA, C1QB, C1QC, C1R, C1S, C2, C3, C3AR1, C4A, C4B, C4BPA, C5, C5AR1, C6, C7, C9, CD46, CD55, CD59, CFB, CFD, CFH, CFI, CR1, CR2, ITGAM, ITGAX, ITGB2, MASP1, MASP2, MBL2, SERPING1. Source: Complement subset of the KEGG Complement and coagulation cascades pathway (hsa04610).
Monogenic SLE: ACP5, ADAR, C1QA, C1QB, C1QC, C1R, C1S, C2, C3, C4A, C4B, CYBB, DDX58, DNASE1L3, FAS, FASLG, IFIH1, ISG15, PRKCD, PSMB8, PTPN11, RAG1, RAG2, TREX1. Source: Tsokos et al., 2016 (PMID:27872476).
Supplementary Table S6. Top gene regions from aggregate testing (SKAT-O). Potentially novel loci are indicated in bold and
only gene regions with FDR 0.05 or less are listed.
Columns: Gene region(a); Chr; Start bp; End bp; N variants; Non-synonymous variants(b); Conserved variants(c); Top SNV(d); FDR; P-value
TNXB 6 31908832 32177251 362 43 102 rs369580 1.7E-13 9.1E-17
PABPC4 1 39926385 40142621 132 10 45 rs879037 6.6E-06 4.3E-08
IFNK 9 27424212 27626596 231 7 58 rs895023 0.00081 1.2E-05
ATF2 2 175836878 176133034 166 4 64 rs79907460 0.0091 0.00016
SLC30A6 2 32290810 32549548 160 14 33 rs2365556 0.025 0.00048
SERPING1 11 57264927 57482426 167 14 42 rs12801093 0.04 0.0008
SRD5A2 2 31649556 31906140 48 0 3 rs28383069 0.04 0.00083
SPAST 2 32188580 32482806 190 12 31 rs2365556 0.042 0.0009
MEMO1 2 31992779 32336221 110 0 21 rs533970 0.046 0.0011
Only the top gene in the extended MHC on chr 6 is listed. Associations surviving Bonferroni correction for multiple testing (1832 gene
regions) are above the line (P<2.7E-05). a) Gene regions included ±100kb to cover targeted regulatory regions; b) missense or nonsense
variants; c) constrained variants defined as GERP RS score >2; d) Top associated SNV from the SLE case-control single variant logistic
regression analysis. Bp: base pair, SNV: single nucleotide variant, FDR: False Discovery Rate.
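The Bonferroni cut-off quoted in the footnote can be checked with a few lines of arithmetic. The sketch below assumes a family-wise alpha of 0.05, which is not stated explicitly here but reproduces the quoted threshold; the P-values are the rounded values from the table above.

```python
ALPHA = 0.05            # assumed family-wise error rate
N_GENE_REGIONS = 1832   # number of gene regions tested, from the footnote

bonferroni_threshold = ALPHA / N_GENE_REGIONS

# Rounded aggregate-test P-values from Supplementary Table S6.
gene_region_p = {
    "TNXB": 9.1e-17,
    "PABPC4": 4.3e-08,
    "IFNK": 1.2e-05,
    "ATF2": 0.00016,
    "SLC30A6": 0.00048,
    "SERPING1": 0.0008,
    "SRD5A2": 0.00083,
    "SPAST": 0.0009,
    "MEMO1": 0.0011,
}

# Keep the gene regions whose P-value falls below the corrected threshold.
surviving = [gene for gene, p in gene_region_p.items() if p < bonferroni_threshold]

print(f"threshold = {bonferroni_threshold:.1e}")
print("surviving Bonferroni correction:", surviving)
```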
Supplementary Table S7. Top gene regions from SLE case-control aggregate association testing
including functional annotation (GenePy). Gene regions with P-values surviving permutation
testing are listed.
Supplementary Table S8. Single variant SLE case-control association results in the Swedish cohort; independent loci with P<1×10⁻⁴.
CHR BP SNV Locus Minor/major allele MAF case MAF control OR FDR P
6 31708463 rs3131381 MSH5 A/C 0.248 0.113 2.82 6.2E-23 6.7E-28
6 32612110 rs9273058 HLA-DQA1 C/T 0.339 0.504 0.49 2.9E-20 1.3E-23
6 31464798 rs2534675 MICB T/C 0.449 0.292 2.02 6.0E-20 3.2E-23
7 128695983 rs13239597 IRF5/TNPO3 A/C 0.238 0.143 1.88 4.1E-11 1.1E-13
1 183542323 rs17849501 NCF2 T/C 0.108 0.053 2.33 3.3E-09 1.6E-11
2 191925424 rs3024859 STAT4 T/C 0.279 0.199 1.54 2.1E-06 1.4E-08
7 50308692 rs876037 IKZF1 A/T 0.242 0.312 0.66 3.4E-06 2.3E-08
16 11198932 rs35300161 CLEC16A T/C 0.133 0.188 0.63 0.00013 1.2E-06
2 191919354 rs11676659 STAT4 G/A 0.033 0.063 0.46 0.00018 1.6E-06
6 106594719 rs2179175 PRDM1 T/C 0.278 0.222 1.44 0.00036 3.5E-06
9 27483959 rs895023 MOB3B/IFNK G/A 0.026 0.056 0.45 0.00065 6.9E-06
12 129294244 rs11059926 SLC15A4 T/A 0.132 0.084 1.60 0.00088 9.5E-06
12 96374683 rs6538696 HAL T/C 0.533 0.467 1.34 0.0013 1.4E-05
2 30961022 rs55799526 CAPN13 G/C 0.057 0.094 0.58 0.0018 2.1E-05
4 102340309 rs10433984 BANK1 A/T 0.301 0.250 1.37 0.0026 3.0E-05
5 150442171 rs3792790 TNIP1 C/A 0.459 0.529 0.75 0.0027 3.2E-05
6 138230040 rs200820567 TNFAIP3 A/T 0.061 0.035 1.89 0.0034 4.2E-05
10 64427649 rs7075349 ZNF365/EGR2 G/A 0.311 0.372 0.75 0.0041 5.0E-05
6 15168274 rs9396569 JARID2 A/G 0.143 0.107 1.48 0.0057 7.2E-05
19 10526854 rs34953890 CDC37/TYK2 A/C 0.170 0.220 0.71 0.0070 9.2E-05
10 128935768 rs12263483 DOCK1/FAM196A G/A 0.101 0.142 0.67 0.0075 9.9E-05
Top SNP (or SNPs if independent) from each locus with at least one variant with P<1×10⁻⁴.
Loci forwarded for replication are indicated in bold.
Supplementary Table S9. Single variant SLE case-control association analysis in the Norwegian and Danish replication cohorts, and meta-analysis of the Scandinavian
cohorts.
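As a rough illustration of how per-cohort log odds ratios can be pooled under a random-effects model, the sketch below uses the DerSimonian-Laird estimator. The study itself ran the meta-analysis in PLINK; the choice of estimator, the function name `random_effects_meta` and the input numbers are assumptions made for illustration, not values from the study.

```python
import math


def random_effects_meta(log_odds_ratios, standard_errors):
    """DerSimonian-Laird pooling; returns (pooled log OR, its SE, tau^2)."""
    k = len(log_odds_ratios)
    if k < 2 or k != len(standard_errors):
        raise ValueError("need at least two studies with matching inputs")
    # Inverse-variance (fixed-effect) weights and estimate.
    weights = [1.0 / se ** 2 for se in standard_errors]
    sum_w = sum(weights)
    fixed = sum(w * b for w, b in zip(weights, log_odds_ratios)) / sum_w
    # Cochran's Q measures heterogeneity between studies.
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, log_odds_ratios))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    sum_re = sum(re_weights)
    pooled = sum(w * b for w, b in zip(re_weights, log_odds_ratios)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau2


# Hypothetical odds ratios and standard errors for three cohorts.
cohort_log_or = [math.log(1.2), math.log(1.6), math.log(2.2)]
cohort_se = [0.10, 0.10, 0.10]

pooled, pooled_se, tau2 = random_effects_meta(cohort_log_or, cohort_se)
print(f"pooled OR = {math.exp(pooled):.2f}, SE(log OR) = {pooled_se:.3f}, tau^2 = {tau2:.3f}")
```

When the cohorts agree closely, the between-study variance is estimated as zero and the result coincides with a fixed-effect analysis; with heterogeneous cohorts the pooled standard error widens.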
Supplementary Table S10. Rare non-synonymous variants in genes for monogenic SLE and lupus-like disease.
Supplementary Table S11. SLE case-only non-synonymous variants. These are missense or nonsense SNVs observed in at least one SLE patient, but
not in the control population or in external sets of control individuals of similar ancestry (SweGen and gnomAD: European non-Finnish controls;
Ameur et al. 2017, Karczewski et al. 2019).
Columns: CHR; BP; SNV; Minor allele; Major allele; Minor allele count in SLE (n=958); Minor allele frequency in SLE (n=958); Gene name; Consequence; Amino acid change
11 1266280 rs773068050 C A 5 0.0028 MUC5B missense variant p.Thr2727Pro
1 186363103 chr1:186363103 A C 4 0.0023 C1orf27 missense variant p.Gln246Lys
1 151342270 rs772030489 T G 2 0.0011 SELENBP1 missense variant p.Pro36Thr
2 27455971 rs776014297 A T 2 0.0010 CAD missense variant p.Met922Lys
2 179698928 chr2:179698928 A G 2 0.0010 CCDC141 missense variant p.Ser1522Phe
9 16431447 chr9:16431447 A G 2 0.0011 BNC2 missense variant p.His307Tyr
9 21166175 rs779242420 C T 2 0.0010 IFNA21 missense variant p.Tyr146Cys
10 75583821 chr10:75583821 T G 2 0.0011 CAMK2G missense variant p.His370Asn
12 6458353 rs775543049 A G 2 0.0010 SCNN1A stop gained p.Arg551*
12 48482728 rs750735162 C T 2 0.0010 SENP1 missense variant p.Thr79Ala
12 56350882 chr12:56350882 T G 2 0.0010 PMEL missense variant p.Pro402His
12 129190793 chr12:129190793 G C 2 0.0011 TMEM132C missense variant p.Pro1094Ala
14 23057866 chr14:23057866 T A 2 0.0010 DAD1 missense variant p.Ser66Arg
15 91030272 rs181919733 A G 2 0.0010 IQGAP1 missense variant p.Val1371Met
17 41143320 chr17:41143320 A G 2 0.0011 RUNDC1 missense variant p.Val477Ile
19 4891395 rs139019426 C T 2 0.0010 ARRDC5 missense variant p.Gln231Arg
19 18273781 rs777121279 A G 2 0.0011 PIK3R2 missense variant p.Gly372Ser
19 55240959 rs764066889 A G 2 0.0010 KIR3DL3 missense variant & splice region variant p.Gly219Asp
1 905700 rs759355675 G C 1 0.00056 PLEKHN1 missense variant p.Pro76Arg
1 1246066 chr1:1246066 T C 1 0.00055 PUSL1 stop gained & splice region variant p.Gln233*
1 3739745 rs148840465 T C 1 0.00052 CEP104 missense variant p.Gly855Glu
1 6529607 rs781318885 C G 1 0.00055 PLEKHG5 missense variant p.Pro723Ala
1 6535129 chr1:6535129 A C 1 0.00055 PLEKHG5 missense variant p.Ala173Ser
1 6589075 rs147263684 C T 1 0.00052 NOL9 missense variant p.Ile602Val
1 7798073 rs778503877 G A 1 0.00052 CAMTA1 missense variant p.Lys1238Arg
1 7812557 rs369220323 A G 1 0.00052 CAMTA1 missense variant p.Arg1641Gln
1 7837316 chr1:7837316 T G 1 0.00052 VAMP3 missense variant p.Ala57Ser
1 7879456 chr1:7879456 C T 1 0.00052 PER3 missense variant p.Ile545Thr
1 8075450 chr1:8075450 G A 1 0.00052 ERRFI1 missense variant p.Phe46Leu
1 8399719 chr1:8399719 T G 1 0.00053 SLC45A1 missense variant p.Leu681Phe
1 8424007 rs778046482 C T 1 0.00056 RERE missense variant p.Lys134Glu
1 9786994 chr1:9786994 A G 1 0.00052 PIK3CD missense variant p.Glu1033Lys
1 10195185 rs138056371 G A 1 0.00052 UBE4B missense variant p.Asn722Ser
1 10386339 rs141942131 T C 1 0.00052 KIF1B missense variant p.Thr949Met
1 10523664 chr1:10523664 A G 1 0.00054 DFFA missense variant p.Ala152Val
1 11090845 chr1:11090845 C G 1 0.00052 MASP2 stop gained p.Tyr394*
1 11169772 chr1:11169772 G C 1 0.00052 MTOR missense variant p.Val2461Leu
1 11255025 chr1:11255025 G C 1 0.00052 ANGPTL7 missense variant p.Thr329Ser
1 12336842 chr1:12336842 A T 1 0.00052 VPS13D missense variant p.Val1066Asp
1 12337649 chr1:12337649 C A 1 0.00052 VPS13D missense variant p.Glu1335Ala
1 12359263 chr1:12359263 A G 1 0.00052 VPS13D missense variant p.Arg2013Lys
1 15855646 chr1:15855646 A C 1 0.00052 DNAJC16 missense variant p.Leu16Met
1 15892647 rs61738974 A G 1 0.00052 DNAJC16 missense variant p.Glu584Lys
1 16260959 chr1:16260959 C A 1 0.00057 SPEN missense variant p.Thr2742Pro
1 17277609 rs757471392 T C 1 0.00053 CROCC missense variant p.Arg1000Cys
1 19566385 chr1:19566385 A T 1 0.00052 EMC1 missense variant p.His294Leu
1 21952859 rs768044335 C G 1 0.00052 RAP1GAP missense variant p.Gln38Glu
1 22062929 chr1:22062929 G A 1 0.00052 USP48 missense variant p.Phe109Ser
1 22078025 chr1:22078025 A G 1 0.00052 USP48 missense variant p.Thr250Ile
1 22329533 chr1:22329533 A C 1 0.00054 CELA3A missense variant p.Ser27Arg
1 22973825 chr1:22973825 G C 1 0.00056 C1QC missense variant p.Pro96Arg
1 24409181 rs148441250 A G 1 0.00052 MYOM3 missense variant p.Thr666Met
1 24484263 chr1:24484263 T C 1 0.00053 IFNLR1 missense variant p.Arg307Gln
1 25166361 chr1:25166361 C G 1 0.00052 CLIC4 missense variant p.Arg142Ser
1 26784313 chr1:26784313 T C 1 0.00052 DHDDS missense variant p.Leu192Phe
1 26870645 chr1:26870645 G C 1 0.00052 RPS6KA1 missense variant p.Pro47Arg
1 26900915 rs755547404 A G 1 0.00054 RPS6KA1 missense variant p.Cys110Tyr
1 27734810 rs200665850 T C 1 0.00052 WASF2 missense variant p.Arg457Gln
1 32098045 chr1:32098045 A G 1 0.00055 HCRTR1 missense variant p.Glu366Lys
Supplementary Table S12. Clinical data for the five Swedish SLE patients carrying the MUC5B rs773068050 missense case-only allele.
Contents:
Supplementary Figure S3. Pathway polygenic risk score (PRS) density plots for SLE patients and controls for each tested KEGG pathway. P-values represent
differences in PRS between SLE patients (SLE) and control individuals (HC); uncorrected P-values are presented (Bonferroni-corrected threshold P=0.00143).
The dashed line indicates the 97.5th percentile of the PRS in control individuals.
Supplementary Figure S4. Pathway SLE polygenic risk score (PRS) density plots for the four groups of SLE patients identified for each tested KEGG pathway.
P-values represent differences in PRS between clusters of SLE patients; uncorrected P-values are presented (Bonferroni-corrected threshold P=0.00143).
Supplementary Figure S5. Results of SLE case-control association analyses with P-values for
association plotted against chromosomal location: a) Gene-based aggregate association testing
(SKAT-O), where each point represents a gene region. Gene names are indicated for the top gene
regions. The red line represents a Bonferroni-corrected significance threshold and the black line an
FDR of 0.05. Novel loci are indicated in bold. b) Single-variant association testing, where each point
represents a SNV. The red line represents a Bonferroni-corrected significance threshold and the black
line the suggestive significance threshold (P<1×10⁻⁴). Novel loci are indicated in bold. c-e) Regional
association plots of the single-variant association results for the CAPN13, MOB3B/IFNK and HAL
regions, respectively. The colour scale indicates linkage disequilibrium (r²) between SNVs.
Supplementary Figure S6. Distribution of different classes of SNVs in SLE patients and control
individuals: a) rare non-synonymous variants, b) rare synonymous variants, c) rare constrained
variants (GERP RS score >2), d) rare non-coding variants (any of the following SnpEff annotations:
sequence feature, upstream, downstream, intergenic, TF binding site variant). P-values represent
the difference between SLE patients and control individuals; uncorrected P-values are presented
(Bonferroni-corrected threshold P=0.0125).
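The Bonferroni thresholds quoted in these captions can be reproduced with simple arithmetic. The minimal Python sketch below assumes a family-wise significance level of 0.05 (an assumption; the captions do not state it) and uses the 35 KEGG pathway variant sets described in the Methods and the four variant classes (a to d) of Supplementary Figure S6. The function name `bonferroni_threshold` is illustrative and not part of the study's analysis pipeline.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold after Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Assumption: family-wise significance level of 0.05 (not stated in the captions).
ALPHA = 0.05

# 35 KEGG pathway variant sets (Supplementary Figures S3 and S4).
pathway_threshold = bonferroni_threshold(ALPHA, 35)

# Four SNV classes, panels a) to d) (Supplementary Figure S6).
class_threshold = bonferroni_threshold(ALPHA, 4)

print(f"{pathway_threshold:.5f}")  # 0.00143
print(f"{class_threshold:.4f}")    # 0.0125
```

Under that assumption, both values agree with the thresholds given in the captions.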
Supplementary Figure S7. Genetic population structure of study individuals. a) Study samples
mapped on population reference samples. b-d) Principal components for population stratification
within the study: b) PC1 vs PC2, c) PC2 vs PC3, d) PC3 vs PC4.
Supplementary Figure S8. Average SNV genotype concordance between targeted sequencing and
genotyping on a BeadChip array (Illumina ImmunoChip). Concordance is displayed for two types of
SNVs: common SNVs (MAF≥0.05) and low-frequency SNVs (MAF<0.05).
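As an illustration of how such a concordance summary could be computed, the following Python sketch compares genotype calls from two platforms per SNV and averages the per-SNV concordance within each MAF class. It is a sketch under stated assumptions, not the study's actual procedure: genotypes are coded as 0, 1 or 2 copies of the alternative allele, missing calls are skipped, MAF is estimated from the sequencing calls, and all function names and the toy data are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Genotype = Optional[int]  # 0, 1 or 2 alternative alleles; None = missing call
MAF_CUTOFF = 0.05  # boundary between common and low-frequency SNVs, as in the caption


def minor_allele_frequency(calls: List[Genotype]) -> Optional[float]:
    """Estimate the MAF from non-missing diploid genotype calls."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return None
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def genotype_concordance(seq: List[Genotype], array: List[Genotype]) -> Optional[float]:
    """Fraction of individuals with identical calls on both platforms."""
    if len(seq) != len(array):
        raise ValueError("call lists must cover the same individuals")
    pairs = [(s, a) for s, a in zip(seq, array) if s is not None and a is not None]
    if not pairs:
        return None
    return sum(s == a for s, a in pairs) / len(pairs)


def average_concordance_by_class(
    snvs: Dict[str, Tuple[List[Genotype], List[Genotype]]]
) -> Dict[str, float]:
    """Average the per-SNV concordance within each MAF class."""
    per_class: Dict[str, List[float]] = {"common": [], "low_frequency": []}
    for seq, array in snvs.values():
        maf = minor_allele_frequency(seq)  # assumption: MAF from sequencing calls
        conc = genotype_concordance(seq, array)
        if maf is None or conc is None:
            continue
        key = "common" if maf >= MAF_CUTOFF else "low_frequency"
        per_class[key].append(conc)
    return {k: sum(v) / len(v) for k, v in per_class.items() if v}


# Hypothetical toy data: (sequencing calls, array calls) per SNV.
toy_snvs = {
    "snv_a": ([0, 1, 2, 1, 0, 1], [0, 1, 2, 1, 0, 2]),  # one discordant call
    "snv_b": ([0] * 19 + [1], [0] * 19 + [1]),          # fully concordant
}
averages = average_concordance_by_class(toy_snvs)
print(averages)
```

In the toy data, `snv_a` falls in the common class with five of six calls matching, and `snv_b` falls in the low-frequency class with all calls matching.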