CORONAVIRUS SARS_COV_II
Danilo Nori
Contents
Genomic epidemiology of novel coronavirus (hCoV-19)
Origin and continuing evolution of SARS-CoV-2
Additional methodological issue
On the origin and continuing evolution of SARS-CoV-2
Mutations in 103 SARS-CoV-2 genomes
Identification of a Novel Coronavirus in Patients with Severe Acute Respiratory Syndrome.
Isolation And Characterization Of A Novel Coronavirus
Rapid reconstruction of SARS-CoV-2 using a synthetic genomics platform.
Cells and general culture conditions.
Cultured viruses.
Bacterial and yeast strains.
Identification of leader-body junctions of viral mRNAs.
5' rapid amplification of cDNA ends (5’-RACE).
Remdesivir experiment
Two major types of SARS-CoV-2 are defined by two SNPs that show complete linkage.
The evolutionary history of L and S types of SARS-CoV-2
Amino acid replacements
Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China.
Data collection
Cytokine and chemokine measurement
Detection of coronavirus in plasma
Role of the funding source
Results
Table 3 Treatments and outcomes of patients infected with 2019-nCoV
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation.
Fig. 1 Structure of 2019-nCoV S in the prefusion conformation.
Fig. 2 Structural comparison between 2019-nCoV S and SARS-CoV S.
Fig. 3 2019-nCoV S binds human ACE2 with high affinity.
Fig. 4 Antigenicity of the 2019-nCoV RBD
Crystal structure of the 2019-nCoV spike receptor-binding domain bound with the ACE2 receptor.
The overall structure of 2019-nCoV RBD bound with ACE2.
Structural comparisons of 2019-nCoV and SARS-CoV RBDs and their binding modes to the ACE2 receptor.
X-ray Structure of Main Protease of the Novel Coronavirus SARS-CoV-2 Enables Design of α-Ketoamide Inhibitors
Chemical structures of α-ketoamide inhibitors 11r, 13a, and 13b
Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein
RESULTS
ACE2 is an entry receptor for SARS-CoV-2
SARS-CoV-2 recognizes human ACE2 with comparable affinity than SARS-CoV.
The architecture of the SARS-CoV-2 spike glycoprotein trimer
Figure 3. CryoEM structures of the SARS-CoV-2 S glycoprotein.
References.
Genomic epidemiology of novel coronavirus (hCoV-19)
This phylogeny shows evolutionary relationships of hCoV-19 viruses from the ongoing novel
coronavirus COVID-19 pandemic. All samples are still closely related, with few mutations
relative to a common ancestor, suggesting a shared common ancestor sometime in Nov-Dec 2019.
This indicates an initial human infection in Nov-Dec 2019 followed by sustained human-to-human transmission leading to sampled infections.
Site numbering and genome structure use Wuhan-Hu-1/2019 as a reference. The phylogeny is rooted
relative to early samples from Wuhan. Temporal resolution assumes a nucleotide substitution rate
of 5 × 10^-4 subs per site per year. Full details on bioinformatic processing can be found here.
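As a rough illustration of what the assumed clock rate implies, the sketch below converts 5 × 10^-4 substitutions per site per year into an expected number of substitutions per genome. The genome length of 29,903 nt (the Wuhan-Hu-1 reference) and the function name are assumptions added here; they are not stated in the text above.

```python
# Clock rate quoted in the text (substitutions per site per year).
RATE_PER_SITE_PER_YEAR = 5e-4
# Assumption: length of the Wuhan-Hu-1 reference genome in nucleotides.
GENOME_LENGTH = 29_903


def expected_substitutions(years, rate=RATE_PER_SITE_PER_YEAR, length=GENOME_LENGTH):
    """Expected substitutions accumulated along one lineage over `years`."""
    return rate * length * years


per_year = expected_substitutions(1.0)
per_month = expected_substitutions(1.0 / 12.0)
print(f"expected substitutions per genome per year:  {per_year:.1f}")
print(f"expected substitutions per genome per month: {per_month:.2f}")
```

Under these assumptions a lineage gains on the order of one substitution per month, which is consistent with the statement that all samples differ by only a few mutations.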
We gratefully acknowledge the authors, originating and submitting laboratories of the genetic sequence
and metadata made available through GISAID on which this research is based. A full listing of all
originating and submitting laboratories is available below. An attribution table is available by
clicking on "Download Data" at the bottom of the page and then clicking on "Strain Metadata" in
the resulting dialog box[1].
Origin and continuing evolution of SARS-CoV-2
This criticism concerns the claim that there are two definable “major types” of SARS-CoV-2 in this
outbreak and that they have differentiable transmission rates.
Tang et al. term these two types L and S type: “two major types (L and S types): the S type is ancestral,
and the L type evolved from S type. Intriguingly, the S and L types can be clearly defined by just two
tightly linked SNPs at positions 8,782 (orf1ab: T8517C, synonymous) and 28,144 (ORF8: C251T,
S84L).”
One nonsynonymous mutation, which has not been assessed for functional significance, is not sufficient
to define a distinct “type” or “major type”. As of 2nd March 2020, 111 nonsynonymous mutations have
been identified in the outbreak; these have been cataloged in the CoV-GLUE resource and can
be visualized in Figure 1. At present, there is no evidence that any of these 111 mutations has any
functional significance for within-host infection or transmission rates. Additionally, when
you choose to define “types” purely based on two mutations, it is not intriguing that these “types” then
differ by those two mutations[2].
Figure 1. A visualization of the 111 nonsynonymous mutations (red) observed to date in the COVID-19
outbreak by plotting a grid of mutations where each column is a sample and each row is one of the
observed mutations in the phylogeny. The columns are ordered by the position of each sample in the
phylogeny. Synonymous mutations are shown in yellow. The C251T (nonsynonymous) and T8517C
(synonymous) mutations are visible on the right side of the plot.
However, they further claim that these two types have different transmission rates: “Thus far, we found
that, although the L type is derived from the S type, L (~70%) is more prevalent than S (~30%) among the
sequenced SARS-CoV-2 genomes we examined. This pattern suggests that L has a higher transmission
rate than the S type.” The abstract of the paper goes even further, stating outright that: “the S type, which
is evolutionarily older and less aggressive…”
It is, however, important to appreciate that finding a majority of samples with a particular mutation is not
evidence that viruses with that mutation transmit more readily. To make this claim would, at very
minimum, require a comparison to be made to expectations under a null distribution assuming equal
transmission rates. As this has not been performed by the authors, we believe there is insufficient
evidence to make this suggestion, and therefore it is incorrect (and irresponsible) to state that there is any
difference in transmission rates. Differences in the observed numbers of samples with and without this
mutation are far more likely to be due to stochastic epidemiological effects.
Basic evolutionary theory predicts that selectively neutral mutations change in frequency over time
through the process of genetic drift. In a viral outbreak, each transmission event from one infected person
to another is a random probabilistic event, with some infected individuals transmitting more or less often
than others. People may transmit at higher rates than others for a variety of reasons, e.g. because they
cough onto their palms and use overcrowded public transport, or just because their friends and coworkers
got lucky (or unlucky!). These small-scale epidemiological phenomena add up over time to create
substantial variation in the frequencies of mutations observed during an outbreak.
Additionally, when a virus spreads to a new area/country that was previously uninfected, a founder effect
can occur. As a small number of virus copies rapidly spread into an epidemic, any mutations in the initial
viral infections will rapidly become very common, even if they were initially rare in the country that
seeded the transmission. This is particularly likely to be the case in an outbreak caused by a novel virus
such as COVID-19 as there are a large number of susceptible hosts for the virus. These founder effects
have been observed in previous studies of viral outbreaks (e.g., Foley et al. 2004; Rai et al. 2010;
Tsetsarkin et al. 2011). Combined, these factors mean that the frequency of a particular mutation in and of
itself is not suggestive of any functional significance. Evidence from the widespread media uptake (35
articles at last count), and many comments on social media in response to this article, suggests that the
unsupported claims made by Tang et al. have already spread undue fear.
It’s also important to appreciate that the smaller the population of viruses is, the more these small-scale
variations are likely to affect the frequency of mutations (in the same way that the more coins you flip, the
closer to the 0.5 heads average you expect to be). Given that this mutation appears to have occurred very
early on in the outbreak when fewer individuals were infected, its frequency will very likely have been
particularly influenced by genetic drift.
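The coin-flip argument can be sketched with a small simulation. The code below is an illustration only, not an analysis from the critique: it assumes a simple Wright-Fisher style model in which every infection in one generation is seeded by a randomly chosen infection from the previous one, with no difference in transmission rate between variants. The population sizes, starting frequency, number of generations, and seed are all hypothetical.

```python
import random
from statistics import pvariance


def simulate_final_frequency(pop_size, start_freq, generations, rng):
    """Resample a neutral mutation for a number of transmission generations."""
    freq = start_freq
    for _ in range(generations):
        # Each new infection inherits the mutation with probability `freq`.
        carriers = sum(rng.random() < freq for _ in range(pop_size))
        freq = carriers / pop_size
    return freq


def frequency_spread(pop_size, replicates=200, start_freq=0.3, generations=10, seed=1):
    """Variance of the final mutation frequency across replicate outbreaks."""
    rng = random.Random(seed)
    finals = [
        simulate_final_frequency(pop_size, start_freq, generations, rng)
        for _ in range(replicates)
    ]
    return pvariance(finals)


small = frequency_spread(pop_size=20)
large = frequency_spread(pop_size=500)
print(f"variance of final frequency, 20 infections per generation:  {small:.4f}")
print(f"variance of final frequency, 500 infections per generation: {large:.4f}")
```

Because both variants transmit equally in this model, any spread in the final frequency is drift alone, and it is much larger when few individuals are infected.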
Tang et al. compare the frequencies of nonsynonymous and synonymous mutations in the data, claiming
that there is significant evidence of selection suppressing the frequency of nonsynonymous mutations in
the outbreak. This analysis is flawed on three grounds:
(1) The numbers in this figure do not make sense. According to the presented data, seven (synonymous)
mutations have a derived frequency of >50%, and two of these mutations have derived frequencies
greater than 95% in the population. A cursory glance at the tree (Figure 2; taken from Nextstrain)
shows that this cannot be true. “Derived” in this context should mean since the last common ancestor of
the outbreak. For two mutations to have derived frequencies greater than 95%, there would need to be a
small number of samples which branch as a sister lineage to the rest of the outbreak tree. However, this is
not the case.
Figure 2. A screenshot of the SARS-CoV-2 time tree phylogeny from Nextstrain. Colors indicate the
geographic location of the sample. The date of sampling is shown below the tree.
The only way Tang et al. can get the results they present is by defining the ancestral state as being at some
point way back in the bat coronavirus tree before the outbreak began.
They then estimate the ancestral state for each mutation independently, ignoring the very informative tree
of the current outbreak. This method only makes sense when using a much more closely related outgroup
species to infer the ancestral states of mutations in a freely recombining species with unlinked mutations
of independent ancestry. In contrast, the most recent common ancestor of SARS-CoV-2 and the nearest
bat sarbecovirus existed many decades ago. Additionally, such methods should incorporate the inherent
uncertainty in inferring the ancestral state (e.g., est-sfs; Keightley and Jackson 2018), which Tang et al.’s
implementation does not.
Implementing this method of inferring ancestral states in a viral context, where we assume there is no
recombination, means that “high frequency derived mutations” are just new mutations in the outbreak that
have mutated back to the inferred ancestral state (in bats). This is a completely meaningless definition of
“derived”. These high-frequency derived mutations should instead be classed as low-frequency derived
mutations.
Tang et al. claim that 16.3% (7 out of 43) of synonymous mutations have a derived frequency >0.5. However,
given the levels of synonymous divergence, and remembering that mutation probabilities are biased,
which increases the likelihood of back-mutations, this 16.3% figure is broadly in line with the expected
proportion of synonymous mutations that would back-mutate to the nucleotide found in bat-infecting
strains. Because nonsynonymous sites are much less diverged (<4%) than synonymous sites (19%) from the
most closely related bat sequence, new nonsynonymous mutations are much more likely to be away from
the inferred ancestral state in bats than new synonymous mutations are. Therefore, using this flawed
definition of “derived”, a much smaller proportion of nonsynonymous mutations are expected to be high
frequency “derived” mutations without any action of natural selection at all.
(2) The way this data has been presented in Tang et al.’s Figure 2 will falsely suggest that purifying
selection is acting even if their methodology was sensible, and there was no such selection. The height of
the bars in their figure compares the raw numbers of mutations at each frequency without scaling the
heights of the bars for the number of each class of mutation. Because there is a greater number of
nonsynonymous polymorphisms than synonymous polymorphisms in the population, and as most
mutations are expected to be at low frequency (regardless of the action of natural selection), this
presentation will always make it look like there are proportionately more low-frequency nonsynonymous
mutations.
(3) When interpreting their results, Tang et al. do not consider that sequencing error could be a driver of a
relative excess of singleton nonsynonymous mutations. This possibility is important because sequencing
errors will be at low frequency as they are rare and cannot be transmitted, but real mutations can be at any
frequency because they can be transmitted. Additionally, purifying selection can only act on real
mutations and not sequencing errors. Therefore sequencing error may have a higher nonsynonymous to
synonymous ratio, and these mutations will be at low frequency, which will mimic the action of purifying
selection suppressing the frequency of nonsynonymous mutations.
Taken together, Tang et al.’s analysis tells us absolutely nothing about purifying selection within the viral
outbreak. We have performed an additional analysis below to test for signatures of purifying selection in
the SARS-CoV-2 outbreak.
Additional methodological issue
The authors used the software PAML (Yang et al. 2007) to estimate selection parameters. PAML does
not allow for synonymous rate variation, but they explicitly state in the paper that they believe there are
mutational hotspots. Recent work has shown that false-positive rates of positive selection inference are
unacceptably high when such synonymous rate variation occurs (Wisotsky et al. 2020). Therefore, if
there truly is synonymous rate variation, reliably identifying signatures of positive selection within the
phylogeny of SARS-CoV-2 requires methods that model mutation rate variation (e.g., those provided
by many of the models in the HyPhy package).
Our additional analysis
To test for potential purifying selection simply and robustly, the number of observed synonymous and
nonsynonymous mutations was compared to the null expectation by comparing the relative number of
synonymous and nonsynonymous sites.
The relative number of sites was estimated using the Goldman and Yang (1994) codon model. This
model estimates mutation probabilities between all 61 possible coding codons using the observed
frequencies of each of the 61 codons weighted by the transition to transversion ratio estimated from
the data (2.9). It estimates there are 2.43 times more nonsynonymous than synonymous sites in the
SARS-CoV-2 genome.
This null expectation under no selection was compared to that observed from the outbreak data using a
chi-squared test on the table below. This yielded a non-significant P-value of 0.113. This result is not
unexpected, as the current rapid growth rate of the viral population is likely to allow viruses with unfit
mutations, as well as viruses with neutral mutations, to be transmitted. However, we urge caution in
over-analyzing these results, as statistical power is limited until more sequencing data accumulates.
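One way such a comparison could be set up is sketched below as a chi-squared goodness-of-fit test with one degree of freedom. Only the 2.43 site ratio comes from the text; the mutation counts are hypothetical placeholders, since the table of observed counts is not reproduced here, and the exact test layout used by the authors may differ.

```python
import math

# Ratio of nonsynonymous to synonymous sites quoted in the text.
NONSYN_TO_SYN_SITE_RATIO = 2.43


def chi_squared_neutrality(n_syn, n_nonsyn, site_ratio=NONSYN_TO_SYN_SITE_RATIO):
    """Compare observed mutation counts with the no-selection expectation.

    Under the null, mutations fall into each class in proportion to the
    number of sites of that class (1 : site_ratio).
    """
    total = n_syn + n_nonsyn
    expected_syn = total / (1.0 + site_ratio)
    expected_nonsyn = total - expected_syn
    statistic = (
        (n_syn - expected_syn) ** 2 / expected_syn
        + (n_nonsyn - expected_nonsyn) ** 2 / expected_nonsyn
    )
    # For one degree of freedom the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts for illustration only; NOT the values from the study.
stat, p = chi_squared_neutrality(n_syn=40, n_nonsyn=80)
print(f"chi-squared = {stat:.3f}, P = {p:.3f}")
```

A deficit of nonsynonymous mutations relative to the 2.43 : 1 expectation would point towards purifying selection, but only if the P-value is small enough given the limited number of mutations.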
On the origin and continuing evolution of SARS-CoV-2
Mutations in 103 SARS-CoV-2 genomes
We downloaded 103 publicly available SARS-CoV-2 genomes, aligned the sequences, and
identified the genetic variants. For ease of visualization, we marked each virus strain based on
the location and date the virus was isolated with the format of “Location_Date” throughout
this study (see Table S1 for details; each ID did not contain information on the patient's race
or ethnicity). Although SARS-CoV-2 is an RNA virus, for simplicity, we presented our
results based on DNA sequencing results throughout this study (i.e., the nucleotide T
(thymine) means U (uracil) in SARS-CoV-2). For each variant, the ancestral state was
inferred based on the genome and CDS alignments of SARS-CoV-2 (NC_045512), RaTG13,
and GD Pangolin-CoV (Materials and Methods). In total, we identified mutations in 149 sites
across the 103 sequenced strains. Ancestral states for 43 synonymous, 83 non-synonymous,
and two stop-gain mutations were unambiguously inferred. The frequency spectra of
synonymous and nonsynonymous mutations.
Most derived mutations were singletons (67.4% (29/43) of synonymous mutations and 84.3%
(70/83) of nonsynonymous mutations), indicating either a recent origin [30] or population
growth [31]. In general, the derived alleles of synonymous mutations were significantly
skewed towards higher frequencies than those of nonsynonymous ones (P < 0.01, Wilcoxon
rank-sum test; Fig. 2), suggesting the nonsynonymous mutations tended to be selected against.
However, 16.3% (7 out of 43) of synonymous mutations and one nonsynonymous mutation (ORF8
L84S, position 28,144) had a derived frequency of ≥ 70% across the SARS-CoV-2 strains.
The nonsynonymous mutations that had derived alleles in at least two SARS-CoV-2 strains
affected six proteins: orf1ab (A117T, I1607V, L3606F, I6075T), S (H49Y, V367F), ORF3a
(G251V), ORF7a (P34S), ORF8 (V62L, S84L), and N (S194L, S202N, P344S).
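The percentages quoted above can be checked directly from the counts given in the text. The short sketch below does that arithmetic; the dictionary layout and the helper name are illustrative choices, and only the counts themselves come from the passage.

```python
# Counts taken from the passage above.
counts = {
    "synonymous": {"total": 43, "singletons": 29},
    "nonsynonymous": {"total": 83, "singletons": 70},
}


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


for label, c in counts.items():
    share = percent(c["singletons"], c["total"])
    print(f"{label}: {c['singletons']}/{c['total']} singletons = {share}%")

# Share of synonymous mutations reported at high derived frequency.
print(f"high-frequency synonymous: 7/43 = {percent(7, 43)}%")
```

Expressing each class as a proportion of its own total, rather than as raw counts, keeps the comparison fair when the two classes contain different numbers of mutations.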
Identification of a Novel Coronavirus in Patients with Severe Acute
Respiratory Syndrome.
A large number of tests for known respiratory pathogens were performed with specimens from all three
patients in Frankfurt. The test results were negative, except as follows.
Paramyxovirus-like particles were seen in throat swabs and sputum samples from the index patient by
electron microscopy. The particles were scarce. However, several PCR tests specific for virus species of
the family Paramyxoviridae were negative (including tests for human metapneumovirus), as were PCR
assays based on primers designed to react broadly with all members of that family.
Isolation And Characterization Of A Novel Coronavirus
After six days of incubation (on March 21), a cytopathic effect was seen on Vero-cell cultures inoculated
with sputum obtained from the index patient on day 7. Twenty-four hours after a single passage, nucleic
acids were purified from the supernatant. Random amplification was performed with 15 different PCRs
under low-stringency conditions. We had previously shown that this method can detect unknown
pathogens growing in cell culture (unpublished data). To detect RNA viruses, an initial reverse-transcription step was included.
Illustration 1. Genetic Characterization of the Novel Coronavirus.
About 20 distinct DNA fragments were obtained and sequenced. The resulting sequences were subjected
to BLAST database searches. Most of the fragments matched human chromosome sequences, indicating
that the genetic material of the cultured cells had been amplified (Vero cells are derived from monkeys).
Three of the fragments did not match any nucleotide sequence in the database. However, when a
translated BLAST search was performed (comparison of the amino acid translation in all six possible
reading frames with the database), these fragments showed homology to coronavirus amino acid
sequences, indicating that a coronavirus had been isolated. Two of the fragments were 300 nucleotides in
length and identical in sequence, and the third fragment was 90 nucleotides in length (sequences BNI-1
and BNI-2, respectively, as reported on the Web site of the WHO network on March 25).
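The idea behind a translated search, translating a nucleotide fragment in all six reading frames before comparing it with protein sequences, can be sketched as follows. This is a minimal illustration that assumes the standard genetic code; it is not the BLAST implementation, the function names are made up for this example, and the input sequence is a toy string rather than one of the BNI fragments.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna):
    """Return the three forward and three reverse-strand translations."""
    dna = dna.upper().replace("U", "T")
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse)):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Toy input for illustration only.
for name, protein in six_frame_translations("ATGGCCTAA").items():
    print(name, protein)
```

Comparing at the amino acid level can reveal distant relatives that a nucleotide search misses, because synonymous changes alter the nucleotide sequence without altering the encoded protein.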
Detailed sequence analysis revealed that both fragments were located in the open reading frame 1b of
coronaviruses and did not overlap with a 400-nucleotide coronavirus fragment identified by colleagues at
the Centers for Disease Control and Prevention (CDC) (sequence CDC, reported on the Web site of the
WHO network on March 24).
Rapid reconstruction of SARS-CoV-2 using a synthetic genomics platform.
The emergence of a novel CoV in China at the end of 2019 prompted
us to test the applicability of our synthetic genomics platform to reconstruct the virus based on the
genome sequences released on January 10-11, 2020. We divided the genome into 12 overlapping DNA
fragments. In parallel, we aimed to generate a SARS-CoV-2 expressing GFP that could be valuable to
facilitate the screening of antiviral compounds and to establish diagnostic assays (e.g. virus neutralization
assay).
Fourteen synthetic DNA fragments were ordered as sequence-confirmed plasmids and all but fragments 5
and 7 were delivered. Since we received at the same time SARS-CoV-2 viral RNA from an isolate of a
Munich patient (BetaCoV/Germany/BavPat1/2020), we amplified the regions of fragments 5 and 7 by
RT-PCR (Supplementary Table 1). TAR cloning was immediately initiated, and for all six SARS-CoV-2/SARS-CoV-2-GFP constructs we obtained correctly assembled molecular clones. Since sequence verification
was not possible within this short time frame, we randomly selected two clones for each construct,
isolated the YAC DNA, and performed in vitro transcription. The resulting RNAs were electroporated
together with an mRNA encoding the SARS-CoV-2 N protein into BHK-21 and, in parallel, into BHK-SARS-N cells expressing the SARS-CoV N protein (ref. 19). Electroporated cells were seeded over VeroE6 cells
and two days later we observed green fluorescent signals in cells that received the GFP-encoding
SARS-CoV-2 RNAs. Indeed, we could rescue infectious viruses for almost all rSARS-CoV-2 and
rSARS-CoV-2-GFP clones.[3]
As shown in Figure 3b for rSARS-CoV-2 clones 1.1, 2.2, and 3.1, plaques were readily detectable,
demonstrating that infectious virus had been recovered irrespective of the 5’-termini. Sequencing of
the YACs and corresponding rescued viruses revealed that almost all DNA clones and viruses contained
the correct sequence, except for some individual clones carrying mutations within fragments 5 and 7,
likely introduced by RT-PCR. Nevertheless, we obtained at least one correct YAC clone for all constructs
except for construct 6. To correct this, we re-assembled construct 6 by replacing the RT-PCR-generated
fragments 5 and 7 with 4 and 3 shorter synthetic dsDNA fragments, respectively. The resulting molecular
clone was used to rescue the synSARS-CoV-2-GFP virus without any mutations exclusively from
chemically synthesized DNA. Next, we assessed the 5’-end of the recombinant viruses and the Munich
virus isolate and confirmed the published 5’-end sequence of SARS-CoV-2 (5’-AUUAAAGG; Genbank
MN996528.3). Full-length sequencing of the viral genomes and 5’-RACE analysis of each recombinant
virus confirmed the identity of each virus and showed that each virus 5’-end variant retained the cloned 5’-terminus. This demonstrates that the 5’-ends of SARS-CoV and bat SARS-related CoVs ZXC21 and
ZC45 are compatible with the replication machinery of SARS-CoV-2. Sequencing results also revealed
the identity of leader-body junctions of SARS-CoV-2 subgenomic mRNAs, which are identical to those
of SARS-CoV. We also analyzed rSARS-CoV-2 clone 3.1 for protein expression and demonstrated the
presence of SARS-CoV-2 nucleocapsid protein in dsRNA-positive cells. Replication kinetics of rSARS-CoV-2 clone 3.1 containing the authentic 5’-terminus were indistinguishable from those of the
SARS-CoV-2 isolate, while clones 1.1 and 2.2 showed slightly reduced replication. All rSARS-CoV-2-GFP
clones and synSARS-CoV-2-GFP displayed similar growth kinetics but were significantly reduced
compared to the SARS-CoV-2 isolate, suggesting that the insertion of GFP and/or the partial deletion of
ORF7a affects replication. Despite the reduced replication, green fluorescence was readily detectable and
we demonstrated the utility of synSARS-CoV-2-GFP for antiviral drug screening by testing remdesivir, a
promising compound for COVID-19 treatment. Similarly, the simple readout of green fluorescence greatly
facilitates the demonstration of virus neutralisation with human serum.
Reconstruction, rescue, and characterization of rSARS-CoV-2, rSARS-CoV-2-GFP, and synSARS-CoV-2-GFP. (a) Schematic representation of the SARS-CoV-2 genome organization and DNA fragments used to
clone rSARS-CoV-2, rSARS-CoV-2-GFP, and synSARS-CoV-2-GFP. Inserts show synthetic sub-fragments comprising fragments 5 (A-D) and 7 (Aa, Ab, B), and the fragments used to insert the GFP
gene (fragments 13-15). kb, kilobase.
Cells and general culture conditions.
Vero, VeroB4 and VeroE6 cells were cultured in Dulbecco’s modified Eagle’s medium (DMEM); BHK-21, BHK-MHV-N (BHK-21 cells expressing MHV-A59 N protein), BHK-SARS-N (BHK-21 cells
expressing SARS N protein) (ref. 19), Huh-7, L929, and murine 17Cl-1 cells (ref. 23) were grown in minimal
essential medium (MEM). Both types of media were supplemented with 10% fetal bovine serum, 1X non-essential amino acids, 100 units/ml penicillin, and 100 µg/ml streptomycin. BHK-SARS-N cells were
grown using MEM supplemented with 5% fetal bovine serum, 1X non-essential amino acids, 100
units/ml penicillin, 100 µg/ml streptomycin, 500 µg/ml G418 and 10 µg/ml puromycin. Twenty-four
hours before electroporation, BHK-MHV-N and BHK-SARS-N cells were treated with 1 µg/ml doxycycline.
All cells were maintained at 37 °C in a 5% CO2 atmosphere.
Cultured viruses.
MHV-GFP and HCoV-229E were cultured in murine 17Cl-1 and Huh-7 cells, respectively. MERS-CoV-EMC (ref. 24) was cultured in VeroB4 cells. HCoV-HKU1 strain Caen-1 (GenBank: NC_006577) was cultured
on human airway epithelial cultures. Zika virus strain PRVABC-59 (GenBank: KX377337) was kindly
provided by Marco Alves (Institute of Virology and Immunology) and cultured on Vero cells. SARS-CoV-2 (SARS-CoV-2/München-1.1/2020/929) was cultured on VeroE6 cells.
Bacterial and yeast strains.
Escherichia coli DH5α (Thermo Scientific™) and TransforMax™ Epi300™ (Epicentre) were used to
propagate the pVC604 and pCC1BAC-His3 TAR vectors, respectively. The bacteria were grown in
lysogeny broth (LB) supplemented with appropriate antibiotics at 37 °C overnight. E. coli Epi300™ cells
harboring different SARS-CoV-2 synthetic fragments on pUC57/pUC57mini were grown at 30 °C to
lower instability/toxicity risks. Saccharomyces cerevisiae VL6-48N (MATα trp1-Δ1 ura3-Δ1 ade2-101
his3-Δ200 lys2 met14 cir°) was used for all yeast transformation experiments (ref. 26). Yeast cells were first
grown in YPDA broth (Takara Bio), and transformed cells were plated on minimal synthetic defined (SD)
agar without histidine (SD-His) (Takara Bio). S. cerevisiae VL6-48N-derived clones carrying different
YACs were never streaked out together on the same agar dishes since mating-type switching and resulting
recombination might occur at a very low frequency.
Identification of leader-body junctions of viral mRNAs.
To identify reads that mapped discontinuously to the SARS-CoV-2 genome and determine the location of
potential transcription regulatory sites (TRS), we pooled reads that mapped to the viral genome as well as
unmapped reads and searched for the sequence TTCTCTAAACGAAC (nucleotides 62 to 75 of
MT108784; the leader TRS is the 3’-terminal ACGAAC). We then filtered for reads that had at least 18 nucleotides 3’ of the
aforementioned sequence and evaluated whether these reads were compatible with any of the SARS-CoV-2 mRNA sequences. Reads matching these criteria were used as input for the generation of a
consensus sequence for each TRS site and analyzed using a combination of SAMtools (version 1.10), R,
and the Integrative Genomics Viewer (IGV). Mapped read depth was also calculated for the
discontinuously mapped reads as explained in the previous section.
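The read-filtering step described above can be sketched in a few lines. This is a simplified illustration, not the pipeline used in the study: it only searches the forward orientation of each read, the function names are hypothetical, and the example reads are synthetic.

```python
# Leader sequence and threshold quoted in the text.
LEADER_ANCHOR = "TTCTCTAAACGAAC"
MIN_DOWNSTREAM = 18


def downstream_of_leader(read, anchor=LEADER_ANCHOR, min_downstream=MIN_DOWNSTREAM):
    """Return the sequence 3' of the leader anchor, or None if the read fails.

    A read passes when it contains the anchor and at least `min_downstream`
    nucleotides follow it.
    """
    start = read.find(anchor)
    if start == -1:
        return None
    body = read[start + len(anchor):]
    return body if len(body) >= min_downstream else None


def select_junction_reads(reads):
    """Keep (read, downstream sequence) pairs that satisfy the filter."""
    kept = []
    for read in reads:
        body = downstream_of_leader(read.upper())
        if body is not None:
            kept.append((read, body))
    return kept


# Synthetic reads for illustration only.
example_reads = [
    "GG" + LEADER_ANCHOR + "A" * 18,   # passes: 18 nt after the anchor
    LEADER_ANCHOR + "A" * 17,          # fails: only 17 nt after the anchor
    "ACGT" * 10,                       # fails: no anchor
]
print(len(select_junction_reads(example_reads)), "read(s) kept")
```

The downstream sequences returned here are what would then be compared against the candidate subgenomic mRNA bodies to locate each leader-body junction.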
5' rapid amplification of cDNA ends (5’-RACE).
Recombinant SARS-CoV-2(-GFP) poly(A)-purified RNA used for NGS was also used to determine the
genome 5’-ends by 5’-RACE. M-MLV Reverse transcription (Promega) was performed according to the
manufacturer's instructions using the gene-specific primer pWhSF-ORF1a-R18-655 (Supplementary
Table 1) and RNase Inhibitor RNasin plus (Promega) 10U per 25 µl reaction volume. Following the
reverse transcription, 1 µl RNase H (5U/µl, New England Biolabs) per 25 µl reaction was added, and the
mixture was incubated at 37 °C for 20 min. The cDNA was immediately purified with the High Pure PCR
Product Purification Kit (Roche) according to the manufacturer's instructions. A poly(A) tail was added
to the cDNA with Terminal Transferase (New England Biolabs) according to the manufacturer's
instructions. Subsequently, a PCR reaction with the tailed cDNA was performed with the primer pair
pWhSF-ORF1a-R18-655/TagRACE_dT16 (Supplementary Table 1) using the HotStarTaq Master Mix
(Qiagen) according to the manufacturer's instructions with a touchdown cycling protocol: 95 °C for 15
min; 15 cycles of 94 °C for 30 sec, 65 °C touch down to 50 °C for 1 min, 72 °C for 1 min; 25 cycles of
94 °C for 30 sec, 50 °C for 1 min, 72 °C for 1 min. Subsequently, 1µl of this reaction was used for a
nested pre-amplification with the primer pair pWhSF-5utr-R17-273/TagRACE (Supplementary Table 1)
in a final volume of 50 µl following the same cycling protocol as described above. The PCR fragment
was purified using the NucleoSpin™ Gel and PCR Clean-up Kit (Macherey-Nagel) according to the
manufacturer's instructions, and the purified PCR fragment was sent to Microsynth AG (Switzerland) for
Sanger sequencing with the primer pWhSF-5utr-R17-273 (Supplementary Table 1). Sequencing raw data
were assessed using the SeqMan™ II sequence analysis software (DNASTAR Inc., Madison, USA).
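The touchdown phase of the cycling protocol can be written out as a per-cycle annealing schedule. The sketch below assumes the temperature drops in equal steps across the 15 touchdown cycles (1 °C per cycle); the text gives only the start and end temperatures, so the step size is an assumption, as is the function name.

```python
def touchdown_annealing_temps(start_c=65.0, end_c=50.0, cycles=15):
    """Annealing temperature (°C) for each touchdown cycle.

    Assumption: equal decrements per cycle, so that the cycle following the
    touchdown phase anneals at `end_c`.
    """
    step = (start_c - end_c) / cycles
    return [start_c - i * step for i in range(cycles)]


temps = touchdown_annealing_temps()
# Under this assumption the touchdown cycles anneal at 65, 64, ..., 51 °C,
# and the 25 subsequent cycles anneal at the constant 50 °C given in the text.
```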
Remdesivir experiment
Remdesivir (MedChemExpress) was dissolved in DMSO and stored at -80 °C in 20 mM stock aliquots.
One day before the experiment, VeroE6 cells were seeded in 24-well plates at a density of 8 × 10⁴ cells
per well. Cells were infected with synSARS-CoV-2-GFP (passage 1) at MOI = 0.01, or mock-infected as
control. Inocula were removed at 1 h.p.i. and replaced with medium containing Remdesivir at a
concentration of 0.2 µM or 2 µM, or the equivalent amount of DMSO. At 48 h.p.i., cells were washed once
with PBS and incubated in fresh PBS. Images were acquired using an EVOS fluorescence microscope
equipped with a 10x air objective. Brightness and contrast were adjusted identically for each condition
and their corresponding control using FIJI.
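The dilution from the 20 mM stock to the working concentrations follows from C1·V1 = C2·V2. The sketch below is illustrative only: the 10 mL final volume and the function names are assumptions, and in practice an intermediate dilution would probably be used rather than pipetting sub-microlitre volumes.

```python
def dilution_factor(stock_molar, target_molar):
    """Fold dilution needed to go from stock to target concentration."""
    return stock_molar / target_molar


def stock_volume_ul(stock_molar, target_molar, final_volume_ml):
    """Volume of stock (µL) needed for `final_volume_ml` of medium (C1*V1 = C2*V2)."""
    return target_molar * final_volume_ml * 1000.0 / stock_molar


STOCK = 20e-3           # 20 mM stock in DMSO (from the text)
TARGETS = [0.2e-6, 2e-6]  # 0.2 µM and 2 µM working concentrations (from the text)
FINAL_VOLUME_ML = 10.0  # hypothetical batch volume of medium

for target in TARGETS:
    factor = dilution_factor(STOCK, target)
    volume = stock_volume_ul(STOCK, target, FINAL_VOLUME_ML)
    # The DMSO fraction in the medium is 1/factor, which is what the
    # vehicle control ("equivalent amount of DMSO") has to match.
    print(f"{target * 1e6:g} µM: {factor:,.0f}-fold dilution, "
          f"{volume:.2f} µL stock, DMSO fraction {1 / factor:.6f}")
```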
Two major types of SARS-CoV-2 are defined by two SNPs that show complete
linkage.
To detect possible recombination among SARS-CoV-2 viruses, we used Haploview [32] to
analyze and visualize the patterns of linkage disequilibrium (LD) between variants with minor
alleles in at least two SARS-CoV-2 strains (Fig. 3A). Since most mutations were at very low
frequencies, it is not surprising that many pairs had a very low r² or LOD value (Fig. 3B-C).
Consistent with another recent report [31], we did not find evidence of recombination
between the SARS-CoV-2 strains.
However, we found that SNPs at location 8,782 (orf1ab: T8517C, synonymous) and 28,144
(ORF8: C251T, S84L) showed significant linkage, with an r² value of 0.954 (Fig. 3B, red)
and a LOD value of 50.13 (Fig. 3C, red). Among the 103 SARS-CoV-2 virus strains, 101
exhibited complete linkage between the two SNPs: 72 strains exhibited a “CT”
haplotype (defined as “L” type because T28,144 is in the codon of Leucine) and 29 strains
exhibited a “TC” haplotype (defined as “S” type because C28,144 is in the codon of Serine)
at these two sites. Thus, we categorized the SARS-CoV-2 viruses into two major types, with
L being the major type (~70%) and S being the minor type (~30%).
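For reference, the r² statistic between two biallelic sites can be computed directly from haplotype counts. The sketch below is not Haploview's implementation; the function name and input format are assumptions. It is applied only to the 101 strains whose haplotypes are stated above. The haplotypes of the remaining two strains are not given in this section, so the reported value of 0.954 is not reproduced here.

```python
def ld_r_squared(hap_counts):
    """r² between two biallelic sites.

    `hap_counts` maps two-character haplotypes (allele at site 1, allele at
    site 2), e.g. "CT", to the number of genomes carrying that haplotype.
    """
    n = sum(hap_counts.values())
    alleles1 = sorted({h[0] for h in hap_counts})
    alleles2 = sorted({h[1] for h in hap_counts})
    if len(alleles1) != 2 or len(alleles2) != 2:
        raise ValueError("both sites must be biallelic")
    a, b = alleles1[0], alleles2[0]  # reference alleles chosen arbitrarily
    p_a = sum(c for h, c in hap_counts.items() if h[0] == a) / n
    p_b = sum(c for h, c in hap_counts.items() if h[1] == b) / n
    p_ab = hap_counts.get(a + b, 0) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# The 101 strains with stated haplotypes: 72 "CT" (L type) and 29 "TC" (S type).
r2_linked = ld_r_squared({"CT": 72, "TC": 29})
```

With only the two stated haplotypes present, the sites are perfectly correlated and r² evaluates to 1 (up to floating-point rounding); any discordant genomes lower it.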
The evolutionary history of L and S types of SARS-CoV-2
Although we defined the L and S types based on two tightly linked SNPs, strikingly,
the separation between the L (blue) and S (red) types was maintained when we reconstructed the
haplotype networks using all the SNPs in the SARS-CoV-2 genomes (Fig. 4A; the number of
mutations between two neighboring haplotypes was inferred parsimoniously). This analysis
further supports the idea that the two linked SNPs at sites 8,782 and 28,144 adequately define
the L and S types of SARS-CoV-2[4].
Amino acid replacements
[5]
Clinical features of patients infected with 2019 novel coronavirus in Wuhan,
China.
Initial investigations included a complete blood count, coagulation profile, and serum biochemical test
(including renal and liver function, creatine kinase, lactate dehydrogenase, and electrolytes). Respiratory
specimens, including nasal and pharyngeal swabs, bronchoalveolar lavage fluid, sputum, or bronchial
aspirates were tested for common viruses, including influenza, avian influenza, respiratory syncytial
virus, adenovirus, parainfluenza virus, SARS-CoV and MERS-CoV using real-time RT-PCR assays
approved by the China Food and Drug Administration. Routine bacterial and fungal examinations were
also performed.
Given the emergence of the 2019-nCoV pneumonia cases during the influenza season, antibiotics (orally
and intravenously) and oseltamivir (orally 75 mg twice daily) were empirically administered.
Corticosteroid therapy (methylprednisolone 40–120 mg per day) was given as a combined regimen if
severe community-acquired pneumonia was diagnosed by physicians at the designated hospital. Oxygen
support (eg, nasal cannula and invasive mechanical ventilation) was administered to patients according to
the severity of hypoxaemia. Repeated tests for 2019-nCoV were done in patients confirmed to have 2019-nCoV infection to show viral clearance before hospital discharge or discontinuation of isolation.
Data collection
We reviewed clinical charts, nursing records, laboratory findings, and chest x-rays for all patients with
laboratory-confirmed 2019-nCoV infection who were reported by the local health authority. The
admission dates of these patients ranged from Dec 16, 2019, to Jan 2, 2020. Epidemiological, clinical,
laboratory, and radiological characteristics and treatment and outcomes data were obtained with
standardized data collection forms (modified case record form for severe acute respiratory infection
clinical characterization shared by WHO and the International Severe Acute Respiratory and Emerging
Infection Consortium) from electronic medical records. Two researchers also independently reviewed the
data collection forms to double-check the data collected. To ascertain epidemiological and symptom
data that were not available from electronic medical records, the researchers also communicated
directly with patients or their families.
Cytokine and chemokine measurement
To characterize the effect of coronavirus on the production of cytokines or chemokines in the acute phase
of the illness, plasma cytokines and chemokines (IL1B, IL1RA, IL2, IL4, IL5, IL6, IL7, IL8 [also known
as CXCL8], IL9, IL10, IL12p70, IL13, IL15, IL17A, Eotaxin [also known as CCL11], basic FGF2, GCSF
[CSF3], GMCSF [CSF2], IFNγ, IP10 [CXCL10], MCP1 [CCL2], MIP1A [CCL3], MIP1B [CCL4],
PDGFB, RANTES [CCL5], TNFα, and VEGFA) were measured using the Human Cytokine Standard 27-Plex
Assays panel and the Bio-Plex 200 system (Bio-Rad, Hercules, CA, USA) for all patients according to the
manufacturer's instructions. The plasma samples from four healthy adults were used as controls for
cross-comparison. The median time from being transferred to a designated hospital to the blood sample
collection was 4 days (IQR 2–5).
Detection of coronavirus in plasma
Each 80 μL plasma sample from the patients and contacts was added into 240 μL of Trizol LS (10296028;
Thermo Fisher Scientific, Carlsbad, CA, USA) in the Biosafety Level 3 laboratory. Total RNA was
extracted by Direct-zol RNA Miniprep kit (R2050; Zymo Research, Irvine, CA, USA) according to the
manufacturer's instructions, and a 50 μL eluate was obtained for each sample. For each reaction, 5 μL of RNA was used for real-time RT-PCR, which targeted the NP gene using AgPath-ID One-Step RT-PCR Reagent (AM1005;
Thermo Fisher Scientific). The final reaction mix concentration of the primers was 500 nM and the probe
was 200 nM. Real-time RT-PCR was performed using the following conditions: 50°C for 15 min and
95°C for 3 min, 50 cycles of amplification at 95°C for 10 s and 60°C for 45 s. Since we did not perform
tests for detecting the infectious virus in the blood, we avoided the term viraemia and used RNAaemia
instead. RNAaemia was defined as a positive result for real-time RT-PCR in the plasma sample.
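As a quick check on the cycling conditions above, the programmed hold time of the run can be added up as follows. The function name is an assumption, and ramping time between temperatures (which is instrument-dependent) is not included.

```python
def program_hold_seconds(initial_steps_s, cycles, cycle_steps_s):
    """Total programmed hold time in seconds, excluding temperature ramps."""
    return sum(initial_steps_s) + cycles * sum(cycle_steps_s)


total_s = program_hold_seconds(
    initial_steps_s=[15 * 60, 3 * 60],  # 50 °C for 15 min, then 95 °C for 3 min
    cycles=50,
    cycle_steps_s=[10, 45],             # 95 °C for 10 s, 60 °C for 45 s
)
total_min = total_s / 60  # a little under 64 minutes of hold time
```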
Role of the funding source
The funder of the study had no role in study design, data collection, data analysis, data interpretation, or
writing of the report. The corresponding authors had full access to all the data in the study and had final
responsibility for the decision to submit for publication.
Results
By Jan 2, 2020, 41 admitted hospital patients had been identified as having laboratory-confirmed 2019-nCoV
infection in Wuhan. Twenty (49%) of the 2019-nCoV-infected patients were aged 25–49 years, and 14 (34%)
were aged 50–64 years (figure 1A). The median age of the patients was 49·0 years (IQR 41·0–58·0; table
1). In our cohort of the first 41 patients as of Jan 2, no children or adolescents were infected. Of the 41
patients, 13 (32%) were admitted to the ICU because they required high-flow nasal cannula or higher-level
oxygen support measures to correct the hypoxaemia. Most of the infected patients were men (30
[73%]); less than half had underlying diseases (13 [32%]), including diabetes (eight [20%]), hypertension
(six [15%]), and cardiovascular disease (six [15%]).
The blood counts of patients on admission showed leucopenia (white blood cell count less than 4 × 10⁹/L;
ten [25%] of 40 patients) and lymphopenia (lymphocyte count <1·0 × 10⁹/L; 26 [63%] patients; table 2).
Prothrombin time and D-dimer level on admission were higher in ICU patients (median prothrombin time
12·2 s [IQR 11·2–13·4]; median D-dimer level 2·4 mg/L [0·6–14·4]) than non-ICU patients (median
prothrombin time 10·7 s [9·8–12·1], p=0·012; median D-dimer level 0·5 mg/L [0·3–0·8], p=0·0042).
Levels of aspartate aminotransferase were increased in 15 (37%) of 41 patients, including eight (62%) of
13 ICU patients and seven (25%) of 28 non-ICU patients. Hypersensitive troponin I (hs-cTnI) was
increased substantially in five patients, in whom the diagnosis of virus-related cardiac injury was made.
Table 2 Laboratory findings of patients infected with 2019-nCoV on admission to hospital
Variable | All patients (n=41) | ICU care (n=13) | No ICU care (n=28) | p value
White blood cell count, × 10⁹/L | 6·2 (4·1–10·5) | 11·3 (5·8–12·1) | 5·7 (3·1–7·6) | 0·011
  <4 | 10/40 (25%) | 1/13 (8%) | 9/27 (33%) | 0·041
  4–10 | 18/40 (45%) | 5/13 (38%) | 13/27 (48%) | ..
  >10 | 12/40 (30%) | 7/13 (54%) | 5/27 (19%) | ..
Neutrophil count, × 10⁹/L | 5·0 (3·3–8·9) | 10·6 (5·0–11·8) | 4·4 (2·0–6·1) | 0·00069
Lymphocyte count, × 10⁹/L | 0·8 (0·6–1·1) | 0·4 (0·2–0·8) | 1·0 (0·7–1·1) | 0·0041
  <1·0 | 26/41 (63%) | 11/13 (85%) | 15/28 (54%) | 0·045
  ≥1·0 | 15/41 (37%) | 2/13 (15%) | 13/28 (46%) | ..
Hemoglobin, g/L | 126·0 (118·0–140·0) | 122·0 (111·0–128·0) | 130·5 (120·0–140·0) | 0·20
Platelet count, × 10⁹/L | 164·5 (131·5–263·0) | 196·0 (165·0–263·0) | 149·0 (131·0–263·0) | 0·45
  <100 | 2/40 (5%) | 1/13 (8%) | 1/27 (4%) | 0·45
  ≥100 | 38/40 (95%) | 12/13 (92%) | 26/27 (96%) | ..
Prothrombin time, s | 11·1 (10·1–12·4) | 12·2 (11·2–13·4) | 10·7 (9·8–12·1) | 0·012
Activated partial thromboplastin time, s | 27·0 (24·2–34·1) | 26·2 (22·5–33·9) | 27·7 (24·8–34·1) | 0·57
D-dimer, mg/L | 0·5 (0·3–1·3) | 2·4 (0·6–14·4) | 0·5 (0·3–0·8) | 0·0042
Albumin, g/L | 31·4 (28·9–36·0) | 27·9 (26·3–30·9) | 34·7 (30·2–36·5) | 0·00066
Alanine aminotransferase, U/L | 32·0 (21·0–50·0) | 49·0 (29·0–115·0) | 27·0 (19·5–40·0) | 0·038
Aspartate aminotransferase, U/L | 34·0 (26·0–48·0) | 44·0 (30·0–70·0) | 34·0 (24·0–40·5) | 0·10
  ≤40 | 26/41 (63%) | 5/13 (38%) | 21/28 (75%) | 0·025
  >40 | 15/41 (37%) | 8/13 (62%) | 7/28 (25%) | ..
Total bilirubin, mmol/L | 11·7 (9·5–13·9) | 14·0 (11·9–32·9) | 10·8 (9·4–12·3) | 0·011
Potassium, mmol/L | 4·2 (3·8–4·8) | 4·6 (4·0–5·0) | 4·1 (3·8–4·6) | 0·27
Sodium, mmol/L | 139·0 (137·0–140·0) | 138·0 (137·0–139·0) | 139·0 (137·5–140·5) | 0·26
Creatinine, μmol/L | 74·2 (57·5–85·7) | 79·0 (53·1–92·7) | 73·3 (57·5–84·7) | 0·84
  ≤133 | 37/41 (90%) | 11/13 (85%) | 26/28 (93%) | 0·42
  >133 | 4/41 (10%) | 2/13 (15%) | 2/28 (7%) | ..
Creatine kinase, U/L | 132·5 (62·0–219·0) | 132·0 (82·0–493·0) | 133·0 (61·0–189·0) | 0·31
  ≤185 | 27/40 (68%) | 7/13 (54%) | 20/27 (74%) | 0·21
  >185 | 13/40 (33%) | 6/13 (46%) | 7/27 (26%) | ..
Lactate dehydrogenase, U/L | 286·0 (242·0–408·0) | 400·0 (323·0–578·0) | 281·0 (233·0–357·0) | 0·0044
  ≤245 | 11/40 (28%) | 1/13 (8%) | 10/27 (37%) | 0·036
  >245 | 29/40 (73%) | 12/13 (92%) | 17/27 (63%) | ..
Hypersensitive troponin I, pg/mL | 3·4 (1·1–9·1) | 3·3 (3·0–163·0) | 3·5 (0·7–5·4) | 0·075
  >28 (99th percentile) | 5/41 (12%) | 4/13 (31%) | 1/28 (4%) | 0·017
Procalcitonin, ng/mL | 0·1 (0·1–0·1) | 0·1 (0·1–0·4) | 0·1 (0·1–0·1) | 0·031
  <0·1 | 27/39 (69%) | 6/12 (50%) | 21/27 (78%) | 0·029
  ≥0·1 to <0·25 | 7/39 (18%) | 3/12 (25%) | 4/27 (15%) | ..
  ≥0·25 to <0·5 | 2/39 (5%) | 0/12 | 2/27 (7%) | ..
  ≥0·5 | 3/39 (8%) | 3/12 (25%)* | 0/27 | ..
Bilateral involvement of chest radiographs | 40/41 (98%) | 13/13 (100%) | 27/28 (96%) | 0·68
Cycle threshold of the respiratory tract | 32·2 (31·0–34·5) | 31·1 (30·0–33·5) | 32·2 (31·1–34·7) | 0·39
Data are median (IQR) or n/N (%), where N is the total number of patients with available data. p values
comparing ICU care and no ICU care are from χ², Fisher's exact test, or Mann-Whitney U test. 2019-nCoV=2019 novel coronavirus. ICU=intensive care unit.
* Complicated typical secondary infection during the first hospitalization.
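To illustrate how a p value for one of the categorical comparisons might be obtained, the sketch below applies a Pearson χ² test (1 degree of freedom, no continuity correction) to the aspartate aminotransferase counts quoted in the text: increased in eight of 13 ICU patients and seven of 28 non-ICU patients. The source does not state which of the listed tests was used for which row, so this is only one plausible choice and the result need not match the table exactly; the function name is an assumption.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]. Returns (chi2, p) with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Rows: ICU, non-ICU. Columns: AST increased, AST not increased.
chi2, p = chi_square_2x2(8, 5, 7, 21)
```

With expected cell counts this small, Fisher's exact test would be a common alternative and gives a somewhat different p value.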
Most patients had normal serum levels of procalcitonin on admission (procalcitonin <0·1 ng/mL; 27
[69%] patients; table 2). Four ICU patients developed secondary infections. Three of the four patients
with secondary infection had procalcitonin greater than 0·5 ng/mL (0·69 ng/mL, 1·46 ng/mL, and 6·48
ng/mL).
On admission, abnormalities in chest CT images were detected among all patients. Of the 41 patients, 40
(98%) had bilateral involvement (table 2). The typical findings of chest CT images of ICU patients on
admission were bilateral multiple lobular and subsegmental areas of consolidation (figure 3A). The
representative chest CT findings of non-ICU patients showed bilateral ground-glass opacity and
subsegmental areas of consolidation (figure 3B). Later chest CT images showed bilateral ground-glass
opacity, whereas the consolidation had been resolved (figure 3C).
Initial plasma IL1B, IL1RA, IL7, IL8, IL9, IL10, basic FGF, GCSF, GMCSF, IFNγ, IP10, MCP1,
MIP1A, MIP1B, PDGF, TNFα, and VEGF concentrations were higher in both ICU patients and non-ICU
patients than in healthy adults (appendix pp 6–7). Plasma levels of IL5, IL12p70, IL15, Eotaxin, and
RANTES were similar between healthy adults and patients infected with 2019-nCoV. Further comparison
between ICU and non-ICU patients showed that plasma concentrations of IL2, IL7, IL10, GCSF, IP10,
MCP1, MIP1A, and TNFα were higher in ICU patients than non-ICU patients.
All patients had pneumonia. Common complications included ARDS (12 [29%] of 41 patients), followed
by RNAaemia (six [15%] patients), acute cardiac injury (five [12%] patients), and secondary infection
(four [10%] patients; table 3). Invasive mechanical ventilation was required in four (10%) patients,
two of whom (5%) had refractory hypoxaemia and received extracorporeal membrane oxygenation as
salvage therapy. All patients received empirical antibiotic treatment, and 38 (93%) patients
received antiviral therapy (oseltamivir). Additionally, nine (22%) patients were given systemic
corticosteroids. A comparison of clinical features between patients who received and did not receive
systemic corticosteroids is in the appendix (pp 1–5).
Table 3 Treatments and outcomes of patients infected with 2019-nCoV
Variable | All patients (n=41) | ICU care (n=13) | No ICU care (n=28) | p value
Duration from illness onset to the first admission | 7·0 (4·0–8·0) | 7·0 (4·0–8·0) | 7·0 (4·0–8·5) | 0·87
Complications
  Acute respiratory distress syndrome | 12 (29%) | 11 (85%) | 1 (4%) | <0·0001
  RNAaemia | 6 (15%) | 2 (15%) | 4 (14%) | 0·93
  Cycle threshold of RNAaemia | 35·1 (34·7–35·1) | 35·1 (35·1–35·1) | 34·8 (34·1–35·4) | 0·35
  Acute cardiac injury | 5 (12%) | 4 (31%) | 1 (4%) | 0·017
  Acute kidney injury | 3 (7%) | 3 (23%) | 0 | 0·027
  Secondary infection | 4 (10%) | 4 (31%) | 0 | 0·0014
  Shock | 3 (7%) | 3 (23%) | 0 | 0·027
Treatment
  Antiviral therapy | 38 (93%) | 12 (92%) | 26 (93%) | 0·46
  Antibiotic therapy | 41 (100%) | 13 (100%) | 28 (100%) | NA
  Use of corticosteroid | 9 (22%) | 6 (46%) | 3 (11%) | 0·013
  Continuous renal replacement therapy | 3 (7%) | 3 (23%) | 0 | 0·027
  Oxygen support | .. | .. | .. | <0·0001
    Nasal cannula | 27 (66%) | 1 (8%) | 26 (93%) | ..
    Non-invasive ventilation or high-flow nasal cannula | 10 (24%) | 8 (62%) | 2 (7%) | ..
    Invasive mechanical ventilation | 2 (5%) | 2 (15%) | 0 | ..
    Invasive mechanical ventilation and ECMO | 2 (5%) | 2 (15%) | 0 | ..
Prognosis | .. | .. | .. | 0·014
  Hospitalisation | 7 (17%) | 1 (8%) | 6 (21%) | ..
  Discharge | 28 (68%) | 7 (54%) | 21 (75%) | ..
  Death | 6 (15%) | 5 (38%) | 1 (4%) | ..
Data are median (IQR) or n (%). p values are comparing ICU care and no ICU care. 2019-nCoV=2019
novel coronavirus. ICU=intensive care unit. NA=not applicable. ECMO=extracorporeal membrane
oxygenation.[6]
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation.
Based on the first reported genome sequence of 2019-nCoV (4), we expressed ectodomain residues 1 to
1208 of 2019-nCoV S, adding two stabilizing proline mutations in the C-terminal S2 fusion machinery
using a previous stabilization strategy that proved effective for other betacoronavirus S proteins
(11, 14). Figure 1A shows the domain organization of the expression construct, and figure S1 shows the
purification process. We obtained ~0.5 mg/liter of the recombinant prefusion-stabilized S ectodomain
from FreeStyle 293 cells and purified the protein to homogeneity by affinity chromatography and size-exclusion chromatography (fig. S1). Cryo-electron microscopy (cryo-EM) grids were prepared using this
purified, fully glycosylated S protein, and preliminary screening revealed a high particle density with little
aggregation near the edges of the holes.
Fig. 1 Structure of 2019-nCoV S in the prefusion conformation.
(A) Schematic of 2019-nCoV S primary structure colored by domain. Domains that were excluded from
the ectodomain expression construct or could not be visualized in the final map are colored white. SS,
signal sequence; S2′, S2′ protease cleavage site; FP, fusion peptide; HR1, heptad repeat 1; CH, central
helix; CD, connector domain; HR2, heptad repeat 2; TM, transmembrane domain; CT, cytoplasmic tail.
Arrows denote protease cleavage sites. (B) Side and top views of the prefusion structure of the 2019-nCoV S protein with a single RBD in the up conformation. The two RBD down protomers are shown as
cryo-EM density in either white or gray and the RBD up protomer is shown in ribbons colored
corresponding to the schematic in (A).
After collecting and processing 3207 micrograph movies, we obtained a 3.5-Å-resolution three-dimensional (3D) reconstruction of an asymmetrical trimer in which a single RBD was observed in the up
conformation (Fig. 1B, fig. S2, and table S1). Because of the small size of the RBD (~21 kDa), the
asymmetry of this conformation was not readily apparent until ab initio 3D reconstruction and
classification were performed (Fig. 1B and fig. S3).
By using the 3D variability feature in cryoSPARC v2 (15), we observed breathing of the S1 subunits as
the RBD underwent a hinge-like movement, which likely contributed to the relatively poor local
resolution of S1 compared with the more stable S2 subunit (movies S1 and S2). This seemingly stochastic
RBD movement has been captured during structural characterization of the closely related
betacoronaviruses SARS-CoV and MERS-CoV, as well as the more distantly related alphacoronavirus
porcine epidemic diarrhea virus (PEDV) (10, 11, 13, 16). The observation of this phenomenon in 2019-nCoV S suggests that it shares the same mechanism of triggering that is thought to be conserved among
the Coronaviridae, wherein receptor binding to exposed RBDs leads to an unstable three-RBD up
conformation that results in shedding of S1 and refolding of S2 (11, 12).
Because the S2 subunit appeared to be a symmetric trimer, we performed a 3D refinement imposing C3
symmetry, resulting in a 3.2-Å-resolution map with excellent density for the S2 subunit. Using both maps,
we built most of the 2019-nCoV S ectodomain, including glycans at 44 of the 66 N-linked glycosylation
sites per trimer (fig. S4). Our final model spans S residues 27 to 1146, with several flexible loops omitted.
Like all previously reported coronavirus S ectodomain structures, the density for 2019-nCoV S begins to
fade after the connector domain, reflecting the flexibility of the heptad repeat 2 domain in the prefusion
conformation (fig. S4A) (13, 16–18).
The overall structure of 2019-nCoV S resembles that of SARS-CoV S, with a root mean square deviation
(RMSD) of 3.8 Å over 959 Cα atoms (Fig. 2A). One of the larger differences between these two
structures (although still relatively minor) is the position of the RBDs in their respective down
conformations. Whereas the SARS-CoV RBD in the down conformation packs tightly against the N-terminal domain (NTD) of the neighboring protomer, the 2019-nCoV RBD in the down conformation is
angled closer to the central cavity of the trimer (Fig. 2B). Despite this observed conformational
difference, when the individual structural domains of 2019-nCoV S are aligned to their counterparts from
SARS-CoV S, they reflect the high degree of structural homology between the two proteins, with the
NTDs, RBDs, subdomains 1 and 2 (SD1 and SD2), and S2 subunits yielding individual RMSD values of
2.6 Å, 3.0 Å, 2.7 Å, and 2.0 Å, respectively (Fig. 2C).
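For readers unfamiliar with the metric, RMSD over matched Cα atoms is the square root of the mean squared distance between paired coordinates. The sketch below assumes the two coordinate sets are already superposed and paired atom by atom; it does not perform the structural alignment itself, and the function name and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. Å) between
    two equally sized lists of (x, y, z) coordinates.

    Assumption: the structures have already been superposed and the i-th
    entries of both lists refer to equivalent atoms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates (hypothetical): the paired atoms are 3 Å and 4 Å apart.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
example_rmsd = rmsd(structure_a, structure_b)  # sqrt((9 + 16) / 2)
```

Because the deviations are squared before averaging, a few poorly matching regions can dominate the value, which is one reason per-domain RMSDs can be lower than the whole-protomer figure.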
Fig. 2 Structural comparison between 2019-nCoV S and SARS-CoV S.
(A) Single protomer of 2019-nCoV S with the RBD in the down conformation (left) is shown in ribbons
colored according to Fig. 1. A protomer of 2019-nCoV S in the RBD up conformation is shown (center)
next to a protomer of SARS-CoV S in the RBD up conformation (right), displayed as ribbons and colored
white (PDB ID: 6CRZ). (B) RBDs of 2019-nCoV and SARS-CoV aligned based on the position of the
adjacent NTD from the neighboring protomer. The 2019-nCoV RBD is colored green and the SARS-CoV
RBD is colored white. The 2019-nCoV NTD is colored blue. (C) Structural domains from 2019-nCoV S
have been aligned to their counterparts from SARS-CoV S as follows: NTD (top left), RBD (top right),
SD1, and SD2 (bottom left), and S2 (bottom right).
2019-nCoV S shares 98% sequence identity with the S protein from the bat coronavirus RaTG13, with the
most notable variation arising from an insertion in the S1/S2 protease cleavage site that results in an
“RRAR” furin recognition site in 2019-nCoV (19) rather than the single arginine in SARS-CoV (fig. S5)
(20–23). Notably, amino acid insertions that create a polybasic furin site in a related position in
hemagglutinin proteins are often found in highly virulent avian and human influenza viruses (24). In the
structure reported here, the S1/S2 junction is in a disordered, solvent-exposed loop. In addition to this
insertion of residues in the S1/S2 junction, 29 variant residues exist between 2019-nCoV S and RaTG13
S, with 17 of these positions mapping to the RBD (figs. S5 and S6). We also analyzed the 61 available
2019-nCoV S sequences in the Global Initiative on Sharing All Influenza Data database
(https://www.gisaid.org/) and found that there were only nine amino acid substitutions among all
deposited sequences. Most of these substitutions are relatively conservative and are not expected to have a
substantial effect on the structure or function of the 2019-nCoV S protein (fig. S6).
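Sequence identity and variant positions of the kind reported above are typically derived from a pairwise alignment. The sketch below shows one simple convention, counting only columns in which neither sequence has a gap; other definitions use a different denominator, and the source does not say which was used. The function names and the short aligned fragments are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # insertions/deletions are skipped under this convention
        compared += 1
        matches += x == y
    return 100.0 * matches / compared


def variant_positions(aligned_a, aligned_b, gap="-"):
    """1-based alignment columns where both sequences have a residue and differ."""
    return [
        i
        for i, (x, y) in enumerate(zip(aligned_a, aligned_b), start=1)
        if x != gap and y != gap and x != y
    ]


# Toy aligned fragments (hypothetical, not real spike sequences).
seq_a = "ACDEFGHIKL"
seq_b = "ACDEYGH-KL"
identity = percent_identity(seq_a, seq_b)   # 8 matches over 9 compared columns
variants = variant_positions(seq_a, seq_b)  # [5]
```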
Recent reports demonstrating that 2019-nCoV S and SARS-CoV S share the same functional host cell
receptor, angiotensin-converting enzyme 2 (ACE2) (22, 25–27), prompted us to quantify the kinetics of
this interaction by surface plasmon resonance. ACE2 bound to the 2019-nCoV S ectodomain with ~15
nM affinity, which is ~10- to 20-fold higher than ACE2 binding to SARS-CoV S (Fig. 3A and fig. S7)
(14). We also formed a complex of ACE2 bound to the 2019-nCoV S ectodomain and observed it by
negative-stain EM, which showed that it strongly resembled the complex formed between SARS-CoV S
and ACE2 that has been observed at high resolution by cryo-EM (Fig. 3B) (14, 28). The high affinity of
2019-nCoV S for human ACE2 may contribute to the apparent ease with which 2019-nCoV can spread
from human to human (1); however, additional studies are needed to investigate this possibility.
Fig. 3 2019-nCoV S binds human ACE2 with high affinity.
(A) Surface plasmon resonance sensorgram showing the binding kinetics for human ACE2 and
immobilized 2019-nCoV S. Data are shown as black lines, and the best fit of the data to a 1:1 binding
model is shown in red. (B) Negative-stain EM 2D class averages of 2019-nCoV S bound by ACE2.
Averages have been rotated so that ACE2 is positioned above the 2019-nCoV S protein with respect to the
viral membrane. A diagram depicting the ACE2-bound 2019-nCoV S protein is shown (right) with ACE2
in blue and S protein protomers colored tan, pink, and green.
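The 1:1 binding model mentioned in the caption relates the equilibrium dissociation constant to the rate constants by KD = koff/kon. The sketch below simulates the association phase of such a model. The rate constants are hypothetical values chosen only so that KD comes out near the ~15 nM reported in the text; they are not the fitted values from the study, and the function name is an assumption.

```python
import math

# Hypothetical rate constants (not from the source), chosen so that KD = 15 nM.
KON = 1.0e5    # association rate constant, 1/(M*s)
KOFF = 1.5e-3  # dissociation rate constant, 1/s
KD = KOFF / KON  # equilibrium dissociation constant, M


def association_response(t_s, conc_m, kon=KON, koff=KOFF, rmax=1.0):
    """Response at time t_s (s) during the association phase of a 1:1 model.

    R(t) = Req * (1 - exp(-(kon*C + koff) * t)), with
    Req = Rmax * kon*C / (kon*C + koff) = Rmax * C / (C + KD).
    """
    k_obs = kon * conc_m + koff
    r_eq = rmax * kon * conc_m / k_obs
    return r_eq * (1.0 - math.exp(-k_obs * t_s))


# At an analyte concentration equal to KD, the response plateaus at half of Rmax.
plateau_at_kd = association_response(t_s=1.0e5, conc_m=KD)
```

A lower KD means tighter binding, so a 10- to 20-fold difference in affinity corresponds to a 10- to 20-fold difference in KD.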
The overall structural homology and shared receptor usage between SARS-CoV S and 2019-nCoV S
prompted us to test published SARS-CoV RBD-directed monoclonal antibodies (mAbs) for cross-reactivity to the 2019-nCoV RBD (Fig. 4A). A 2019-nCoV RBD-SD1 fragment (S residues 319 to 591)
was recombinantly expressed, and appropriate folding of this construct was validated by measuring ACE2
binding using biolayer interferometry (BLI) (Fig. 4B). Cross-reactivity of the SARS-CoV RBD-directed
mAbs S230, m396, and 80R was then evaluated by BLI (12, 29–31). Despite the relatively high degree of
structural homology between the 2019-nCoV RBD and the SARS-CoV RBD, no binding to the 2019-nCoV RBD could be detected for any of the three mAbs at the concentration tested (1 μM) (Fig. 4C), in
contrast to the strong binding that we observed to the SARS-CoV RBD (fig. S8). Although the epitopes of
these three antibodies represent a relatively small percentage of the surface area of the 2019-nCoV RBD,
the lack of observed binding suggests that SARS-directed mAbs will not necessarily be cross-reactive and
that future antibody isolation and therapeutic design efforts will benefit from using 2019-nCoV S proteins
as probes.
Fig. 4 Antigenicity of the 2019-nCoV RBD.
(A) SARS-CoV RBD is shown as a white molecular surface (PDB ID: 2AJF), with residues that vary in
the 2019-nCoV RBD colored red. The ACE2-binding site is outlined with a black dashed line. (B)
Biolayer interferometry sensorgram showing binding to ACE2 by the 2019-nCoV RBD-SD1. Binding
data are shown as a black line, and the best fit of the data to a 1:1 binding model is shown in red. (C)
Biolayer interferometry to measure cross-reactivity of the SARS-CoV RBD-directed antibodies S230,
m396, and 80R. Sensor tips with immobilized antibodies were dipped into wells containing 2019-nCoV
RBD-SD1, and the resulting data are shown as a black line[7].
Crystal structure of the 2019-nCoV spike receptor-binding domain bound with the
ACE2 receptor.
Phylogenetic analysis of coronavirus genomes has revealed that 2019-nCoV is a new member of the
betacoronavirus genus, which includes SARS-CoV, MERS-CoV, bat SARS-related coronaviruses
(SARSr-CoV), as well as others identified in humans and diverse animal species (refs 1–3, 7). Bat coronavirus
RaTG13 appears to be the closest relative of 2019-nCoV, sharing over 93.1% homology in the spike
(S) gene. SARS-CoV and other SARSr-CoVs, however, are rather distinct, with less than 80%
homology (ref. 1).
Coronaviruses use the homotrimeric spike glycoprotein (S1 subunit and S2 subunit in each spike
monomer) on the envelope to bind their cellular receptors. Such binding triggers a cascade of events leading
to fusion between the cell and viral membranes for cell entry. Our cryo-EM studies have shown that the
binding of the SARS-CoV spike to the cell receptor ACE2 induces the dissociation of the S1 with ACE2,
prompting the S2 to transition from a metastable prefusion to a more stable postfusion state that is
essential for membrane fusion (refs 8, 9). Therefore, binding to the ACE2 receptor is a critical initial step for
SARS-CoV to enter target cells. Recent studies also pointed to the important role of ACE2 in
mediating entry of 2019-nCoV (refs 1, 10). HeLa cells expressing ACE2 are susceptible to 2019-nCoV infection,
whereas those without ACE2 are not (ref. 1). In vitro SPR experiments also showed that the binding affinities of
ACE2 for the spike glycoprotein and for the receptor-binding domain (RBD) are equivalent, at
14.7 nM and 15.2 nM, respectively (refs 11, 12). These results indicate that the RBD is the key functional
component within the S1 subunit responsible for binding to ACE2.
The cryo-EM structure of the 2019-nCoV spike trimer at 3.5 Å resolution has just been reported (ref. 12). The
coordinates are not yet available for detailed characterization. However, an inspection of the structural
features presented in the manuscript uploaded to bioRxiv indicated incomplete resolution of the RBD in the
model, particularly for the receptor-binding motif (RBM) that interacts directly with ACE2. Computer
modeling of the interaction between the 2019-nCoV RBD and ACE2 has identified some residues potentially
involved in the interaction, but the actual interaction remained elusive (ref. 13). Furthermore, despite
impressive cross-reactive neutralizing activity from serum/plasma of SARS-CoV recovered patients (ref. 14), no
SARS-CoV monoclonal antibodies targeted to the RBD isolated so far can bind and neutralize 2019-nCoV (refs 11, 12). These findings highlight some intrinsic sequence and structure differences between the
SARS-CoV and 2019-nCoV RBDs.
The overall structure of 2019-nCoV RBD bound with ACE2.
(a) The overall topology of 2019-nCoV spike monomer. NTD, N-terminal domain. RBD, receptor-binding
domain. RBM, receptor-binding motif. SD1, subdomain 1. SD2, subdomain 2. FP, fusion
peptide. HR1, heptad repeat 1. HR2, heptad repeat 2. TM, transmembrane region. IC, intracellular
domain. (b) Sequence and secondary structures of 2019-nCoV RBD. The RBM is colored red. (c)
The overall structure of 2019-nCoV RBD bound with ACE2. ACE2 is colored green. 2019-nCoV
RBD core is colored cyan and RBM is colored red. Disulfide bonds in the 2019-nCoV RBD are
shown as sticks and indicated by yellow arrows. The N-terminal helix of ACE2 responsible for
binding is labeled.
The 2019-nCoV RBD has a twisted four-stranded antiparallel β sheet (β1, β2, β3, and β6) with
short connecting helices and loops forming the core (Fig. 1b and 1c). Between the β3 and β6
strands in the core, there is an extended insertion containing short β4 and β5 strands, α4 and α5
helices and loops (Fig. 1b and 1c). This extended insertion is the receptor-binding motif (RBM),
which contains most of the 2019-nCoV residues that contact ACE2. A total of nine
cysteine residues are found in the RBD, six of which form three pairs of disulfide bonds that
are resolved in the final model. Among these three pairs, two are in the core (Cys336-Cys361 and
Cys379-Cys432) to help stabilize the β sheet structure (Fig. 1c), while the remaining one
(Cys480-Cys488) connects loops in the distal end of the RBM (Fig. 1c). The N-terminal
peptidase domain of ACE2 has two lobes, forming the peptide substrate binding site between
them. The extended RBM in the 2019-nCoV RBD contacts the bottom side of the ACE2 small
lobe, with a concave outer surface in the RBM accommodating the N-terminal helix of ACE2.
Structural comparisons of 2019-nCoV and SARS-CoV RBDs and their
binding modes to the ACE2 receptor.
(a) Alignment of the 2019-nCoV RBD (core in cyan and RBM in red) and SARS-CoV RBD (core in
orange and RBM in blue) structures. (b) Structural alignment of 2019-nCoV RBD/ACE2 and
SARS-CoV RBD/ACE2 complexes. 2019-nCoV RBD is colored cyan and red, its interacting
ACE2 is colored green. SARS-CoV RBD is colored orange and blue, its interacting ACE2 is
colored salmon. The PDB code for SARS-CoV RBD/ACE2 complex.
The cradling of the ACE2 N-terminal helix by the RBM outer surface results in a large buried
surface of ~1700 Å2 between the 2019-nCoV RBD and ACE2 receptor (Fig. 1c). With a distance
cutoff of 4 Å, a total of 18 residues of the RBD contact 20 residues of the ACE2 (Fig.
3a and Table S2). Analysis of the interface between SARS-CoV RBD and ACE2 revealed that a total of
16 residues of the SARS-CoV RBD contact 20 residues of the ACE2 (Fig. 3a and Table S2).
Among the 20 ACE2 residues interacting with the two different RBDs, 17 are shared, most of which
are located at the N-terminal helix (Fig. 2a). One prominent feature common to
both interfaces is a network of hydrophilic interactions. There are 17 hydrogen bonds and 1
salt bridge at the 2019-nCoV RBD/ACE2 interface, and 12 hydrogen bonds and 2 salt bridges at
the SARS-CoV RBD/ACE2 interface[8].
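The 4 Å distance-cutoff criterion used above to count interface residues can be illustrated with a minimal sketch. It assumes each chain is given as a mapping from a residue label to a list of atom coordinates; the function name, the residue labels, and the toy coordinates are hypothetical and are not taken from the deposited structure.

```python
from itertools import product
from math import dist


def contact_residues(chain_a, chain_b, cutoff=4.0):
    """Return the residues of each chain that have at least one atom
    within `cutoff` angstroms of an atom in the other chain.

    Each chain maps a residue label to a list of (x, y, z) coordinates.
    """
    hits_a, hits_b = set(), set()
    for (res_a, atoms_a), (res_b, atoms_b) in product(chain_a.items(), chain_b.items()):
        # A residue pair is in contact if any atom pair is within the cutoff.
        if any(dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
            hits_a.add(res_a)
            hits_b.add(res_b)
    return hits_a, hits_b


# Toy coordinates (hypothetical, one atom per residue for brevity).
rbd = {
    "RBD_1": [(0.0, 0.0, 0.0)],
    "RBD_2": [(10.0, 0.0, 0.0)],
    "RBD_3": [(50.0, 0.0, 0.0)],
}
ace2 = {
    "ACE2_1": [(3.0, 0.0, 0.0)],
    "ACE2_2": [(10.0, 3.5, 0.0)],
    "ACE2_3": [(0.0, 20.0, 0.0)],
}

rbd_hits, ace2_hits = contact_residues(rbd, ace2)
print(sorted(rbd_hits), sorted(ace2_hits))
```

Applied to two complexes, the sets of contacted ACE2 residues could then be intersected to obtain the shared residues. The all-against-all loop is only suitable for small inputs; a spatial index would be the usual choice for full structures.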
X-ray Structure of Main Protease of the Novel Coronavirus SARS-CoV-2 Enables
Design of α-Ketoamide Inhibitors
In the active site of 2019-nCoV Mpro, Cys145 and His41 form a catalytic dyad. As in SARS-CoV Mpro and
other coronavirus homologs, a buried water molecule is found hydrogen-bonded to His41; this could be
considered the third component of a catalytic triad.
Previously, we designed and synthesized peptidomimetic α-ketoamides as broad-spectrum inhibitors
of the main proteases of betacoronaviruses and alphacoronaviruses as well as the 3C proteases of
enteroviruses (Zhang et al., 2020). The best of these compounds (11r; see Scheme 1) showed an EC50
of 400 picomolar against MERS-CoV in Huh7 cells as well as low micromolar EC50 values against
SARS-CoV and a whole range of enteroviruses in various cell lines. To improve the half-life and the
solubility of the compounds in human plasma, and to reduce the binding to plasma proteins, we have
modified the compound by hiding the P3 - P2 amide bond within a pyridone ring and by replacing the
cinnamoyl group. For a compound related to 11r but modified this way (compound 13a), the half-life in
human plasma was increased by 50%, solubility was improved, and plasma protein binding was reduced
from 99% to 94%. There was no sign of toxicity in mice. In addition, 13a showed good metabolic
stability using mouse and human microsomes, with intrinsic clearance rates Clint_mouse = 32.00 μL/min/mg
protein and Clint_human = 20.97 μL/min/mg protein. This means that after 30 min, around 80% (mouse)
and 60% (human) of the compound remained metabolically stable. Pharmacokinetic
studies in CD-1 mice using the subcutaneous route at 20 mg/kg showed that 13a stayed in plasma for up
to only 4 hrs, but was excreted via urine up to 24 hrs. The Cmax was determined at 334.50 ng/mL and the
mean residence time was about 1.59 hrs. Although 13a seemed to be cleared very rapidly from plasma, it
was found at 24 hrs at 135 ng/g tissue in the lung and at 52.7 ng/mL in bronchoalveolar lavage fluid
(BALF) suggesting that it was mainly distributed to tissue. In light of the current CoV outbreak, it is
advisable to develop compounds with lung tropism such as 13a. However, compared to 11r, the structural
modification led to some loss of inhibitory activity against the main protease of 2019-nCoV (IC50 = 2.39
± 0.63 μM) as well as the 3C proteases of enteroviruses. To enhance the antiviral activity against
betacoronaviruses of clade b (2019-nCoV and SARS-CoV), we sacrificed the goal of broad-spectrum
activity including the enteroviruses for the time being and replaced the P2 cyclohexyl moiety of 13a by
cyclopropyl in 13b, because the S2 pocket of the betacoronavirus main proteases shows pronounced
plasticity enabling it to adapt to the shape of smaller inhibitor moieties entering this site (Zhang et al.,
2020).
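The practical meaning of the reduction in plasma protein binding from 99% to 94% is easier to see in terms of the unbound fraction. The following minimal sketch does the arithmetic; the function and variable names are illustrative, and it assumes only that the unbound fraction is the complement of the bound percentage.

```python
def unbound_fraction(percent_bound):
    """Fraction of compound not bound to plasma proteins."""
    if not 0.0 <= percent_bound <= 100.0:
        raise ValueError("percent_bound must be between 0 and 100")
    return (100.0 - percent_bound) / 100.0


fu_before = unbound_fraction(99.0)  # binding before the modification
fu_after = unbound_fraction(94.0)   # binding reported for 13a
fold_change = fu_after / fu_before

print(f"unbound fraction: {fu_before:.2%} -> {fu_after:.2%} "
      f"({fold_change:.1f}-fold increase)")
```

A five-point drop in bound compound therefore corresponds to roughly a six-fold increase in the unbound fraction, from 1% to 6%.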
Here we present X-ray crystal structures in two different crystal forms, at 1.95 and 2.20 Å resolution, of
the complex between α-ketoamide 13b optimized this way and the Mpro of 2019-nCoV (Fig. 2). One
structure is in space group C2, where both protomers of the Mpro dimer are bound by crystal symmetry to
have identical conformations; the other is in space group P2₁2₁2₁, where the two protomers are
independent of each other and free to adopt different conformations. Indeed, we find that in the latter
crystal structure, the key residue Glu166 adopts an inactive conformation (as evidenced by its prolonged
distance from His172 and the lack of H-bonding interaction between Glu166 and the P1 moiety of the
inhibitor (see below)), even though compound 13b is bound in the same mode as in molecule A. This
phenomenon has also been observed, in a more pronounced form, with the SARS-CoV Mpro (Yang et al.,
2003) and is consistent with the half-site activity described for this enzyme (Chen et al., 2006). In all
copies of the inhibited 2019-nCoV Mpro, the inhibitor binds to the shallow substrate-binding site at the
surface of each protomer, between domains I and II (Fig. 2).
Chemical structures of α-ketoamide inhibitors 11r, 13a, and 13b
Fig. 2:
Compound 13b in the substrate-binding cleft located between domains I and II of the Mpro, in the
monoclinic crystal form (space group C2). 2Fo-Fc electron density is shown for the inhibitor (contouring
level: 1σ). Carbon atoms of the inhibitor are magenta, oxygens red, nitrogens blue, and sulfur yellow.
Note the interaction between the N-terminal residue of chain B, S1*, and E166 of chain A.
Through the nucleophilic attack of the catalytic Cys145 onto the α-keto group of the inhibitor, a
thiohemiketal is formed in a reversible reaction. This is reflected in the electron density (Fig. 2);
the stereochemistry of this chiral moiety is S in all three copies of compound 13b in these
structures. The oxyanion (or hydroxyl) group of this thiohemiketal is stabilized by a hydrogen
bond from His41, whereas the amide oxygen of 13b accepts a hydrogen bond from the main-chain
amides of Gly143, Cys145, and partly Ser144, which form the canonical “oxyanion hole” of the
cysteine protease[9].
Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein.
MERS-CoV was suggested to originate from bats but the reservoir host fueling spillover to humans is
unequivocally dromedary camels (Haagmans et al., 2014; Memish et al., 2013). SARS-CoV and
SARS-CoV-2 are closely related to each other and originated in bats, which most likely serve as
reservoir hosts for these two viruses (Ge et al., 2013; Hu et al., 2017; Li et al., 2005b; Yang et al.,
2015a; Zhou et al., 2020). Whereas palm civets and raccoon dogs have been recognized as
intermediate hosts for zoonotic transmission of SARS-CoV between bats and humans (Guan et al.,
2003; Kan et al., 2005; Wang et al., 2005), the intermediate host of SARS-CoV-2 remains unknown.
The recurrent spillovers of coronaviruses in humans along with detection of numerous coronaviruses in
bats, including many SARS-related coronaviruses (SARSr-CoVs), suggest that future zoonotic
transmission events may continue to occur (Anthony et al., 2017; Ge et al., 2013; Hu et al., 2017; Li et
al., 2005b; Menachery et al., 2015; Menachery et al., 2016; Yang et al., 2015a; Zhou et al., 2020). In
addition to the highly pathogenic zoonotic pathogens SARS-CoV, MERS-CoV, and SARS-CoV-2, all
belonging to the β-coronavirus genus, four low pathogenicity coronaviruses are endemic in humans:
HCoV-OC43, HCoV-HKU1, HCoV-NL63, and HCoV-229E. To date, no therapeutics or vaccines are
approved against any human-infecting coronaviruses.
Coronavirus entry into host cells is mediated by the transmembrane spike (S) glycoprotein that forms
homotrimers protruding from the viral surface (Tortorici and Veesler, 2019). S comprises two functional
subunits responsible for binding to the host cell receptor (S1 subunit) and fusion of the viral and cellular
membranes (S2 subunit). For many CoVs, S is cleaved at the boundary between the S1 and S2 subunits
which remain non-covalently bound in the prefusion conformation (Belouzard et al., 2009; Bosch et al.,
2003; Burkard et al., 2014; Kirchdoerfer et al., 2016; Millet and Whittaker, 2014, 2015; Park et al.,
2016; Walls et al., 2016a). The distal S1 subunit comprises the receptor-binding domain(s) and
contributes to the stabilization of the prefusion state of the membrane-anchored S2 subunit that contains
the fusion machinery (Gui et al., 2017; Kirchdoerfer et al., 2016; Pallesen et al., 2017; Song et al.,
2018; Walls et al., 2016a; Walls et al., 2017b; Yuan et al., 2017). For all CoVs, S is further cleaved by
host proteases at the so-called S2’ site located immediately upstream of the fusion peptide (Madu et al.,
2009; Millet and Whittaker, 2015). This cleavage has been proposed to activate the protein for
membrane fusion via extensive irreversible conformational changes (Belouzard et al., 2009;
Heald-Sargent and Gallagher, 2012; Millet and Whittaker, 2014, 2015; Park et al., 2016; Walls et al.,
2017b). As a result, coronavirus entry into susceptible cells is a complex process that requires the
concerted action of receptor-binding and proteolytic processing of the S protein to promote virus-cell
fusion.
Different coronaviruses use distinct domains within the S1 subunit to recognize a variety of attachment
and entry receptors, depending on the viral species. Endemic human coronaviruses OC43 and HKU1
attach via their S domain A (SA) to 5-N-acetyl-9-O-acetyl-sialosides found on glycoproteins and
glycolipids at the host cell surface to enable entry into susceptible cells (Hulswit et al., 2019; Tortorici
et al., 2019; Vlasak et al., 1988). MERS-CoV S, however, uses domain A to engage non-acetylated
sialosides as attachment receptors (Li et al., 2017; Park et al., 2019) and promote subsequent binding of
domain B (SB) to the entry receptor, dipeptidyl-peptidase 4 (Lu et al., 2013; Raj et al., 2013). SARS- and
several SARS-related coronaviruses (SARSr-CoV) interact directly with angiotensin-converting enzyme
2 (ACE2) via SB to enter target cells (Ge et al., 2013; Kirchdoerfer et al., 2018; Li et al., 2005a; Li et
al., 2003; Song et al., 2018; Yang et al., 2015a).
As the coronavirus S glycoprotein is surface-exposed and mediates entry into host cells, it is the main
target of neutralizing antibodies (Abs) upon infection and the focus of therapeutic and vaccine design. S
trimers are extensively decorated with N-linked glycans that are important for proper folding (Rossen et
al., 1998) and for modulating accessibility to host proteases and neutralizing antibodies (Walls et al.,
2016b; Walls et al., 2019; Xiong et al., 2017; Yang et al., 2015b). We previously characterized potent
human neutralizing Abs from rare memory B cells of individuals infected with SARS-CoV (Traggiai et
al., 2004) or MERS-CoV (Corti et al., 2015) in complex with SARS-CoV S and MERS-CoV S to
provide molecular-level information on the mechanism of competitive inhibition of SB attachment to the
host receptor (Walls et al., 2019). The S230 anti-SARS-CoV antibody also acted by functionally
mimicking receptor-attachment and promoting spike fusogenic conformational rearrangements through a
ratcheting mechanism that elucidated the unique nature of the coronavirus membrane fusion activation
(Walls et al., 2019)[10].
We report here that ACE2 could mediate SARS-CoV-2 S-mediated entry into cells, establishing it as a
functional receptor for this newly emerged coronavirus. The SARS-CoV-2 SB engages human ACE2
(hACE2) with affinity comparable to that of SARS-CoV SB from viral isolates associated with the 2002-2003
epidemic (i.e. binding with high affinity to hACE2). Tight binding to hACE2 could explain the efficient
transmission of SARS-CoV-2 in humans, as was the case for SARS-CoV. We identified the presence of
an unexpected furin cleavage site at the S1/S2 boundary of SARS-CoV-2 S, which is cleaved during
biosynthesis, a novel feature setting this virus apart from SARS-CoV and SARSr-CoVs. Abrogation of
this cleavage motif moderately affected SARS-CoV-2 S-mediated entry into VeroE6 or BHK cells but
may contribute to expanding the tropism of this virus, as reported for several highly pathogenic avian
influenza viruses and pathogenic Newcastle disease virus (Klenk and Garten, 1994; Steinhauer, 1999).
We determined a cryo-electron microscopy structure of the SARS-CoV-2 S ectodomain trimer and reveal
that it adopts multiple SB conformations that are reminiscent of previous reports on both SARS-CoV S
and MERS-CoV S. Finally, we show that SARS-CoV S mouse polyclonal sera potently inhibited entry
into target cells of SARS-CoV-2 S pseudotyped viruses. Collectively, these results pave the way for
designing vaccines eliciting broad protection against SARS-CoV-2, SARS-CoV, and SARSr-CoV.
RESULTS
ACE2 is an entry receptor for SARS-CoV-2
The SARS-CoV-2 S glycoprotein shares ~80% amino acid sequence identity with the SARS-CoV S
Urbani and with bat SARSr-CoV ZXC21 S and ZC45 S glycoproteins. The latter two SARSr-CoV
sequences were identified from Rhinolophus sinicus (Chinese horseshoe bats), the species from which
SARSr-CoV WIV-1 and WIV-16 were isolated (Ge et al., 2013; Yang et al., 2015a). Furthermore,
Zhou et al. recently reported that SARS-CoV-2 is most closely related to the bat SARSr-CoV RaTG13
with which it forms a distinct lineage from other SARSr-CoVs, and that their S glycoproteins share 98%
amino acid sequence identity (Zhou et al., 2020). SARS-CoV recognizes its entry receptor human ACE2
(hACE2) at the surface of type II pneumocytes, using SB which shares ~75% overall amino acid sequence
identity with SARS-CoV-2 SB and 50% identity within their receptor-binding motifs (RBMs) (Li et al.,
2005a; Li et al., 2003; Li et al., 2005c; Wan et al., 2020). Previous studies also showed that the host
proteases cathepsin L and TMPRSS2 prime SARS-CoV S for membrane fusion through cleavage at the
S1/S2 and at the S2’ sites (Belouzard et al., 2009; Bosch et al., 2008; Glowacka et al., 2011; Matsuyama
et al., 2010; Millet and Whittaker, 2015; Shulla et al., 2011).
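The sequence identity percentages quoted above come from pairwise comparison of aligned sequences. As a minimal sketch of that calculation, the function below counts identical columns in two pre-aligned sequences. The function name and the toy sequences are hypothetical, and the choice of denominator (every column in which at least one sequence has a residue) is an assumption; other conventions exist and give slightly different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, not real spike sequences).
seq_a = "ACDEFGHIKL"
seq_b = "ACDQFG-IKL"
identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f}% identity")
```

Restricting the input to the columns of a single domain or motif, such as SB or the RBM, gives the corresponding region-specific identity.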
We set out to investigate the functional determinants of S-mediated entry into target cells using a murine
leukemia virus (MLV) pseudotyping system (Millet and Whittaker, 2016). To assess the ability of
SARS-CoV-2 S to promote entry into target cells, we first compared the transduction of SARS-CoV-2 S-MLV
and SARS-CoV S-MLV into VeroE6 cells, which are known to express ACE2 and support SARS-CoV
replication (Drosten et al., 2003; Ksiazek et al., 2003). Both pseudoviruses entered cells equally
well (Fig. 1 A), suggesting that SARS-CoV-2 S-MLV could potentially use African green monkey ACE2
as an entry receptor. To confirm these results, we evaluated entry into BHK cells and observed that
transient transfection with hACE2 rendered them susceptible to transduction with SARS-CoV-2 S-MLV
(Fig. 1 B). These results demonstrate that hACE2 is a functional receptor for SARS-CoV-2, in
agreement with recently reported findings (Hoffmann et al., 2020; Letko and Munster, 2020; Zhou et
al., 2020).
Figure 1. hACE2 is a functional receptor for SARS-CoV-2 S.
A. Entry of MLV pseudotyped with SARS-CoV-2 S, SARS-CoV-2 Sfur/mut and SARS-CoV S in VeroE6
cells. B. Entry of MLV pseudotyped with SARS-CoV-2 S or SARS-CoV-2 Sfur/mut in BHK cells
transiently transfected with hACE2. The experiments were carried out in triplicate with two independent
pseudovirus preparations and a representative experiment is shown. C. Sequence alignment of SARS-CoV-2
S with multiple related SARS-CoV and SARSr-CoV S glycoproteins reveals the introduction of
an S1/S2 furin cleavage site in this novel coronavirus. Identical and similar positions are respectively
shown with white or red font. The four amino acid residue insertion at SARS-CoV-2 S positions 690-693
is indicated with periods. The entire sequence alignment is presented in Fig. S1. D. Western blot analysis
of SARS-CoV-2 S-MLV, SARS-CoV-2 Sfur/mut-MLV and SARS-CoV S-MLV pseudovirions using an
anti-SARS-CoV S2 antibody.
Sequence analysis of SARS-CoV-2 S reveals the presence of a four amino acid residue insertion at the
boundary between the S1 and S2 subunits compared to SARS-CoV S and SARSr-CoV S (Fig. 1 C). This
results in the introduction of a furin cleavage site, a feature conserved among the 103 SARS-CoV-2
isolates sequenced to date but not in the closely related RaTG13 S (Zhou et al., 2020). Using Western
blot analysis, we observed that SARS-CoV-2 S was virtually entirely processed at the S1/S2 site during
biosynthesis in HEK293T cells, presumably by furin in the Golgi compartment (Fig. 1 D). This
observation contrasts with SARS-CoV S which was incorporated into pseudovirions largely
uncleaved (Fig. 1 D). To study the influence on pseudovirus entry of the SARS-CoV-2 S1/S2 furin
cleavage site, we designed an S mutant lacking the four amino acid residue insertion and the furin
cleavage site by mutating Q686TNSPRRAR↓SV696 (wildtype SARS-CoV-2 S) to
Q686TILR↓SV692 (SARS-CoV-2 Sfur/mut). SARS-CoV-2 Sfur/mut preserves only the conserved Arg residue at
position 694 of wildtype SARS-CoV-2 S, thereby mimicking the S1/S2 cleavage site of the related
SARSr-CoV S ZXC21 (Fig. 1 D). SARS-CoV-2 Sfur/mut is therefore expected to undergo processing at the
S1/S2 site upon encounter of a target cell, similar to SARS-CoV S and SARSr-CoV S (i.e. via TMPRSS2
and/or cathepsin L). As expected, SARS-CoV-2 Sfur/mut-MLV harbored uncleaved S upon budding (Fig. 1
D). The observed transduction efficiency of VeroE6 cells was higher for SARS-CoV-2 Sfur/mut-MLV than
for SARS-CoV-2 S-MLV (Fig. 1 A) whereas the opposite trend was observed for transduction of
hACE2-expressing BHK cells (Fig. 1 B). These results suggest that S1/S2 cleavage during S biosynthesis was not
necessary for S-mediated entry in the conditions of our experiments (Fig. 1 C-D). We speculate that the
presence of a polybasic cleavage site in the fusion glycoprotein of SARS-CoV-2 could expand
its tropism and/or enhance its transmissibility, compared to SARS-CoV and SARSr-CoV isolates, due to
the near-ubiquitous distribution of furin-like proteases and their reported effects on other viruses (Klenk
and Garten, 1994; Millet and Whittaker, 2015; Steinhauer, 1999).
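The effect of the Sfur/mut substitution on the cleavage motif can be checked with a short sequence scan. The sketch below searches the two S1/S2 boundary segments quoted above for the minimal furin recognition pattern R-X-X-R, with cleavage after the final arginine. Using this minimal pattern is a simplifying assumption (actual furin specificity depends on additional sequence context), and the function name is illustrative.

```python
import re

# Minimal furin pattern R-X-X-R; the lookahead allows overlapping matches.
FURIN_MINIMAL = re.compile(r"(?=(R..R))")


def find_furin_sites(segment, first_residue_number):
    """Return (motif, residue number of the Arg preceding the cut) for each match."""
    sites = []
    for match in FURIN_MINIMAL.finditer(segment):
        # The cut follows the fourth residue of the motif.
        p1_number = first_residue_number + match.start() + 3
        sites.append((match.group(1), p1_number))
    return sites


wild_type = "QTNSPRRARSV"  # Q686 ... V696, wildtype SARS-CoV-2 S
fur_mut = "QTILRSV"        # Q686 ... V692, SARS-CoV-2 Sfur/mut

wt_sites = find_furin_sites(wild_type, 686)
mut_sites = find_furin_sites(fur_mut, 686)
print("wildtype:", wt_sites)
print("Sfur/mut:", mut_sites)
```

In the numbering of the quoted segments, the wildtype sequence yields a single RRAR match with the arginine preceding the cut at position 694, while the mutant segment contains no R-X-X-R match, consistent with the loss of the furin site.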
SARS-CoV-2 recognizes human ACE2 with affinity comparable to that of SARS-CoV.
The binding affinity of SARS-CoV for hACE2 correlates with the overall rate of viral replication in
distinct species, transmissibility and disease severity (Guan et al., 2003; Li et al., 2004; Li et al.,
2005c; Wan et al., 2020). Indeed, specific SB mutations enabled efficient binding to hACE2 of SARS-CoV
isolates from the three phases of the 2002-2003 epidemic, which were associated with marked
disease severity (Consortium, 2004; Kan et al., 2005; Li et al., 2005c; Sui et al., 2004). In contrast,
SARS-CoV isolates detected during the brief 2003-2004 re-emergence interacted more weakly with
hACE2, but tightly with civet ACE2, and had low pathogenicity and transmissibility (Consortium,
2004; Kan et al., 2005; Li et al., 2005c).
The architecture of the SARS-CoV-2 spike glycoprotein trimer
To enable single-particle cryoEM study of the SARS-CoV-2 S glycoprotein, we designed a prefusion-stabilized
ectodomain trimer construct with an abrogated furin S1/S2 cleavage site (Tortorici et al.,
2019; Walls et al., 2017a; Walls et al., 2016a; Walls et al., 2019), two consecutive proline stabilizing
mutations (Kirchdoerfer et al., 2018; Pallesen et al., 2017) and a C-terminal foldon trimerization
domain (Miroshnikov et al., 1998). 3D classification of the cryoEM data revealed the presence of
multiple conformational states of SARS-CoV-2 S corresponding to distinct organizations of the
SB domains within the S1 apex. Approximately half of the particle images selected correspond to trimers
harboring a single SB domain opened whereas the remaining half was accounted for by closed trimers
with the 3 SB domains closed. The observed conformational variability of SB domains is reminiscent of
observations made with SARS-CoV S and MERS-CoV S trimers although we did not detect trimers with
two SB domains open and the distribution of particles across the S conformational landscape varies among
studies (Gui et al., 2017; Kirchdoerfer et al., 2018; Pallesen et al., 2017; Song et al., 2018; Walls et
al., 2019; Yuan et al., 2017).
We determined a reconstruction of the closed SARS-CoV-2 S ectodomain trimer at 3.3 Å resolution
(applying 3-fold symmetry) and an asymmetric reconstruction of the trimer with a single SB domain
opened at 3.7 Å resolution (Fig 3 A-H, Fig S2, Table 2). The S2 fusion machinery is the best-resolved
part of the map whereas the SA and SB domains are less well resolved, presumably due to conformational
heterogeneity. The final atomic model comprises residues 36-1157, with internal breaks corresponding to
flexible regions, and lacks the C-terminal-most segment (including the heptad repeat 2) which is not
visible in the map, as is the case for all S structures determined to date.
Figure 3. CryoEM structures of the SARS-CoV-2 S glycoprotein.
A-B. Two orthogonal views of the closed SARS-CoV-2 S trimer cryoEM map. C. Atomic model of the
closed SARS-CoV-2 S trimer in the same orientation as in panel A. D-E. Two orthogonal views of the
partially open SARS-CoV-2 S trimer cryoEM map (one SB domain is open). F. Atomic model of the
closed SARS-CoV-2 S trimer in the same orientation as in panel D. The glycans were omitted for clarity.
References.
[1] “Genomic epidemiology of novel coronavirus (hCoV-19),” 2020. [Online]. Available:
https://nextstrain.org/ncov?dmax=2020-01-02&dmin=2019-12-26&l=radial.
[2] D. L. R. Oscar A. MacLean, Richard Orton, Joshua B. Singer and M.-U. of G. C. for V. R.
(CVR), “Response to ‘On the origin and continuing evolution of SARS-CoV-2,’” 2020. [Online].
Available: http://virological.org/t/response-to-on-the-origin-and-continuing-evolution-of-sars-cov2/418.
[3] N. E. Tran Thi Nhu Thao, Fabien Labroussaa, “Rapid reconstruction of SARS-CoV-2 using a
synthetic genomics platform,” 2020.
[4] J. L. Xiaolu Tang, Changcheng Wu, Xiang Li, Yuhe Song, Xinmin Yao, Xinkai Wu, Yuange
Duan, Hong Zhang, Yirong Wang, Zhaohui Qian, Jie Cui, “On the origin and continuing evolution
of SARS-CoV-2,” 2020.
[5] U. of Glasgow, “Amino acid replacements,” 2020.
[6] Chaolin Huang et al., “Clinical features of patients infected with 2019 novel coronavirus
in Wuhan, China,” 2020.
[7] J. S. M. Daniel Wrapp, Nianshuang Wang, Kizzmekia S. Corbett, Jory A. Goldsmith,
Ching-Lin Hsieh, Olubukola Abiona, Barney S. Graham, “Cryo-EM structure of the 2019-nCoV
spike in the prefusion conformation,” 2020.
[8] X. W. Jun Lan, Jiwan Ge, Jinfang Yu, Sisi Shan, Huan Zhou, Shilong Fan, Qi Zhang, Xuanling
Shi, Qisheng Wang, Linqi Zhang, “Crystal structure of the 2019-nCoV spike receptor-binding
domain bound with the ACE2 receptor,” 2020.
[9] V. O. P. H. Linlin Zhang, Daizong Lin, Xinyuanyuan Sun, Katharina Rox, “X-ray Structure of
Main Protease of the Novel Coronavirus SARS-CoV-2 Enables Design of α-Ketoamide
Inhibitors,” 2020.
[10] D. V. Alexandra C. Walls, Young-Jun Park, M. Alexandra Tortorici, Abigail Wall, Andrew T.
McGuire, “Structure, function and antigenicity of the SARS-CoV-2 spike glycoprotein,” 2020.