Duons


Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution


Andrew B. Stergachis et al.
Science 342, 1367 (2013);
DOI: 10.1126/science.1243490


The following resources related to this article are available online at www.sciencemag.org (this information is current as of December 12, 2013):

Updated information and services, including high-resolution figures, can be found in the online
version of this article at:
http://www.sciencemag.org/content/342/6164/1367.full.html
Supporting Online Material can be found at:
http://www.sciencemag.org/content/suppl/2013/12/11/342.6164.1367.DC1.html
A list of selected additional articles on the Science Web sites related to this article can be
found at:
http://www.sciencemag.org/content/342/6164/1367.full.html#related
This article cites 61 articles, 32 of which can be accessed free:
http://www.sciencemag.org/content/342/6164/1367.full.html#ref-list-1
This article has been cited by 1 articles hosted by HighWire Press; see:
http://www.sciencemag.org/content/342/6164/1367.full.html#related-urls

Science (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by the
American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. Copyright
2013 by the American Association for the Advancement of Science; all rights reserved. The title Science is a
registered trademark of AAAS.
REPORTS

Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution

Andrew B. Stergachis,1 Eric Haugen,1 Anthony Shafer,1 Wenqing Fu,1 Benjamin Vernot,1 Alex Reynolds,1 Anthony Raubitschek,2,3 Steven Ziegler,3 Emily M. LeProust,4* Joshua M. Akey,1 John A. Stamatoyannopoulos1,5†

1Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA. 2Department of Immunology, University of Washington, Seattle, WA 98109, USA. 3Benaroya Research Institute, Seattle, WA 98101, USA. 4Agilent Technologies, Santa Clara, CA 95051, USA. 5Department of Medicine, University of Washington, Seattle, WA 98195, USA.
*Present address: Twist Bioscience, San Francisco, CA 94158, USA.
†Corresponding author. E-mail: [email protected]

www.sciencemag.org SCIENCE VOL 342 13 DECEMBER 2013 1367

Genomes contain both a genetic code specifying amino acids and a regulatory code specifying transcription factor (TF) recognition sequences. We used genomic deoxyribonuclease I footprinting to map nucleotide-resolution TF occupancy across the human exome in 81 diverse cell types. We found that ~15% of human codons are dual-use codons (“duons”) that simultaneously specify both amino acids and TF recognition sites. Duons are highly conserved and have shaped protein evolution, and TF-imposed constraint appears to be a major driver of codon usage bias. Conversely, the regulatory code has been selectively depleted of TFs that recognize stop codons. More than 17% of single-nucleotide variants within duons directly alter TF binding. Pervasive dual encoding of amino acid and regulatory information appears to be a fundamental feature of genome evolution.

The genetic code, common to all organisms, contains extensive redundancy, in which most amino acids can be specified by two to six synonymous codons. The observed ratios of synonymous codons are highly nonrandom, and codon usage biases are fixtures of both prokaryotic and eukaryotic genomes (1). In organisms with short life spans and large effective population sizes, codon biases have been linked to translation efficiency and mRNA stability (2–7). However, these mechanisms explain only a small fraction of observed codon preferences in mammalian genomes (7–11), which appear to be under selection (12).

Genomes also contain a parallel regulatory code specifying recognition sequences for transcription factors (TFs) (13), and the genetic and regulatory codes have been assumed to operate independently of one another and to be segregated physically into the coding and noncoding genomic compartments. However, the potential for some coding exons to accommodate transcriptional enhancers or splicing signals has long been recognized (14–18).

To define intersections between the regulatory and genetic codes, we generated nucleotide-resolution maps of TF occupancy in 81 diverse human cell types using genomic deoxyribonuclease I (DNaseI) footprinting (19). Collectively, we defined 11,598,043 distinct 6– to 40–base pair (bp) footprints genome-wide (~1,018,514 per cell type), 216,304 of which localized completely within protein-coding exons (~24,842 per cell type) (Fig. 1, A and B; fig. S1A; and table S1). Approximately 14% of all human coding bases contact a TF in at least one cell type (average 1.1% per cell type) (Fig. 1C and fig. S1B), and 86.9% of genes contained coding TF footprints (average 33% per cell type) (fig. S1, C and D).

The exonic TF footprints we observed likely underestimate the true fraction of protein-coding bases that contact TFs because (i) TF footprint detection increases substantially with sequencing depth (13), and (ii) the 81 cell types sampled, although extensive, are far from complete; we saw little evidence of saturation of coding TF footprint discovery (fig. S2).

To ascertain coding footprints more completely, we developed an approach for targeted exonic footprinting via solution-phase capture of DNaseI-seq libraries using RNA probes complementary to human exons (19). Targeted capture footprinting of exons from abdominal skin and mammary stromal fibroblasts yielded ~10-fold increases in DNaseI cleavage (equivalent to sequencing >4 billion reads per sample by using conventional genomic footprinting; fig. S3A), quantitatively exposing many additional TF footprints (fig. S3, B to D). Overall, we identified an average of ~175,000 coding footprints per cell type (fig. S1E), which is 7- to 12-fold more than with conventional footprinting.

Although coding sequences are densely occupied by TFs in vivo, the density of TF footprints at different genic positions varied widely, with many genes exhibiting sharply increased density in the translated portion of their first coding exon (Fig. 1D and fig. S4A). In contrast, internal coding exons were as likely as flanking intronic sequences to harbor TF footprints (Fig. 1D). The total number of coding DNaseI footprints within a gene was related both to the length of the gene and to its expression level (fig. S4, B to D).

Given their abundance, we sought to determine whether exonic TF binding elements were under evolutionary selection. Fourfold degenerate coding bases are frequently used as a model of neutral (or nearly neutral) evolution (20) but may exhibit constraint when a functional signal impinges on coding sequence (11). Across the coding compartment, fourfold degenerate bases (4FDBs) within TF footprints show significantly greater evolutionary constraint versus nonfootprinted 4FDBs (Fig. 1E and fig. S5, A and B), indicating that TF-DNA recognition constrains the third codon position.

To test for evolutionary constraint at coding footprints in modern human populations, we quantified the age of mutations arising within or outside of coding footprints using exome sequencing data from 4298 individuals of European ancestry (fig. S5C) and 2217 individuals of African American ancestry (fig. S5D) (21). This analysis revealed that mutations within coding footprints were on average 10.2% younger than those outside of footprints (Fig. 1F and fig. S5E), signaling influence of coding TF elements on human fitness. Both synonymous and nonsynonymous mutations within coding footprints were significantly younger than those outside of footprints (Fig. 1F and fig. S5E), indicating that coding TF binding constrains both codon and amino acid evolution.

The genome-wide recognition sequence landscape of each TF has evolved to fit the molecular topography of its protein-DNA binding interface (Fig. 1G) (13). To study how specific TFs influence codon and amino acid choice at their recognition sites, we compared the per-nucleotide evolutionary conservation profiles of TF recognition sequences at noncoding 4FDBs and nondegenerate coding bases (NDBs). For example, the conservation profiles at 4FDBs and NDBs at KLF4 and NFIC recognition sites closely mirror those of recognition sites in noncoding regions (promoter) (Fig. 1H). As such, these TFs constrain both codon choice (via constraint on 4FDBs) and amino acid choice (via NDBs) encoded at their recognition sites. Analysis of conservation profiles for 63 TFs with prevalent occupancy within coding regions (19) showed that 73% constrain 4FDBs and 51% constrain NDBs (Fig. 1I and figs. S6 and S7). Thus, individual TFs may influence both codon and amino acid choice.

Fig. 1. TFs densely populate and evolutionarily constrain protein-coding exons. (A) Distribution of DNaseI footprints. (B) Per-nucleotide DNaseI cleavage and chromatin immunoprecipitation sequencing (ChIP-seq) signal for coding CTCF (left) and NRSF (right) binding elements. (C) Proportion of coding bases within DNaseI footprints in each of 81 cell types (left), or any cell type (right). (D) Average footprint density within first, internal, or final coding exons [mean ± SEM; P value, paired t test; nonsignificant (n.s.) indicates P > 0.1]. (E) PhyloP conservation at 4FDBs within and outside footprints. (F) Estimated mutational age at all (gray), synonymous (brown), and nonsynonymous (red) coding SNVs (European) within and outside footprints [P values per (21)]. (G) Structure of DNA-bound KLF4 versus average per-nucleotide DNaseI cleavage and evolutionary constraint at KLF4 footprints. (H) Average per-nucleotide conservation at 4FDBs (brown) and NDBs (red) overlapping KLF4 (left) and NFIC (right) footprints [r, Pearson correlation; conservation at promoter bases versus 4FDBs (top) or NDBs (bottom)]. (I) Evolutionary constraint imparted by 63 TFs at promoter elements, 4FDBs and NDBs (Pearson correlations).

To examine how TF binding relates to codon usage patterns, we examined binding at preferred (biased) versus nonpreferred codons. For example, across all human proteins asparagine is encoded by the AAC codon 52% of the time (versus AAT, 48%), indicating a generalized 4% bias in favor of this codon. However, genome-wide, 60.4% of asparagine codons within footprints are AAC, versus only 50.8% outside of footprints (a 9.6% occupancy bias toward the preferred codon) (Fig. 2A). Apart from arginine (see below), for all amino acids encoded by two or more codons the codon that is preferentially used genome-wide is also preferentially occupied by TFs (Fig. 2B and table S2).

To determine whether preferential occupancy of biased codons is inherent to TF recognition sequences, we compared trinucleotide frequencies within coding versus noncoding footprints. Trinucleotide combinations favored by TFs within coding sequence were equivalent to those favored in noncoding sequence (Fig. 3C), indicating that global TF binding preferences are directly reflected in the frequency of different codons. Of note, baseline trinucleotide frequencies within coding and noncoding sequence are largely independent of one another (table S2). The fact that the third position of preferred codons overlapping footprints is under excess evolutionary constraint (Fig. 2D and table S2) supports a general role for TFs in potentiating codon usage biases through the selective preservation of preferred codons.

Fig. 2. TFs modulate global codon biases. (A) Proportions of all codons (gray), or codons outside of (yellow) or within (purple) footprints, that encode asparagine (top) or leucine (bottom). Codons with bias (AAC for asparagine and CTG for leucine) preferentially localize within footprints. (B) Preferential footprinting of biased codons, calculated as in (A) (P values, Pearson’s χ2 test). (C) Preferential footprinting of each codon trinucleotide in coding versus noncoding regions (C, coding; NC, noncoding). (D) Difference in average evolutionary constraint at third positions of biased codons outside versus within footprints (P values, Mann-Whitney test). (E) Proportions of amino acids encoded by CpG-containing codons among all codons (gray), codons outside footprints (yellow), or codons within footprints (purple).

Although nearly all codon biases parallel TF recognition preferences genome-wide, arginine, one of the five amino acids encoded by codons containing CpGs (four out of six codons), was a notable exception. CpGs frequently occur in regulatory DNA (table S2) yet have an elevated mutational rate (22). Consequently, although TFs may favor CpG-containing codons (Fig. 2E) and impart excess constraint thereto (table S2), the higher mutational rate at such codons is likely incompatible with preferential use.

Codons outside footprints still exhibit usage biases (Fig. 2A and table S2); however, it is likely that these biases also reflect the actions of TFs. First, our conclusions above are drawn from a conservative and incomplete annotation of duons. Second, because TF trinucleotide preferences and codon biases have not changed substantially since the divergence of humans and mice (fig. S8), preferences at any given codon may result from a TF binding element extant in some ancestral species to human. Third, codon usage bias can be exaggerated because of mutual reinforcement with other cellular factors such as tRNA abundances (23, 24). Indeed, such mechanisms could be linked to codon biases created by exonic TF occupancy through a feedback mechanism that potentiates intrinsic TF-imposed biases, resulting in both abundant and rare codons and associated tRNAs, differences that could in turn affect protein synthesis and stability (25–27).

To analyze positional occupancy patterns of specific TFs within coding sequence, we systematically matched TF recognition sequences with footprints, providing an accurate measure of a TF’s in vivo occupancy (13, 28). This analysis revealed that a subset of TFs selectively avoid coding sequences (Fig. 3A). Intriguingly, TFs involved in positioning the transcriptional preinitiation complex, such as NFYA and SP1 (29), preferentially avoid the translated region of the first coding exon (Fig. 3A) and typically occupy elements immediately upstream of the methionine start codon (Fig. 3B and fig. S9A). Conversely, TFs involved in modulating promoter activity, such as YY1 and NRSF, preferentially occupy the translated region of the first coding exon (Fig. 3, A and C) (30, 31). These findings indicate that the translated portion of the first coding exon may serve functionally as an extension of the canonical promoter.

More broadly, the repressor NRSF preferentially occupies and evolutionarily constrains sequences coding for leucine-rich protein domains, such as signal peptide and transmembrane domains (Fig. 3D and fig. S9, B and C). Also, TFs such as CTCF and SREBP1 preferentially occupy and constrain splice sites (fig. S10, A to D), which are otherwise generally depleted of DNaseI footprints (fig. S10E). The above results suggest that specific protein structural and splicing features may undergo exaptation for specific regulatory purposes.

We also found that the occupancy of specific TFs within coding sequence parallels the extent of CpG methylation at their binding site (fig. S11). This raises the possibility that gene body methylation, which is paradoxically extensive at actively transcribed genes (32, 33), may provide a tunable mechanism for thwarting opportunistic TF occupancy within coding sequence during transcription.
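The occupancy-bias comparison reported for asparagine (60.4% AAC within footprints versus 50.8% outside) reduces to a difference of two codon shares. The following is a minimal sketch of that arithmetic, not the authors' pipeline: the function names `codon_share` and `occupancy_bias` are hypothetical, and the codon lists are toy counts constructed only to mirror the percentages quoted in the text.

```python
from collections import Counter


def codon_share(codons, preferred):
    """Return the fraction of entries in `codons` equal to `preferred`."""
    counts = Counter(codons)
    total = sum(counts.values())
    return counts[preferred] / total if total else float("nan")


def occupancy_bias(inside, outside, preferred):
    """Share of the preferred codon inside footprints minus its share
    outside footprints, expressed in percentage points."""
    return 100.0 * (codon_share(inside, preferred) - codon_share(outside, preferred))


# Toy asparagine codon lists (illustrative only, not real data):
# 604 of 1000 codons inside footprints are AAC, 508 of 1000 outside.
inside = ["AAC"] * 604 + ["AAT"] * 396
outside = ["AAC"] * 508 + ["AAT"] * 492

bias = occupancy_bias(inside, outside, "AAC")
print(f"AAC occupancy bias: {bias:.1f} percentage points")
```

In a real analysis the two lists would come from codons overlapping and not overlapping footprint intervals, and the difference would be accompanied by a significance test such as the Pearson's χ2 test named in the Fig. 2 legend.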

Fig. 3. TFs exploit and avoid specific coding features. (A) Percentage of TF motifs occupied in coding versus noncoding regions (P values, paired t test). (B) Density of NFYA (left), AP2 (middle), and SP1 (right) footprints relative to translated region of first coding exons. (C) (Top) Density of YY1 footprints across first coding exons. (Bottom) YY1 recognition sequence and corresponding amino acid sequence within YY1 footprints overlapping start codons. (D) (Top left and bottom) For NRSF as per (C). (Right, arrow) Protein domain annotation of first exon third-frame NRSF footprints versus SP1 footprints. (E) TF preference (avoidance) of stop codon trinucleotides within versus outside footprints in noncoding regions (P values, Pearson’s χ2 test).
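The conservation analysis described earlier depends on telling fourfold degenerate bases (4FDBs), where any third-position nucleotide yields the same amino acid, apart from other coding positions. A minimal sketch of that classification under the standard genetic code is given below. It is an illustration rather than the authors' method: the helper names `is_fourfold_degenerate` and `fourfold_sites` are hypothetical, and the input is assumed to be an in-frame, uppercase A/C/G/T coding sequence.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' marks stops.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def is_fourfold_degenerate(codon):
    """True if all four third-position variants encode the same amino acid."""
    prefix = codon[:2]
    return len({CODON_TABLE[prefix + b] for b in BASES}) == 1


def fourfold_sites(cds):
    """Return 0-based positions of fourfold degenerate third bases in an
    in-frame coding sequence (a trailing partial codon is ignored)."""
    usable = len(cds) - len(cds) % 3
    return [i + 2 for i in range(0, usable, 3) if is_fourfold_degenerate(cds[i:i + 3])]


# Met-Ala-Asn-Leu: the third bases of GCT (Ala) and CTG (Leu) are fourfold degenerate.
print(fourfold_sites("ATGGCTAACCTG"))
```

Positions returned by such a function could then be intersected with footprint coordinates to compare conservation scores within and outside footprints, which is the comparison summarized in Fig. 1E.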

If TFs, through selective recognition sequences, could impose changes in protein sequence, deleterious consequences could arise if such changes resulted in a nonsense substitution. We observed that TFs generally avoid stop codons (fig. S10E). This finding extends to noncoding regions, in which stop codon trinucleotides (TAA, TAG, and TGA) are selectively depleted within footprints. This indicates that the global TF repertoire has been selectively purged of DNA binding domains capable of recognizing, and thus preferentially stabilizing, nonsense codons (Fig. 3E and fig. S10F).

The high sequencing coverage provided by genomic footprinting revealed 592,867 heterozygous single-nucleotide variants (SNVs) across the 81 cell type samples, and 3% of coding footprints harbored heterozygous SNVs (Fig. 4A). Functional SNVs that disrupt TF occupancy quantitatively skew the allelic origins of DNaseI cleavage fragments (13), and 17.4% of all heterozygous coding SNVs within footprints showed this signature (Fig. 4B and fig. S12), including both synonymous and nonsynonymous variant classes (Fig. 4C). The potential of a coding SNV to disrupt overlying TF occupancy was independent of the class of variant (Fig. 4D) or whether a nonsynonymous variant was predicted to be deleterious to protein function (Fig. 4, E and F).

Of common disease- and trait-associated SNVs identified by genome-wide association studies (GWASs) in coding sequence (19), 13.5% fall within duons (fig. S13A). GWAS single-nucleotide polymorphisms in duons encompass both synonymous (12%) and nonsynonymous (88%) substitutions (fig. S13A) and may directly affect pathogenetic mechanisms (fig. S13, B to F, and table S3). As such, disease-associated variants within duons may compromise regulatory and/or protein-structural functions. These findings have substantial practical implications for the interpretation of genetic variation in coding regions.

Our results indicate that simultaneous encoding of amino acid and regulatory information within exons is a major functional feature of complex genomes. The information architecture of the received genetic code is optimized for superimposition of additional information (34, 35), and this intrinsic flexibility has been extensively exploited by natural selection. Although TF binding within exons may serve multiple functional roles, our analysis above is agnostic to these roles, which may be complex (36).

Fig. 4. Genetic variation in duons frequently alters TF occupancy. (A) Proportion of coding footprints overlapping a SNV in any of 81 cell types. (B) Proportion of SNVs in duons that allelically alter TF occupancy. (C) (Top) Per-nucleotide DNaseI cleavage at common nonsynonymous G→A SNV (rs8110393) in G/G and A/A homozygous cells. (Bottom) Allelic SP1 occupancy in heterozygous (G/A) cells. (D) Proportion of synonymous and nonsynonymous variants in duons that allelically alter TF occupancy. (E and F) Proportion of nonsynonymous variants from (D) grouped by predicted impact of coding variant on protein function using (E) SIFT or (F) Polyphen-2. None of the bins are significantly different (Fisher’s exact test; n.s. indicates P > 0.1).

References and Notes
1. R. Grantham, C. Gautier, M. Gouy, R. Mercier, A. Pavé, Nucleic Acids Res. 8, r49–r62 (1980).
2. T. Ikemura, J. Mol. Biol. 151, 389–409 (1981).
3. R. Grantham, C. Gautier, M. Gouy, M. Jacobzone, R. Mercier, Nucleic Acids Res. 9, r43–r74 (1981).
4. M. Gouy, C. Gautier, Nucleic Acids Res. 10, 7055–7074 (1982).
5. A. Eyre-Walker, M. Bulmer, Nucleic Acids Res. 21, 4599–4603 (1993).
6. D. B. Carlini, W. Stephan, Genetics 163, 239–243 (2003).
7. M. dos Reis, R. Savva, L. Wernisch, Nucleic Acids Res. 32, 5036–5044 (2004).
8. J. L. Parmley, J. V. Chamary, L. D. Hurst, Mol. Biol. Evol. 23, 301–309 (2006).
9. T. Warnecke, C. C. Weber, L. D. Hurst, Biochem. Soc. Trans. 37, 756–761 (2009).
10. W. Gu, T. Zhou, C. O. Wilke, PLOS Comput. Biol. 6, e1000664 (2010).
11. M. F. Lin et al., Genome Res. 21, 1916–1928 (2011).
12. Z. Yang, R. Nielsen, Mol. Biol. Evol. 25, 568–579 (2008).
13. S. Neph et al., Nature 489, 83–90 (2012).
14. S. M. Hyder, Z. Nawaz, C. Chiappetta, K. Yokoyama, G. M. Stancel, J. Biol. Chem. 270, 8506–8513 (1995).
15. G. Lang, W. M. Gombert, H. J. Gould, Immunology 114, 25–36 (2005).
16. D. I. Ritter, Z. Dong, S. Guo, J. H. Chuang, PLOS ONE 7, e35202 (2012).
17. A. H. Khan, A. Lin, D. J. Smith, PLOS ONE 7, e46098 (2012).
18. R. Y. Birnbaum et al., Genome Res. 22, 1059–1068 (2012).
19. Materials and methods are available as supplementary materials on Science Online.
20. W.-H. Li, Molecular Evolution (Sinauer Associates, Sunderland, MA, 1997).
21. W. Fu et al., Nature 493, 216–220 (2013).
22. C. Coulondre, J. H. Miller, P. J. Farabaugh, W. Gilbert, Nature 274, 775–780 (1978).
23. M. Bulmer, Nature 325, 728–730 (1987).
24. M. Bulmer, Genetics 129, 897–907 (1991).
25. J. Duan et al., Hum. Mol. Genet. 12, 205–216 (2003).
26. J. zur Megede et al., J. Virol. 74, 2628–2635 (2000).
27. J. R. Coleman et al., Science 320, 1784–1787 (2008).
28. R. M. Samstein et al., Cell 151, 153–166 (2012).
29. S. McKnight, R. Tjian, Cell 46, 795–805 (1986).
30. C. Zhang et al., Nucleic Acids Res. 34, 2238–2246 (2006).
31. H. Xi et al., Genome Res. 17, 798–806 (2007).
32. A. Hellman, A. Chess, Science 315, 1141–1143 (2007).
33. D. Zilberman, M. Gehring, R. K. Tran, T. Ballinger, S. Henikoff, Nat. Genet. 39, 61–69 (2007).
34. S. Itzkovitz, U. Alon, Genome Res. 17, 405–412 (2007).
35. S. Itzkovitz, E. Hodis, E. Segal, Genome Res. 20, 1582–1589 (2010).
36. T. R. Mercer et al., Nat. Genet., published online 23 June 2013 (10.1038/ng.2677).

Acknowledgments: We thank many colleagues for their insightful comments and critical readings of the manuscript. We also thank many colleagues who provided individual cell samples for DNaseI analysis. We also thank E. Rynes for his technical assistance. This work was supported by NIH grants U54HG004592, U54HG007010, and U01ES01156 to J.A.S. A.B.S. was supported by grant FDK095678A from the National Institute of Diabetes and Digestive and Kidney Diseases. J.M.A. is a paid consultant for Glenview Capital. All data from this study are available through the ENCODE data repository at UCSC (www.encodeproject.org) and the Roadmap Epigenomics data repository at NCBI (www.ncbi.nlm.nih.gov/epigenomics).

Supplementary Materials
www.sciencemag.org/content/342/6164/1367/suppl/DC1
Materials and Methods
Figs. S1 to S13
Tables S1 to S3
References (37–63)

19 July 2013; accepted 23 October 2013
10.1126/science.1243490

Cryptic Variation in Morphological Evolution: HSP90 as a Capacitor for Loss of Eyes in Cavefish

Nicolas Rohner,1 Dan F. Jarosz,2* Johanna E. Kowalko,1 Masato Yoshizawa,3 William R. Jeffery,3,4 Richard L. Borowsky,5 Susan Lindquist,2,6,7 Clifford J. Tabin1†

1Department of Genetics, Harvard Medical School, Boston, MA 02115, USA. 2The Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA. 3Department of Biology, University of Maryland, College Park, MD 20742, USA. 4Marine Biological Laboratory, Woods Hole, MA 02543, USA. 5Department of Biology, New York University, New York, NY 10003, USA. 6Howard Hughes Medical Institute, Cambridge, MA 02142, USA. 7Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA.
*Present address: Departments of Chemical and Systems Biology and Developmental Biology, School of Medicine, Stanford University, Stanford, CA 94305, USA.
†Corresponding author. E-mail: [email protected]. edu

In the process of morphological evolution, the extent to which cryptic, preexisting variation provides a substrate for natural selection has been controversial. We provide evidence that heat shock protein 90 (HSP90) phenotypically masks standing eye-size variation in surface populations of the cavefish Astyanax mexicanus. This variation is exposed by HSP90 inhibition and can be selected for, ultimately yielding a reduced-eye phenotype even in the presence of full HSP90 activity. Raising surface fish under conditions found in caves taxes the HSP90 system, unmasking the same phenotypic variation as does direct inhibition of HSP90. These results suggest that cryptic variation played a role in the evolution of eye loss in cavefish and provide the first evidence for HSP90 as a capacitor for morphological evolution in a natural setting.

A longstanding question in evolutionary biology is the extent to which selection acts on preexisting “standing variation” in a population, as opposed to de novo mutations. Recent studies have indicated that both mechanisms have contributed to morphological evolution (1, 2). Thus, although de novo mutations may exist and contribute to phenotypic evolution, repeated use of standing variation has played an important role in the evolution in these fish. However, these observations also raise a critical question: How is genetic variation maintained in a population if it is not adaptive before new selective conditions?

Waddington proposed that developmental processes are quite robust and produce the same phenotype regardless of minor genotypic variation, a phenomenon he termed “canalization” (3). In such conditions, cryptic variation can accumulate and can be maintained without consequence. He further proposed that under certain environmental conditions, this property could be lost (“decanalization”), resulting in expression of the cryptic variation on which selection could act (4).

More recently, Lindquist demonstrated that HSP90 (heat shock protein 90) provides a molecular mechanism for buffering genetic variation and releasing it in response to environmental stress (5–10). The HSP90 chaperone assists in the folding of proteins that are metastable signal transducers, such as kinases, transcription factors, and ubiquitin ligases. HSP90 is normally present at much higher concentrations than needed to maintain these proteins, allowing it to act as a buffer, protecting organisms from phenotypic consequences that would otherwise be caused by genetic variants of these proteins. Because protein folding is so sensitive to environmental stress, changes in the environment can exhaust the chaperone buffer, unmasking vulnerable polymorphisms. And because multiple variants can be unmasked at the same time, this system provides a mechanism to create complex traits in a single step (11). Besides changes in the activities of kinases, phosphatases, transcription factors, and ubiquitin ligases, other distinct mechanisms have been reported by which changes in HSP90 function can lead to changes in phenotype (5, 10, 12–16).

Evidence strongly suggests that this mechanism has operated in microbial populations (7, 8), but its relevance to the evolution of natural populations of higher organisms remains highly controversial. Thus far, examples of HSP90-mediated canalization in multicellular eukaryotes have been limited to lab strains of various model organisms. Moreover, with the exception of some phenotypes in Arabidopsis, the phenotypes of HSP90-released canalization in higher organisms are not
