The Origins of Genome Complexity: Science December 2003

The Origins of Genome Complexity

Article  in  Science · December 2003


DOI: 10.1126/science.1089370 · Source: PubMed



The Origins of Genome Complexity
Michael Lynch, et al.
Science 302, 1401 (2003);
DOI: 10.1126/science.1089370

The following resources related to this article are available online at www.sciencemag.org (this information is current as of March 7, 2008):

Updated information and services, including high-resolution figures, can be found in the online version of this article at:
http://www.sciencemag.org/cgi/content/full/302/5649/1401

Supporting Online Material can be found at:
http://www.sciencemag.org/cgi/content/full/302/5649/1401/DC1

Downloaded from www.sciencemag.org on March 7, 2008

Science (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by the
American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. Copyright
2003 by the American Association for the Advancement of Science; all rights reserved. The title Science is a
registered trademark of AAAS.
REPORTS

The Origins of Genome Complexity

Michael Lynch^1* and John S. Conery^2

Complete genomic sequences from diverse phylogenetic lineages reveal notable increases in genome complexity from prokaryotes to multicellular eukaryotes. The changes include gradual increases in gene number, resulting from the retention of duplicate genes, and more abrupt increases in the abundance of spliceosomal introns and mobile genetic elements. We argue that many of these modifications emerged passively in response to the long-term population-size reductions that accompanied increases in organism size. According to this model, much of the restructuring of eukaryotic genomes was initiated by nonadaptive processes, and this in turn provided novel substrates for the secondary evolution of phenotypic complexity by natural selection. The enormous long-term effective population sizes of prokaryotes may impose a substantial barrier to the evolution of complex genomes and morphologies.

^1 Department of Biology, Indiana University, Bloomington, IN 47405, USA. ^2 Department of Computer and Information Science, University of Oregon, Eugene, OR 97403, USA.
*To whom correspondence should be addressed. E-mail: [email protected]

www.sciencemag.org SCIENCE VOL 302 21 NOVEMBER 2003 1401

The ~100 fully sequenced eubacterial and archaeal genomes contain between 350 and 6000 genes, packed into 0.6 to 7.6 megabases (Mb) (1). Whereas some unicellular eukaryotes have genomes well within the range of these prokaryotes (such as 2000 genes in 2.9 Mb for the parasitic microsporidian Encephalitozoon cuniculi), all well-characterized genomes of multicellular animals and plants contain more than 13,000 genes in at least 100 Mb. The amount of DNA associated with just 30 human genes is equivalent to the entire genome size of an average prokaryote. Accompanying the increase in gene number in multicellular species is an expansion in the size and number of intragenic spacers (introns) and a dramatic proliferation of mobile genetic elements.

It remains unclear whether the expansions of genome size and complexity during eukaryotic evolution were essential for adaptive phenotypic diversification. After all, there are many ways to generate multiple functions from individual genes, such as tissue-specific gene regulation, alternative splicing, and RNA editing. In addition, the millions of mobile elements in the human genome and the massive increase in the average intron size in some multicellular eukaryotes have no obvious advantages. Finally, given that some prokaryotes are capable of cell differentiation, have linear chromosomes, and in rare cases have nuclear membranes, it is unclear whether the relatively simple genomes of microbes are merely reflections of unusual physiological constraints. Any general theory of genomic architecture evolution must account for the peculiar molecular attributes of various genetic elements, in addition to being compatible with the principles of population genetics. We argue here that the transitions from prokaryotes to unicellular eukaryotes to multicellular eukaryotes are associated with orders-of-magnitude reductions in population size; by magnifying the power of random genetic drift, reduced population size provides a permissive environment for the proliferation of various genomic features that would otherwise be eliminated by purifying selection.

Direct counts from multicellular and unicellular eukaryotes consistently show an inverse relationship between population density per unit of area and average individual body mass within a species (2–5). Such scaling need not reflect the pattern for total population size, given that it does not account for total species ranges. Moreover, the total abundance of a species need not reflect the more evolutionarily relevant genetic effective population size (N_e), which determines the degree to which gene frequencies are faithfully transmitted across generations. For example, a large population can behave genetically like a small one if a minor fraction of individuals contribute to the reproductive pool or if beneficial chromosomal segments periodically sweep through the population.

Insight into long-term effective population sizes can be acquired from the nucleotide variation at silent sites in protein-coding genes (i.e., sites at which a nucleotide substitution leaves the encoded amino acid unchanged). The rate of introduction of new variation per site in two randomly compared alleles is 2u (twice the mutation rate per nucleotide), whereas the expected rate of loss of variation from neutral sites is 1/(2N_e) in a randomly mating diploid population. At equilibrium, the average number of nucleotide substitutions at neutral sites is 4N_e u, with slight modifications required for other modes of inheritance (1). Thus, levels of silent-site variation among random alleles within a species provide an estimate of the composite parameter N_e u.

In a broad phylogenetic sense, there is an inverse relationship between organism size and N_e u. Proceeding from top to bottom of Fig. 1A, with two exceptions (Streptococcus pyogenes and Pseudomonas aeruginosa), all surveyed prokaryotes have N_e u > 0.025, whereas, with the exception of the malarial parasite Plasmodium falciparum and the ciliate Tetrahymena thermophila, the physically larger unicellular eukaryotes have 0.0035 < N_e u < 0.025. For the still larger vascular plants and invertebrates, 0.00077 < N_e u < 0.0037, whereas for vertebrates, 0.00027 < N_e u < 0.0010. N_e can be disentangled from u by noting that the mutation rate per base per cell division ranges from 5 × 10^−11 to 5 × 10^−10, with an average value of ~2.3 × 10^−10 (6). This implies that N_e is generally greater than 10^8 for prokaryotes and often in the range of 10^7 to 10^8 for unicellular eukaryotes. The number of germline cell divisions per generation is ~10 in nematodes and ~25 in flies (6), implying that N_e is in the range of ~10^5 to 10^6 for invertebrates; the number of germline cell divisions in vertebrates is ~100 (6), implying that N_e is on the order of 10^4 to 10^5.

These results probably underestimate the disparity in N_e among unicellular and multicellular species for two reasons. First, selectively driven codon bias can reduce silent-site variation below the neutral expectation, and any such bias would be greater for large populations, where selection is more efficient (7). Second, the majority of unicellular species in Fig. 1 are pathogens, and their genetic effective sizes may be highly influenced by that of their larger host, as in the case of the human malarial parasite Plasmodium falciparum. Thus, although the preceding calculations are approximations and the scaling of estimated N_e with actual population size is less than linear (8), the power of random genetic drift appears to vary by several orders of magnitude between the smallest unicellular and largest multicellular species.

The above estimates apply only to the past ~4N_e generations for each taxon, whereas many of the gross features of genomes must have emerged over much longer time scales. However, the strong negative relationship between genome size and estimates of recent N_e u (Fig. 1B) is consistent with the idea that these estimates also reflect longer term conditions, with individual taxa experiencing temporal fluctuations around the predicted values. Moreover, the continuity of this relationship between prokaryotes and eukaryotes suggests that the cellular changes associated with the prokaryote-eukaryote transition are not major determinants of genome size and complexity.

The number of functioning genes within a genome reflects the long-term stochastic interplay between gene origin by various duplication mechanisms and gene loss by mutational silencing, which must be reflected in the smaller genomes of unicellular species relative to multicellular species. To estimate these rates, we have introduced evolutionary demographic techniques that use the divergence of silent sites as a relative measure of the age of a duplicate pair (9, 10). The age distribution of all duplicates within a completely sequenced genome is typically L-shaped, suggestive of a steady-state stochastic birth-death process, from which the rate of birth and loss of duplicate genes can be estimated (1, 9, 10).

Although fairly large standard errors are associated with species-specific estimates, the average rates of gene duplication for unicellular and multicellular (metazoan) eukaryotes are not significantly different on the time scale of silent-site divergence (Table 1). Only downwardly biased estimates of the birth rates of prokaryotic genes can be obtained (1), but the averages based on 73 taxa are still ~50% of the values for eukaryotes. Thus, over a wide phylogenetic range, chromosomal events that result in gene duplications appear to occur at rates that are roughly proportional to those of mutations causing nucleotide substitutions, perhaps because both types of events reflect activities during replication.

In contrast, on the scale of silent-site divergence, duplicate genes are lost much more slowly in multicellular than in unicellular eukaryotes (Table 1), and there is a clear tendency for the half-life of duplicate genes to increase with genome size, again with a continuous transition between prokaryotes and eukaryotes (Fig. 2). Thus, by correlation, the ability of a newly arisen gene to survive the accumulation of mutations increases with decreasing effective population size. Because deleterious mutations are expected to accrue more easily in small populations, this counterintuitive result sheds some light on the processes that may be influencing the longevity of duplicate genes.

Preservation of both members of a duplicate pair can be promoted when one member of the pair acquires a beneficial mutation at the expense of an original essential function retained by the other (neofunctionalization)

Fig. 1. (A) Estimates of the composite parameter N_e u for a phylogenetically diverse assemblage of species. (B) The relationship between estimated N_e u, total gene number, and genome size. Data for prokaryotes are plotted in blue. The log-log regression of N_e u versus genome size is highly significant, with an intercept of –1.30 ± 0.40, a slope of –0.55 ± 0.07, and r^2 = 0.659, df = 28 (1). The number of species plotted differs between graphs because genome structure information is not available for all species with N_e u estimates.

Fig. 2. The average half-lives of duplicate genes, defined as –ln(0.5)/d, in eukaryotic (solid circles) and prokaryotic (open circles) species on the time scale of divergence of silent substitutions. The log-log regression is significant at the 5% level, with an intercept of –1.76 ± 0.20, a slope of 0.20 ± 0.05, and r^2 = 0.548, df = 12 (1).
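
The quantities defined in the two captions above are easy to check numerically. The sketch below is an illustration, not the authors' code. It evaluates the Fig. 1B regression line and the Fig. 2 half-life formula, assuming base-10 logarithms and genome size in Mb (the captions state neither). The function names are hypothetical, and the loss rates d passed to `half_life` are the Table 1 averages.

```python
import math

# Fig. 1B regression coefficients as quoted in the caption.
# ASSUMPTION: base-10 logs and genome size in Mb; the caption states neither.
FIG1B_INTERCEPT = -1.30
FIG1B_SLOPE = -0.55


def predicted_neu(genome_size_mb: float) -> float:
    """Point prediction of N_e*u from genome size on the Fig. 1B regression line."""
    log_neu = FIG1B_INTERCEPT + FIG1B_SLOPE * math.log10(genome_size_mb)
    return 10.0 ** log_neu


def half_life(d: float) -> float:
    """Half-life of a duplicate gene, -ln(0.5)/d, in units of silent-site divergence."""
    return -math.log(0.5) / d


if __name__ == "__main__":
    for size in (1, 10, 100, 1000):
        print(f"{size:>5} Mb -> predicted N_e*u ~ {predicted_neu(size):.4f}")
    # Average loss rates d from Table 1.
    for label, d in (("unicellular eukaryotes", 43.26), ("metazoans", 17.80)):
        print(f"{label}: half-life ~ {half_life(d):.4f}")
```

Under these assumptions a 10 Mb genome maps to N_e u of roughly 0.014, close to the ~0.015 intron threshold that the text obtains by transforming scales from Fig. 1B, which lends some support to the assumed units. The Table 1 loss rates give half-lives of about 0.016 and 0.039 substitutions per silent site for unicellular eukaryotes and metazoans, respectively.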

(11). Because degenerative mutations greatly outnumber beneficial mutations, the probability of preservation by rare neofunctionalizing mutations is diminished in small populations. In contrast, preservation by subfunctionalization occurs when both members of a pair are partially degraded by mutations to the extent that their joint expression is necessary to fulfill the essential functions of the ancestral locus (12, 13). The probability of subfunctionalization approaches zero in large populations because the long time to fixation magnifies the chances that secondary mutations will completely incapacitate one copy before joint preservation is complete and because of the weak mutational disadvantage of harboring two coding regions (14). The longer retention time of duplicate genes in small populations is inconsistent with the predictions for the neofunctionalization model and opposite to the expected pattern if degenerative mutations only lead to complete nonfunctionalization of duplicate genes (15), but it is entirely compatible with expectations under the subfunctionalization model. Thus, although the evolution of multicellularity undoubtedly posed some new selective challenges that were met through neofunctionalization, much of the increase in gene number in multicellular species may not have been driven by adaptive processes, but rather as a passive response to a genetic environment (reduced population size) more conducive to duplicate-gene preservation by subfunctionalization.

Spliceosomal introns are noncoding stretches of RNA that are excised from the transcripts of their host protein-coding genes. The mechanisms by which introns originate remain a mystery, but their broad phylogenetic distribution implies that they and the spliceosome that processes them were present in the stem eukaryote (16). The average number of introns per gene in most multicellular species is between four and seven, whereas the average number for most unicellular eukaryotes is less than two. Only two spliceosomal introns have been found in the kinetoplastid Trypanosoma (17), and only a single one has been found in the diplomonad Giardia (18). Understanding this uneven phylogenetic distribution of introns is a major challenge for evolutionary genomics.

Although natural selection may eventually exploit introns for adaptive purposes (16), newly established introns are expected to impose a selective disadvantage (s) on their host genes by increasing the mutation rate to defective alleles (19). Theory suggests that there is a threshold value of N_e s ≈ 1.0, below which newly arisen introns can freely drift to fixation and above which intron colonization and maintenance are exceedingly improbable. Qualitatively consistent with this hypothesis is a threshold genome size of ~10 Mb, below which introns are very rare and above which they approach an asymptote of about seven per gene (Fig. 3). By transforming scales from Fig. 1B, we found that the maximum value of N_e u that is permissive to intron proliferation is ~0.015. How does this compare with the theoretical expectation of N_e s ≈ 1.0?

The minimum selective disadvantage of an allele that contains a new intron is about equal to the excess-mutation rate to defective alleles caused by alterations at sites involved in splicing. The number of base positions (in the intron and surrounding exons) with nucleotide identities that are essential for proper splicing is unlikely to be less than 10 and is plausibly as high as 30 (19). Thus, the net selective disadvantage of an intron-containing allele is at least 10 to 30 times as large as u, not including insertion and deletion mutations, which minimally occur at ~10 to 60% of the rate of substitutions per base (20, 21). Because they can alter the spatial configuration of key splice-site signatures, the number of insertion and deletion events affecting proper splicing must exceed that for substitutions. Thus, the observed threshold value of N_e u ≈ 0.015 for intron proliferation is reasonably compatible with the theoretical N_e s ≈ 1.0 threshold.

The rather abrupt increase in the average intron number per gene with increasing genome size is accompanied by a more continuous increase in the average intron length (Fig. 3), which has been observed previously in more phylogenetically restricted surveys (22, 23). The inverse scaling of the average intron length with N_e u [slope of the logarithmic regression (±SEM) on N_e u = –0.67 ± 0.22] is consistent with the hypothesis that population-size reduction diminishes the efficiency of selection against mildly deleterious insertions into introns. Within genomes, the average intron size increases in regions of low recombination (24, 25), which may also be a consequence of localized reductions in effective population size resulting from selective sweeps and/or background selection (19, 25). An alternative hypothesis that intron size acts as a recombination modifier to reduce selective interference among linked sites (24) is not easily reconciled with the reduction of intron size and number in compact genomes.

Mobile genetic elements are self-contained genomic units capable of proliferating within their host genomes (26, 27). Hundreds of families of these elements exist within eukaryotes, and almost all of them fit into three major functional categories: DNA-based (cut-and-paste) transposons and the long-terminal repeat (LTR) and non-LTR classes of RNA-dependent (copy-and-paste) retrotransposons. The vast majority of mobile elements are indiscriminate with respect to insertion sites, and as a consequence, their activities often have deleterious effects on the host genome. A broad range of selection coefficients must be associated with insertions in coding regions, regulatory regions, and intergenic spacers, and because mutations with negative fitness consequences >> 1/(2N_e) are efficiently eliminated by selection, the fraction of mobile-element insertions capable of drifting to fixation must decline with increasing N_e. Because mobile elements gradually acquire inactivating mutations, the long-term survival of an element family requires the average autonomous member to spawn at least one successful insertion in its lifetime. Therefore, there must be a critical value of N_e above which a mobile-element family cannot maintain itself within a host species.

Consistent with theoretical expectations, all three classes of mobile elements appear to have a threshold genome size below which mobile elements are unable to establish, an intermediate range in which only a fraction of species harbor them, and an upper threshold (~100 Mb) above which all species are infected (1) (Fig. 4). By extrapolation from the mutation rate cited above, the critical effective population size above which a unicellular eukaryote population appears to be immune to retrotransposon proliferation is ~7 × 10^7, whereas for DNA-based transposons it is ~2 × 10^7.

The influence of effective population size and mildly deleterious mutation on patterns of gene-sequence evolution has long been appreciated (28, 29), and the preceding results suggest that these forces are central determinants of the types of genomic evolution that are permissible in various phylogenetic lineages. If this hypothesis is correct, then many of the genomic attributes of multicellular organisms did not arise in direct response to selection for new cell types and functions but were indirect consequences of reduced effective population sizes that accompanied an increase in organism size. Although the mechanisms responsible for the initial restructuring of eukaryotic genomes may have been nonadaptive in nature, this would not preclude the secondary deployment of the resultant genomic complexities in adaptive phenotypic evolution. For example, having colonized most protein-coding genes in some species, introns sustained a reliable mechanism for alternative splicing (30), and in at least some lineages, they provide an orientation mechanism for the surveillance of defective mRNAs (16). In addition, by converting single genes with multiple functions into multiple genes with fewer functions, subfunctionalization provides a mechanism for eliminating pleiotropic constraints on ancestral genes, thereby opening up previously inaccessible evolutionary pathways.

Because genomic infidelities associated with DNA replication are likely to generate observable genomic repatterning over a time scale that is on the order of tens of millions of years, a judicious use of experiments provided by nature will be necessary to test our hypothesis further. Although there is a general tendency for the genome sizes of multicellular species to exceed those of unicellular species, the range in genome size can be up to three orders of magnitude among species with similar levels of cellular and developmental complexity (31). In the very near future, we will experience an enormous proliferation of phylogenetically well-distributed genomic data, including those from unculturable organisms. The exceptional species within lineages should provide ideal substrate for testing the ideas outlined here. For example, one general prediction is that carnivores should exhibit the genomic hallmarks of population-size reduction compared with related herbivores. As a general rule, total biomass declines ~10% with increasing trophic level, and because average body size increases at higher levels in the food chain, total population size must decline even more sharply, which is consistent with the substantially lower estimates of N_e for carnivores than for herbivores derived from molecular surveys (8). If the theory that we present is correct, and should free-living prokaryotes with sufficiently small long-term N_e be found, we predict that they will harbor many of the same genomic changes that we have described here for eukaryotes.

Fig. 3. The relationship between average intron size (solid circles) in base pairs (bp) and intron number (open circles) and genome size. The regression for intron size is highly significant, with an intercept of 1.41 ± 0.36, a slope of 0.51 ± 0.10, and r^2 = 0.641, df = 16 (1).

Table 1. Average rates of origin (B) and loss (d) of duplicate genes (±SEM). The former is defined as the probability of a gene duplicating over the time span required for a silent-site divergence of 1%. The latter is the exponential rate of loss, such that D = 1 – e^(–0.01d), or ~0.01d for small d, is the probability of loss by the time silent sites have diverged by 1%, where e is the base of the natural logarithm (1). The analyses are based on gene families containing five or fewer members, and species-specific estimates can be found in the supporting online material (1).

Species                   B                      d
Unicellular eukaryotes    0.00405 ± 0.00130      43.26 ± 10.15
Metazoan species          0.00373 ± 0.00073      17.80 ± 2.52
Prokaryotes               0.00238 ± 0.00038      –

References and Notes
1. Materials and methods are available as supporting material on Science Online.
2. P. E. Schmid, M. Tokeshi, J. M. Schmid-Araya, Science 289, 1557 (2000).
3. B. J. Enquist, K. J. Niklas, Nature 410, 655 (2001).
4. C. Carbone, J. L. Gittleman, Science 295, 2273 (2002).
5. B. J. Finlay, Science 296, 1061 (2002).
6. J. W. Drake, B. Charlesworth, D. Charlesworth, J. F. Crow, Genetics 148, 1667 (1998).
7. H. Akashi, Curr. Opin. Genet. Dev. 11, 660 (2001).
8. J. H. Gillespie, The Causes of Molecular Evolution (Oxford Univ. Press, New York, 1991).
9. M. Lynch, J. S. Conery, Science 290, 1151 (2000).
10. M. Lynch, J. S. Conery, J. Struct. Funct. Genomics 3, 35 (2003).
11. S. Ohno, Evolution by Gene Duplication (Springer-Verlag, Heidelberg, Germany, 1970).
12. A. Force et al., Genetics 151, 1531 (1999).
13. A. Stoltzfus, J. Mol. Evol. 49, 169 (1999).
14. M. Lynch, M. O’Hely, B. Walsh, A. Force, Genetics 159, 1789 (2001).
15. M. Lynch, A. Force, Genetics 154, 459 (2000).
16. M. Lynch, A. O. Richardson, Curr. Opin. Genet. Dev. 12, 701 (2002).
17. G. Mair et al., RNA 6, 163 (2000).
18. J. E. Nixon et al., Proc. Natl. Acad. Sci. U.S.A. 99, 3701 (2002).
19. M. Lynch, Proc. Natl. Acad. Sci. U.S.A. 99, 6118 (2002).
20. D. A. Petrov, D. L. Hartl, Mol. Biol. Evol. 15, 293 (1998).
21. D. R. Denver, K. Morris, M. Lynch, L. L. Vassilieva, W. K. Thomas, Science 289, 2342 (2000).
22. A. E. Vinogradov, J. Mol. Evol. 49, 376 (1999).
23. M. Deutsch, M. Long, Nucleic Acids Res. 27, 3219 (1999).
24. J. M. Comeron, M. Kreitman, Genetics 156, 1175 (2000).
25. A. B. Carvalho, A. G. Clark, Nature 401, 344 (1999).
26. N. L. Craig, R. Craigie, M. Gellert, A. M. Lambowitz, Eds., Mobile DNA II (Am. Soc. Microbiol. Press, Washington, DC, 2002).
27. B. Charlesworth, in Population Genetics and Molecular Evolution, T. Ohta, K. Aoki, Eds. (Japan Sci. Soc. Press, Tokyo, 1985), pp. 213–232.
28. T. Ohta, Nature 246, 96 (1973).
29. M. Kimura, The Neutral Theory of Molecular Evolution (Cambridge Univ. Press, Cambridge, UK, 1983).
30. B. R. Graveley, Trends Genet. 17, 100 (2001).
31. T. R. Gregory, Biol. Rev. 76, 65 (2001).
32. Supported by grants from the NIH and the NSF (to M.L.). We thank J. Gillespie, E. Koonin, and M. Wade for helpful comments.

Supporting Online Material
www.sciencemag.org/cgi/content/full/302/5649/1401/DC1
SOM Text
Table S1

18 July 2003; accepted 8 October 2003

Fig. 4. Expansion of the three major classes of mobile genetic elements with genome size. Species for which the elements are entirely absent are plotted on the x axis but not included in the regressions.
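
The back-of-envelope conversions used in the text can be reproduced in a few lines. This is a minimal sketch under stated assumptions rather than the authors' procedure. It takes the per-generation mutation rate to be the average per-division rate (~2.3 × 10^−10) times the number of germline cell divisions per generation, assumes one division per generation for unicellular species, and divides the quoted N_e u bounds by that rate. It also converts the N_e s ≈ 1 intron threshold into an implied N_e u for s equal to 10u or 30u. All function names are hypothetical.

```python
U_PER_DIVISION = 2.3e-10  # average mutation rate per base per cell division (ref. 6)


def ne_from_neu(neu: float, divisions_per_generation: float = 1.0) -> float:
    """Effective population size implied by a value of N_e*u.

    ASSUMPTION: u per generation = U_PER_DIVISION * germline divisions per generation.
    """
    return neu / (U_PER_DIVISION * divisions_per_generation)


def neu_at_intron_threshold(essential_sites: float) -> float:
    """N_e*u at which N_e*s = 1 when s = essential_sites * u (substitutions only)."""
    return 1.0 / essential_sites


# (label, lower N_e*u, upper N_e*u, germline divisions per generation)
GROUPS = [
    ("unicellular eukaryotes", 0.0035, 0.025, 1),
    ("invertebrates, nematode-like", 0.00077, 0.0037, 10),
    ("invertebrates, fly-like", 0.00077, 0.0037, 25),
    ("vertebrates", 0.00027, 0.0010, 100),
]

if __name__ == "__main__":
    print(f"prokaryotes: N_e > {ne_from_neu(0.025):.1e}")
    for label, lo, hi, div in GROUPS:
        print(f"{label}: {ne_from_neu(lo, div):.1e} to {ne_from_neu(hi, div):.1e}")
    for n in (10, 30):
        print(f"s = {n}u -> threshold N_e*u ~ {neu_at_intron_threshold(n):.3f}")
```

With these inputs the sketch returns N_e above about 10^8 for prokaryotes, roughly 10^7 to 10^8 for unicellular eukaryotes, 10^5 to 10^6 for invertebrates, and 10^4 to 10^5 for vertebrates, in line with the ranges quoted in the text. For substitutions alone, s between 10u and 30u puts the N_e s ≈ 1 threshold at N_e u between about 0.03 and 0.1. The insertion and deletion mutations that the text leaves out of that count would raise s and push the implied threshold lower, toward the observed ~0.015.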
