MM Pe 291205


The neutral theory of molecular evolution

- the neutral theory
- detecting natural selection
- exercises

Objectives

1 - learn about the neutral theory
2 - be able to detect natural selection at the molecular level

Molecular evolution and Darwin


Two theses in Darwin's Origin of Species
1 - organisms descend with modification from common ancestors
    phylogenetics - pattern
2 - the mechanism for this modification is natural selection
    molecular basis of adaptation - process


The field of molecular evolution has been dominated by phylogenetics and molecular systematics. These endeavors have been extremely successful in supporting and elucidating the dynamics of point #1 above. Molecular evolutionists have been relatively less successful (Sharp 1997) at uncovering evidence detailing the mechanism of descent with modification - the molecular basis of adaptation. Recent years have seen tremendous strides in this area (Hughes 1999) but it remains to a great extent uncharted territory.

Population genetics
- interested in genetic variation - understand its generation and maintenance
- initially (until 1960s) only able to study it indirectly - phenotype
- paucity of data led to controversy on the extent of genetic variation
  - Classic school (Muller) - very little genetic variation, cost associated with natural selection
  - Balance school (Dobzhansky) - lots of genetic variation maintained by natural selection
- debate settled with the advent of molecular (direct) approaches to the study of genetic variation - electrophoresis, sequencing
- tremendous amount of genetic variation exists
- initial thought was that this variation was maintained by natural selection
  - accumulation of adaptively advantageous variants
  - heterozygote advantage or heterosis

A new (neutralist) explanation for the high levels of molecular variation


Motoo Kimura (1968) Evolutionary rate at the molecular level. Nature 217: 624
- electrophoresis studies (Lewontin & Hubby) and sequence comparisons (Pauling & Zuckerkandl) reveal high levels of molecular variation
- Kimura reasoned that variation was too high and accumulated too rapidly to be explained by selection
- this is the so-called cost of selection argument (see appendix slide # 9)
- concluded that these observed differences are selectively neutral - that is, they do not confer any selective advantage or disadvantage on the organisms that bear them (because they do not alter the function of the protein they encode)

J.L. King & T.H. Jukes (1969) Non-Darwinian evolution. Science 164: 788
- note that many genetic changes have no effect on organismic fitness - they are neutral
- natural selection cannot act on changes that it cannot perceive
- marshal biochemical evidence in support of these assertions, e.g. synonymous substitutions, functionally equivalent cytochrome c variants, rapid evolution of fibrinopeptides (removed from functional fibrinogen)

The neutral theory


- change is too rapid to be explained by natural selection, therefore most of the changes observed are selectively neutral - no effect on fitness
- Kimura reasoned that the majority of both polymorphism (allelic variation within populations) and substitution (fixed differences between populations) results from fixation of selectively neutral variants by random genetic drift
  - the main role of natural selection is elimination of deleterious variants (maintenance of the status quo)
  - molecular evolution is conservative
  - adaptively favorable mutations fixed by natural selection are a small minority of all nucleotide substitutions
- huge debate ensued between selectionists (who believe that extensive variation is a product of natural selection) and neutralists (who believe that variation is a product of random fixation of neutral variants)

Predictions of the neutral theory


- neutral theory makes explicit quantitative predictions about levels of genetic variation - the null hypothesis of molecular evolution
- most important for our purposes: functionally important parts of a molecule will change more slowly than functionally unimportant parts

"Those mutant substitutions that disrupt less the existing structure and function of a molecule (conservative substitutions) occur more frequently in evolution than more disruptive ones." Kimura and Ohta 1974

- absolutely essential concept in modern molecular biology: the basis of programs to align sequences and make functional predictions
- challenge to the Darwinian view: if selection is the driving force in evolution, the rate of evolution should be most rapid where selection operates most - in the functionally important parts of molecules (opposite to the neutral view)

Maximum evolutionary rate & selection versus neutrality


do relative rates of change better fit the selectionist or the neutralist prediction?
overwhelming support for the neutralist prediction:
1 - synonymous versus nonsynonymous substitution rates (Kimura 1977, Jukes 1978)
2 - accelerated rate of pseudogene evolution (Li et al 1981)

- synonymous subs - do not change the encoded amino acid
- nonsynonymous subs - do change the encoded amino acid

Seq 1:  GAT AAC ATC CAA GGA ATA ACT GCA ATC
Seq 2:  GAC AAC ATC CAA GGT ATC ACG GCT ATC
        Asp Asn Ile Gln Gly Ile Thr Ala Ile

in virtually every gene ever studied, synonymous sites change at a higher rate than nonsynonymous sites
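The two example sequences can be checked codon by codon. The sketch below is only an illustration, not the method used by DnaSP or Mega: it counts codons that differ and labels each difference as synonymous or nonsynonymous using the standard genetic code. It does not compute Ks or Ka, which also require normalizing by the number of synonymous and nonsynonymous sites and correcting for multiple hits. The function name `classify_codon_differences` is made up for this example.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_differences(seq1, seq2):
    """Count differing codons that are synonymous vs. nonsynonymous.

    Both inputs are space-separated codon strings of equal length.
    Returns (synonymous_count, nonsynonymous_count).
    """
    synonymous = 0
    nonsynonymous = 0
    for codon1, codon2 in zip(seq1.split(), seq2.split()):
        if codon1 == codon2:
            continue  # identical codons carry no substitution
        if CODON_TABLE[codon1] == CODON_TABLE[codon2]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


seq1 = "GAT AAC ATC CAA GGA ATA ACT GCA ATC"
seq2 = "GAC AAC ATC CAA GGT ATC ACG GCT ATC"
print(classify_codon_differences(seq1, seq2))  # (5, 0)
```

All five differences between the two sequences are synonymous, which is why both translate to the same peptide.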

Detection of natural selection using synonymous & non-synonymous substitution rates


Types of natural selection:
1 - purifying (negative) selection - removal of deleterious variants
2 - diversifying (positive) selection - fixation of adaptive variants

Types of substitution rates (for protein coding genes, i.e. codons):
1 - synonymous substitution rate (Ks or ds) - rate of substitution for DNA changes that do not change the encoded amino acids
2 - non-synonymous substitution rate (Ka or dn) - rate of substitution for DNA changes that do change the encoded amino acids

The relative levels of these rates indicate the mode of selection for a gene:

Neutral evolution (no selection):   Ks ≈ Ka
Purifying selection:                Ks >> Ka
Diversifying selection:             Ks << Ka

Exercises
Compare synonymous and nonsynonymous substitution rates for:
1 - the Drosophila alcohol dehydrogenase (Adh) gene - dros-adh.meg
2 - the human & mouse gene pair - mammal.meg

Determine the mode of selection acting on each based on these rates:
1 - load alignment into DnaSP
2 - assign coding region
3 - calculate Ks and Ka and compare (which is higher?)
4 - load alignment into Mega
5 - calculate ds and dn and compare (which is higher?)
6 - do statistical test for difference between ds and dn (see appendix II slide # 11)

Kimura's derivation of the neutral theory

noticed variation too high and change too rapid to be explained by natural selection
- based on amino acid sequence data (hemoglobin & cytochrome c), Kimura calculated an average of 1 aa substitution per 28 my in a 100 aa protein
- this is too high for natural selection, based on Haldane's concept of the cost of selection
- if only individuals with high fitness for a number of different traits survive, only a very small fraction of the population will remain
  e.g. moth melanism - 50% mortality due to bird predation
  - with simultaneous effects at 10 loci, 1 / 2^10 (1 out of 1,024) survive
  - population likely to go extinct before all 10 alleles are fixed
- Haldane calculated that 1 new allele per 300 generations can be substituted
- Kimura noted that substitutions at the molecular level in fact occur much more rapidly
  1 - 1 sub per 28 my per 100 aa
  2 - mammalian genome size 4 x 10^9 bp
  3 - 100 aa = 300 bp and 20% of nucleotide subs are synonymous, thus 1 aa sub ≈ 1.2 bp subs
  4 - time it would take for one nucleotide substitution to occur in the genome is: (28 x 10^6) / (4 x 10^9 / 300) / 1.2 = 1.8 years
- this is a much higher rate of substitution than 1 every 300 generations
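The arithmetic in this argument is easy to reproduce. The short sketch below uses only the numbers given on the slide; the variable names are chosen here for readability and are not from the original.

```python
# Cost of selection: 50% survival at each of 10 independent loci.
survival_per_locus = 0.5
loci = 10
surviving_fraction = survival_per_locus ** loci  # 1 / 2^10
print(f"surviving fraction: 1 in {1 / surviving_fraction:.0f}")

# Kimura's genome-wide substitution interval.
years_per_aa_sub = 28e6        # 1 aa substitution per 28 my in a 100 aa protein
genome_size_bp = 4e9           # mammalian genome size
bp_per_protein = 300           # 100 aa = 300 bp
bp_subs_per_aa_sub = 1.2       # allows for ~20% synonymous nucleotide subs

# Number of 300 bp "protein-sized" segments in the genome.
segments = genome_size_bp / bp_per_protein

# Years between successive nucleotide substitutions anywhere in the genome.
years_per_genome_sub = years_per_aa_sub / segments / bp_subs_per_aa_sub
print(f"one substitution every {years_per_genome_sub:.2f} years")
```

The result is about 1.75 years, which the slide rounds to 1.8, against Haldane's limit of one substitution per 300 generations.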

Statistical comparison of synonymous & non-synonymous substitution rates


Depending on the values observed, one may wish to test the following hypotheses:
- ds > dn
- ds = dn
- ds < dn

To do this use the normal deviate or Z test (see lecture 8 slide #6)
Z = difference between ds & dn divided by the standard error of the difference

difference:                        D = abs(ds - dn)
standard error of the difference:  sD = sqrt(se(ds)^2 + se(dn)^2)
test statistic:                    Z = D / sD

formula in MSExcel =abs(ds-dn)/sqrt(se(ds)^2+se(dn)^2)

Then use t-table with infinite degrees of freedom to evaluate P value
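The same test can be scripted instead of using a spreadsheet and a printed table. This is a minimal sketch: the function name `z_test` and the input values are made up for illustration and are not results from the exercise data sets. A t-distribution with infinite degrees of freedom is the standard normal, so the P value is taken from the normal distribution via `math.erfc`.

```python
import math


def z_test(ds, dn, se_ds, se_dn):
    """Z test for the difference between ds and dn.

    Returns (Z, two-sided P value under the standard normal).
    """
    d = abs(ds - dn)                          # D = abs(ds - dn)
    s_d = math.sqrt(se_ds ** 2 + se_dn ** 2)  # sD = sqrt(se(ds)^2 + se(dn)^2)
    z = d / s_d                               # Z = D / sD
    p_two_sided = math.erfc(z / math.sqrt(2))
    return z, p_two_sided


# Hypothetical values, for illustration only.
z, p = z_test(ds=0.5, dn=0.1, se_ds=0.1, se_dn=0.1)
print(f"Z = {z:.3f}, two-sided P = {p:.4f}")
```

For a one-sided hypothesis (ds > dn or ds < dn), halve the two-sided P value after checking that the difference is in the hypothesized direction.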
