
Evolution of anatomy and gene control

2008, Nature


HORIZONS | Vol 451 | 7 February 2008

Georgy Koentges

Evo-devo meets systems biology.

We’ve probably all heard the story of the six wise men left in a darkened room with an elephant and asked to say what it was. Each felt a different piece of the elephant, compared it to something else and interpreted the whole as a simple extension of the part he described. There was no common observation ground, so nobody recognized that each description just captured some significant aspects of the same object1. The delightful Persian miniature of a composite elephant at the Aga Khan Trust for Culture in Geneva (Fig. 1) conveys such a concept. We understand that living things, such as the elephant, are complex and, fortunately, there is a bit more light around these days. Palaeontology, developmental biology, gene regulation and systems biology gather around the beast and use their tools to illuminate its parts. But we have to figure out how the parts relate to each other, and how we can communicate about them appropriately.

Since Darwin we know that we must explain the elephant not only in mechanistic terms (of mutation, selection and adaptation on the population level) but also in historical terms, as ‘descent with modification’, evolution in phylogeny. Molecular changes hundreds of millions of years ago constrain the possibility of change here and now. Not everything is possible, and evolutionary history is as much a story of constraint as functionality. Leonardo’s ‘flying machines’ didn’t just fail because bodies of a human size and weight fall under physical scaling laws limiting how big muscles could become. The evolutionary history that led to our present body size also stops us acquiring wings, either now or any time soon. We know that we are constrained by genetic baggage, but the molecular causes have remained elusive.
They vacillate between being neutral and adaptive at different times in our phylogenetic history2, so purely functional studies offer little prospect of finding them. Comparative approaches of molecular function can reveal them, as an elegant study on the darwinian evolution of ligand–receptor interactions has recently illustrated3. Because phylogeny is a concatenation of developmental processes in populations, all heritable morphological changes derive from developmental changes in molecular control hierarchies and networks4. The daunting task of the field known as evo-devo is to map structural diversity onto the underlying gene-regulatory diversity and dynamics.

Figure 1 | Illuminating the elephant of complexity. This composite elephant is from a Persian painting from around 1600. (Courtesy of the Aga Khan Trust for Culture, Geneva.)

Developmental regulatory changes have affected patterning, differentiation and growth. Patterning describes the highly regulated, three-dimensional self-organization of groups of embryonic cells towards structures we can see. Differentiation is the allocation of cells to particular fates, such as muscles and bone. Differential growth implies that certain (molecularly defined) groups of cells grow more rapidly than other groups within an organism. Developmental genetics has shown that these three activities are often linked in complex ways but are separately controlled molecularly in space and time. The embryonic locations of such linkages are genetically defined cell lineages, where the molecular actors (the genetic control networks) must have changed their plots to create phenotypic diversity. Historians of life are interested in the specific succession of character changes as they happened over evolutionary time.
By joining forces with mechanistic disciplines, they can learn how to read visible characters as epiphenomena of the most information-rich units within biological structures, and apply this knowledge to understand fossil anatomy. Such information-carrying units are not accessible through intuition: only genetic experiments allow us to see the elephant from the inside. Comparative genetic analysis tells us that the elephant consists of many parts that are also used by other organisms for both similar and different purposes, and that the differences between the parts’ connections contain valuable information. These ‘atoms’ of biological information are hard to measure. Each scale of organization requires different descriptors, and it is difficult to conceptualize how single-molecule dynamics on two strings of DNA (in a diploid organism) can cause major structural changes over historical timescales. Systems biology is starting to make it easier for those speaking the languages of DNA and mathematics to interact, under the auspices of massively parallel measurement platforms, comparative genomics, graphical models5 and dynamic systems theory6.

Here I will outline how introducing historical information into the mechanistic fields of developmental biology, gene regulation and systems biology can stimulate useful new dialogues and exchange of expertise. These disciplines can teach, and constrain, each other about where and how to look at the unique features of living information-carriers and build common observation platforms. By applying genomic width, mechanistic precision and historical depth, such approaches will help us describe our ideas in less-intuitive but mathematically sound ways, in a language that machines can process. This may enable us to slowly make out the immensely rich historical contours of the elephant of complexity as it emerges from the darkness of time.

Charting metazoan history

Palaeontology is equipped with powerful statistical tools to reconstruct phylogenies, and a sophisticated armoury of non-invasive structural investigation techniques to trace the succession of structural changes as they happened during history. Phylogenetics allows us to reconstruct trees of ancestral relationships using the rules of parsimony (favouring the fewest evolutionary changes) and synapomorphy (by using unique character states shared by two groups assumed to be inherited from a common ancestor)7. Other characters, which are not used for phylogeny reconstruction, can then be charted onto these trees, revealing the direction of changes across vast historical timescales (Fig. 2). Phylogenies put extant species in the appropriate historical context and reveal which characters studied in live organisms are really ‘ancestral’ or have significance for a larger taxonomic group. Thus insights from molecular and developmental biology rely on a foundation established by morphological and palaeontological studies. Fossil studies are in turn starting to benefit from knowledge of the molecular players behind the characters. Palaeontology reveals combinations of fossilized characters that are no longer seen in extant organisms, but that shed light on the evolution of development and complexity. For example, the evolutionary emergence of the vertebrate skeleton was recently redefined in a profound and nonintuitive way8 (A in Fig. 2).

Figure 2 | Character evolution and gene expression in fossils. Soft-tissue and molecular characteristics of fossils can be inferred from their respective phylogenetic position by using the ‘extant phylogenetic bracket’9. (Panel labels: Observation; Inference; presence of bone; inclusion of fossil morphologies; shared embryonic and soft-tissue characters, such as expression of patterning genes (for example Hox genes in branchial arches), cell population boundaries and muscle patterns. Taxa shown: lamprey, osteostracans, placoderms, shark, bony fish. Annotations at nodes A to D: bone retained in most taxa crownwards of B; bone lost at C; leads to dating of character acquisitions and polarity; ground state for all extant and fossil taxa descending from stem group D; extant phylogenetic bracket.)

An anatomist comparing living sharks with bony fishes would conclude that a dermal bone cover is a feature unique to bony fishes. A palaeontologist, in contrast, discovers that the common ancestors of sharks and bony fishes had extensive, bony body armour (B in Fig. 2). In this light, the absence of such armour in sharks must be reinterpreted as a secondary loss (C in Fig. 2), with its presence in bony fish being the retention of an ancestral state. This purely palaeontological discovery has dramatic consequences for any line of enquiry that a developmental biologist who studies the molecular evolution of skeletogenesis might wish to initiate. A simple genetic comparison of skeletogenesis between extant sharks and bony fishes would not be meaningful, at least in terms of understanding the sequence of historical events that led towards having (or not having) a skeleton, as secondary bone loss might not involve the same molecules responsible for acquiring skeletons in the first place.
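
The shark and bony-fish argument can be made concrete with a small parsimony reconstruction. The sketch below uses Fitch's algorithm, a standard small-parsimony method that is swapped in here for illustration and is not claimed to be the method behind ref. 8. The ladder-shaped trees, the 0/1 coding of dermal bone and the rule that an ambiguous root is resolved as bone-less are all assumptions made for the example.

```python
# Fitch parsimony on a binary character (dermal bone: 1 = present, 0 = absent).
# Tree shapes and character codings are illustrative assumptions.

def leaves(tree):
    """Return the leaf names below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return [tree]
    return [name for sub in tree for name in leaves(sub)]

def fitch_sets(tree, states, sets):
    """Bottom-up pass: record each node's candidate states, return the change count."""
    if isinstance(tree, str):
        sets[tree] = {states[tree]}
        return 0
    left, right = tree
    cost = fitch_sets(left, states, sets) + fitch_sets(right, states, sets)
    shared = sets[left] & sets[right]
    sets[tree] = shared if shared else sets[left] | sets[right]
    return cost + (0 if shared else 1)

def fitch_changes(tree, sets, parent_state=None, changes=None):
    """Top-down pass: pick one most-parsimonious history and list its state changes."""
    if changes is None:
        changes = []
    # Keep the parent's state where allowed; otherwise (and at the root) take the
    # lowest candidate, so an ambiguous root is assumed to lack bone.
    state = parent_state if parent_state in sets[tree] else min(sets[tree])
    if parent_state is not None and state != parent_state:
        changes.append(("+".join(leaves(tree)), parent_state, state))
    if not isinstance(tree, str):
        for sub in tree:
            fitch_changes(sub, sets, state, changes)
    return changes

def reconstruct(tree, states):
    """Return (minimum number of changes, list of (clade, old state, new state))."""
    sets = {}
    cost = fitch_sets(tree, states, sets)
    return cost, fitch_changes(tree, sets)

bone = {"Lamprey": 0, "Osteostracans": 1, "Placoderms": 1, "Shark": 0, "Bony fish": 1}
extant_only = ("Lamprey", ("Shark", "Bony fish"))
with_fossils = ("Lamprey", ("Osteostracans", ("Placoderms", ("Shark", "Bony fish"))))

print(reconstruct(extant_only, bone))
print(reconstruct(with_fossils, bone))
```

With extant taxa only, the single inferred change is a gain on the bony-fish branch. Once the two fossil groups are included, the same procedure places the gain on the branch leading to osteostracans and everything crownwards, and reports a loss on the shark branch, which is the reinterpretation described above. An equally parsimonious history with bone at the root and a loss in lampreys also exists; the root rule in the sketch simply picks one of the two.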
Historical information is thus indispensable for refocusing the efforts of researchers who could otherwise go astray by trying to find molecular explanations for evolutionary processes that have not taken place but were incorrectly inferred from incomplete extant character distributions. Studies of character evolution are currently flourishing, thanks to new tools and discoveries. Beyond this, historians of life have a more fundamental agenda. They like to assign structural, cellular and molecular ‘ground states’ — combinations of old and new characters — to particular nodes of the phylogenetic tree. This is initially a sorting and classification exercise within robust phylogenies, but the idea behind it is more profound: mapping observable, phenotypic range onto an underlying map of inferred molecular diversity. A useful inference tool for this purpose is the ‘extant phylogenetic bracket’9 (Fig. 2, bottom). If a group of extinct forms can be ‘bracketed’ phylogenetically by two extant forms, the similarities between these extant forms are likely to be a common heritage of the entire monophyletic group (including the bracketed fossils) and can be used to define the direction of molecular changes among these fossils. Although fossils cannot be genotyped, their genotypic composition can be inferred from their precise position in phylogenetic trees with respect to extant forms that can be genotyped. This bracket criterion is applicable to any soft parts or molecular characters and can, for example, be used to infer cell-lineage fate maps or ancestral gene-regulatory networks in fossil forms. This powerful method has rarely been used because no databases or data formats yet exist that allow phylogenetic information to be collated in such a way as to allow all comparative genomic data to be assigned to specific phylogenetic nodes. 
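
The bracket criterion can be sketched in a few lines. This is a deliberately simplified reading of it, assuming a ladder-shaped tree written as a list from outgroup to crown, and assuming that a character is inferred for a fossil only when the nearest scored extant taxon on each side agrees. The function name `bracket_inference` and the character coding are hypothetical.

```python
def bracket_inference(taxa, observed, fossil):
    """Infer a soft-tissue or molecular state for a fossil from its extant bracket.

    taxa:     names ordered from outgroup to crown along a ladder-shaped tree
    observed: states scored in extant taxa only (fossils are absent from this dict)
    Returns the shared state, or None when the bracket disagrees or is incomplete.
    """
    i = taxa.index(fossil)
    below = [observed[t] for t in reversed(taxa[:i]) if t in observed]
    above = [observed[t] for t in taxa[i + 1:] if t in observed]
    if not below or not above:
        return None          # no extant taxon on one side: no bracket
    return below[0] if below[0] == above[0] else None

taxa = ["Lamprey", "Osteostracans", "Placoderms", "Shark", "Bony fish"]

# Assumed coding for illustration: Hox expression in branchial arches seen in all extant taxa.
hox_in_arches = {"Lamprey": True, "Shark": True, "Bony fish": True}
print(bracket_inference(taxa, hox_in_arches, "Placoderms"))

# A hypothetical character on which the bracketing taxa disagree gives no inference.
disputed = {"Lamprey": False, "Shark": True, "Bony fish": True}
print(bracket_inference(taxa, disputed, "Osteostracans"))
```

Returning `None` instead of guessing keeps equivocal brackets visible, which matters if such inferences were ever collated node by node in the kind of database the text goes on to describe.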
Ideally, we could go into such a database, pick a particular phylogenetic tree position and obtain all the information inferred from phylogenetic bracket comparisons of entire genomes. This could then be correlated with morphologies, if such a database contained images of all the precious objects locked away in collection drawers. This would also institutionalize efforts to reconstruct ancient genomes and provide an appropriate comparative reference point for experimental analysis of the gene-regulatory circuitry. Surprisingly, fossils can help in charting the evolution of gene controls affecting tissues.

Figure 3 | Tissue preservation in fossils. a, Exquisitely preserved histology of muscle fibres with Z-banding and b, motor-axon endplates (arrows) in a 380-million-year-old placoderm. Scale bar: 20 μm. (From ref. 10, courtesy of the Royal Society.)

Some fossils recently discovered exhibit pristine tissue preservation down to the submicrometre level, offering rich histological and cellular information about organisms that died hundreds of millions of years ago. For example, on some fossil placoderms (members of a wholly extinct group of fishes akin to the common ancestors of sharks and bony fishes) even the neuronal processes of motor-axon endplates on muscle-fibre Z-bands are still visible10 (Fig. 3, arrows). Palaeontologists have recently used synchrotron X-ray phase-contrast tomographic microscopy and image analysis to study three-dimensional anatomy and histology non-invasively11. The richness of structural data that can thus be extracted from fossils and assigned to phylogenetic nodes is remarkable, and can, in some cases, match the subcellular scale of the analysis of mutant extant organisms. Fossils have the added benefit of a temporal perspective, as they potentially allow the precise dating of the underlying molecular pathways responsible for histogenetic processes and their evolution.
If palaeontologists are to ‘read’ these historical phenomena appropriately, they will have to interact with developmental cell biologists well versed in the molecular pathways involved. Pinning the components of molecular pathways onto phylogenies is no trivial exercise, owing to the pleiotropic nature of gene and pathway action. The modularity of genetic traits or pathways encourages us to associate histological changes with regulatory phenomena accessible through comparative genomics. New databases that integrate images along with phylogenetic and molecular information would enable new synergies across disciplines. One would be able to investigate evolutionary novelties and convergent trends in a multidimensional search space. Structural or molecular bottlenecks of robustness and system fragility12 that govern cellular state transitions in more than half a billion years of evolving metazoan ontogenies could become obvious, if those in a dimly lit room chose to join forces and build better lamps to illuminate their precious elephant.

Single cells in lineages

Cell lineage is the cause of functionally specialized cells in all multicellular organisms. It implies that molecular decisions made in a (mother) cell at a particular position in the developmental lineage are inherited by daughter cells, despite the diluting effects of cell division on gene-regulatory components. Mother cells of particular lineages make molecular choices that affect three key phenomena: patterning, differentiation and growth (Fig. 4). Patterning information is laid down as ‘address codes’ of positional identity in the embryo, where each cell’s (and its daughters’) responses are set in a collective manner and affect a wide variety of behaviours leading to three-dimensional shapes and connectivities (green in Fig. 4). Cell differentiation commits cells to a particular fate, irrespective of their positional information (yellow and purple in Fig. 4).
Finally, the growth of particular lineages affects the elaboration of particular descending structures in size and shape through cell divisions. Patterning, differentiation and growth become apparent at different times, are laid down in the embryo along different spatiotemporal axes, and are implemented by different genetic regulators, often through complex combinatorial coding schemes. Sometimes these coding schemes have direct anatomical read-outs that palaeontologists can trace through history. At other times these key molecular units are transitory scaffolds that are removed after morphogenesis, in which case their presence and significance can be revealed only by genetic lineage analysis. In the past three decades, developmental genetics has repeatedly shown how our intuitive preconceptions about lineages and their distributions in adults can mislead us, whether we look at segments in insects13 or the segmental14 and non-segmental15 structures in the vertebrate brain, head16 and neck17. We do not know in advance which anatomical characters will reveal the most about the underlying cell lineages. Mapping them genetically and historically, across evolution, represents a formidable intellectual and technical challenge. Such genetic-fate mapping can extract informative morphological details that form the basis for a comparative analysis. When experimental embryology using cell transplantations reached its technical limits, genetic-fate mapping in transgenic organisms was made possible by using recombinase enzymes18 that can perform permanent genomic modifications in particular embryonic (and adult) lineages, rendered visible by genetic reporter activities17 (Fig. 4). This incredibly powerful tool allows us to link the earliest genetic lineage decisions directly with final morphology. Curiously, this technique has only rarely been used to address evolutionary questions. 
Evolutionary change in embryonic development can occur in either of two ways: if there are changes to the signals that a spatially and temporally invariant lineage in the embryo receives; or if the spatial extent and identity of a given lineage itself changes. Genetic lineage labelling provides an excellent way of discriminating between these possibilities. To perform such comparisons, homologous genes and their regulatory regions (which are indicative of particular homologous lineages) need to be used experimentally in a variety of phylogenetically relevant taxa. These must be of sufficient age and phylogenetic distribution that statements of homology can hold for all organisms concerned (blue in Fig. 4). The current resolution of attempts to label primary embryonic lineages barely goes beyond the tissue types recognized by classical embryologists. However, as every single descending cell of a given lineage is labelled, we can make strong inferences if these cells end up at interesting places and display interesting behaviours, with ‘class’ properties becoming measurable. Such work can reveal that patterning, for example in connectivity between structural elements, remains constant, whereas the patterns of differentiation and growth change16,17,19. However, fate-mapping more subtle molecular subdivisions in embryos, which can be studied in a sufficiently diverse number of taxa, is currently beyond us. Efficient transgenesis has been the most arduous obstacle to date, at least in vertebrates. The latest advances in transposon20, meganuclease21 and lentiviral22 transgenesis promise to broaden the range of species that are amenable to genetic labelling with single-cell precision. It is in single cells belonging to genetically defined homologous lineages that we need to study homologous gene networks and regulatory components and their evolution (Fig. 4).

[Figure 4 panel labels: comparative expression and genomics of patterning genes; lamprey, osteostracans, placoderms, shark, mouse; patterning gene A.]

Evolution of gene regulation

The central players of evolutionary change are likely to be elements of the gene-regulatory machinery, transcription factors and their cognate genomic binding regions, which are clustered in ‘cis-regulatory’ modules (CRMs) and promoters23. Little evidence has so far emerged for a role of chromatin or small RNAs in evolutionary changes of morphology, but this may be an artefact of observational bias. Ultimately, major morphological changes can be viewed as epiphenomena of dynamic changes in RNA–Pol II holoenzyme complexes (HECs) acting on regulatory gene nodes in key morphogenetic circuits (Fig. 5). This idea has been refined over the years in cogent discussions of ‘evolvability’24, but there are few specific examples in metazoans where we can assign major structural changes to specific gene-regulatory causes. Evolutionary ‘tinkering’ has occurred on many regulatory levels by the acquisition or loss of network components or changed functionalities. During the evolution of the wing gene regulatory network in various ant species, for example, the loss of key network components has occurred independently several times25. Existing networks can be co-opted for new patterning purposes at different places in the embryo. During the evolution of fins in early vertebrates, a pre-existing programme of nested Hox gene expression in the unpaired fins of agnathans was co-opted to a new embryonic location (the recently evolved paired fins) in the early jawed-vertebrate stem lineage26. Similarly, the evolution of promoters was implicated in the acquisition of new Hox expression domains in the mesoderm between agnathans and jawed vertebrates27.
Regulatory networks can also be redeployed at similar places later in development, such as the insect appendicular patterning system that later becomes responsible for wingspot patterning in butterflies28,29. Occasionally, evolution can occur within the regulatory proteins themselves: the arthropod Ubx homeobox gene, for example, acquired a protein domain responsible for the loss of abdominal legs in insects, in contrast to its homologue in the crustacean-like ancestor30. Most of these phenomena were due to genes being turned on or off, or being present or absent. Improved transgenesis tools and comparative genomics will soon allow us to test regulatory network components functionally in a more quantitative manner.

Figure 4 | Evolution, development and systems biology. These map the changes in information flow through biological systems across scales of organization. Phylogenetically conserved regulatory regions are used to map pertinent embryonic lineages. Expression profiling of those lineages allows the investigation and inference of the regulatory networks underlying their behaviour. (Panel labels: patterning gene A; gene A; recombinase X; ubiquitous promoter, STOP, reporter; ubiquitous promoter, reporter; transgenesis; homologous lineage and structures in species 1 and species 2; differential three-dimensional morphology and fate allocations (structural novelties); comparative analysis of gene-regulatory networks; gene and CRM orthologies; gene loss; new FFL; new arcs; regulatory loss; molecular causes of evolutionary continuity and change in gene regulation.)

The main technological problems are currently being solved, clearing the way for truly genomic systems analysis of morphological evolution that can go beyond examples of ephemeral beauty. Methods now exist for the transcriptome-wide profiling of single cells in particular lineages at specific places in embryos, using a combination of laser-capture microdissection, single-cell cDNA synthesis and microarray analysis31.
In combination with genetic lineage labelling, these methods allow us to determine the minimal regulatory toolkit used by lineages at key morphogenetic stages. Although comparative cDNA measurements have provided fundamental insights into the evolution of beaks in Darwin’s finches32, for example, current array-based platforms are rather inflexible and costly. They are likely to be replaced by second-generation massively parallel sequencing technology33, which opens up transcriptome analysis to new species: entire cDNA libraries could be sequenced expeditiously and digital information about differential RNA abundance in homologous single cells obtained, irrespective of their species origin. These experimental platforms turn a previously insurmountable task into a computational mapping exercise that can take advantage of our ever-growing pool of sequence-homology information. Probabilistic methods5,34 can then use this information to infer network similarities (and differences) in homologous lineages, something already in our reach. To avoid bias for genes and pathways well-known to biologists, we need the independence of statistical network analysis. Automated network inference35 needs to be constrained by additional information, such as the presence of statistically significant transcription-factor binding sites on CRMs36. The speed of CRM discovery by comparative genomics has been accelerating ever since a larger number of phylogenetically interesting taxa were sequenced in their entirety37. CRMs are being detected with ever-increasing coverage, and dramatic improvements in the detection of true CRMs with a conservation score below 75% are still to be made. CRMs determine the place and timing of gene action38. Current estimates for CRM numbers in vertebrates are in the few thousands, but this is probably a significant underestimate (G. K. and S. Ott, unpublished data). In CRMs, genes measure transcriptional inputs in ways we do not yet understand.

Figure 5 | Gene regulation as a driver of morphological change. The mechanistic core of gene control in evolving embryos and structures is shown. (Panel labels, cis-regulatory module (CRM) functional architecture: distal enhancer, transcription factors, enhancer, SRB/mediator, core promoter, RNA–Pol II-holoenzyme complex, signalling pathways.)
1. Genomic information flow. Gene regulation: new regulators and genomic binding sites (automated network inference)*. Evolution: co-option of genes into old network motifs; evolution of feed-forward loops and feed-back loops.
2. Spatial patterns. Gene regulation: transcriptional sensors, filters for morphogens and signalling pathways, logic functions (Gaussian, boolean functions)*. Evolution: size and structure of morphogenetic fields; shape changes of their descending structures.
3. Temporal patterns. Gene regulation: changes in binding affinities and abundance, Pol II promoter docking and escape, noise propagation and reduction (ODE/SDE modelling, statistical mechanics)*. Evolution: speed of ontogeny, larval stage timing, heterochrony.
*Systems biology

CRMs are likely to act as logic functions39, coincidence detectors, filters, gradient sensors and resistors, all of which ultimately influence the kinetics of activators, repressors, SRB/mediator complexes and Pol II-HECs, grouped in ‘factories’40. Because HECs have not changed much structurally, the major structural source of evolutionary novelty is likely to be found in the composition and action of the CRMs themselves. To begin with, these need to be classified bioinformatically. We also need to develop more statistical methods39 that can take into account the phylogenetic conservation of the order of binding sites within CRMs, to highlight the core architecture of ‘enhanceosome complexes’.
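
One simple way to separate conservation of binding-site order from conservation of binding-site composition is sketched below. It is not one of the statistical methods the text calls for, only an illustration of the distinction: order is scored by the longest common subsequence of site labels, composition by set overlap. The site labels A to D and the three example CRMs are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two lists of site labels."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, site_a in enumerate(a, 1):
        for j, site_b in enumerate(b, 1):
            if site_a == site_b:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]

def order_conservation(crm_a, crm_b):
    """Fraction of sites that can be kept in the same 5'-to-3' order in both CRMs."""
    longest = max(len(crm_a), len(crm_b))
    return lcs_length(crm_a, crm_b) / longest if longest else 0.0

def composition_conservation(crm_a, crm_b):
    """Overlap of the kinds of site present (Jaccard index), ignoring their order."""
    kinds_a, kinds_b = set(crm_a), set(crm_b)
    union = kinds_a | kinds_b
    return len(kinds_a & kinds_b) / len(union) if union else 0.0

# Hypothetical orthologous CRMs, written as ordered lists of binding-site types.
species_1 = ["A", "B", "C", "D"]
species_2 = ["A", "B", "C", "D"]   # same sites, same order
species_3 = ["D", "C", "B", "A"]   # same sites, order reversed

for other in (species_2, species_3):
    print(order_conservation(species_1, other), composition_conservation(species_1, other))
```

Two CRMs can share every kind of site and still score low on order, as the third example does. A real method would also need to weigh spacing, orientation and the phylogenetic relationships among the species compared, none of which this sketch attempts.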
For now, it is not clear whether ‘enhanceosomes’41 (where the order of binding sites matters due to synergistic, orientation-specific transcription-factor binding) or ‘billboard enhancers’42 (where the order of transcription-factor binding sites matters little) are the more frequent; an answer would shed light on our ideas of evolutionary novelty, as temporal changes are known to affect spatial expression patterns. The most famous example is the structural co-linearity of Hox genes and their temporal and axial deployment in vertebrates43,44. Any functional classification needs to capture spatial and temporal aspects of gene activation in relation to these CRMs. This is tedious and technically difficult to do as it requires the combinatorial cloning of CRMs and slow transgenic assays in cells or organisms. Fast transgenesis protocols are now available for invertebrates such as arthropods, sea-urchins45 and sea-squirts46. Recent studies on vertebrates suggest that only a fraction of ‘ultra-conserved’ CRMs are active and absolutely required for the animal to survive47. As CRMs were assayed at only a few embryonic time points, the absence of regulatory information cannot be construed as a lack of function. It is not clear when, and in which cell types, organ-system specific CRMs are expected to be active, so inferences from such studies should be treated with great caution. We have no idea what compensatory mechanisms will kick in to obviate the regulatory bottlenecks when CRMs are experimentally removed. Even the genomic removal of expected key CRMs belonging to MyoD, a master regulator of myogenesis, had only a moderate effect on the timing of gene expression48, suggesting that the CRMs might be interacting in complex ways. Some CRMs might modulate the activities of others, so their effects might not be apparent unless their activity is assayed in a combinatorial fashion over time.
We need to develop combinatorial strategies involving more subtle loss- and gain-of-function testing of CRMs if we are to reveal such regulatory logic comprehensively and comparatively.

Genomic systems biology

Transcriptional regulatory networks can be subdivided according to the way information flows49. By analysing recurring network motifs of gene regulation among living species in a phylogenetic framework (functional testing has not been done for metazoans), we can compare their ‘computational function’ and trace their evolution. How did certain transcriptional inputs at one point of the network lead to certain outputs (numbers of RNA molecules) at another point, and how did this change? In insects, certain upstream ‘selector genes’ not only activate batteries of intermediate targets but also act directly on most downstream genes that encode structural components such as photoreceptors or pigmentation enzymes, much as the chief executive of a company talks to the secretary50. This constitutes a ‘feed-forward loop’. In many cases, such as the Ubx- or Pax6-dependent51,52 networks that underlie wing and eye development and evolution, the communication between chief executive and secretary was the starting point, with the feed-forward loop developing later. When did this happen, how often and why? Studies of prokaryotes have suggested that feed-forward loops have specific computational functions49, but these need to be explored functionally in metazoans and in a comparative manner. Until this happens, talk about ‘co-option’ or ‘homology of regulatory cassettes’ is premature. Expectations about ‘master regulators’ (of processes such as gastrulation, neural-crest formation, skeletogenesis or eye formation) may have to be reassessed; the chief executive might leave and a new one join while the company remains unchanged, or the other way round.
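
As an illustration of what a ‘computational function’ of a feed-forward loop can look like, the sketch below simulates a toy loop in discrete time steps. It assumes one particular wiring that the text does not specify: an upstream regulator X activates an intermediate Y, and the target Z needs both X and Y (AND logic). One behaviour proposed for this kind of motif in bacterial studies is that it filters out brief inputs. The delay of three steps and the instant switch-off of Y are simplifying assumptions, and `simulate_ffl` is a hypothetical name.

```python
def simulate_ffl(x_signal, delay=3):
    """Toy feed-forward loop: Z is on only while X is on AND Y has built up.

    x_signal: sequence of 0/1 values for the upstream regulator X, one per time step
    delay:    number of consecutive steps X must be on before Y counts as active
              (an assumed stand-in for the time Y needs to accumulate)
    """
    steps_x_on = 0
    z_output = []
    for x in x_signal:
        steps_x_on = steps_x_on + 1 if x else 0   # Y is reset as soon as X switches off
        y_active = steps_x_on >= delay
        z_output.append(1 if (x and y_active) else 0)
    return z_output

brief_pulse = [1, 1, 0, 0, 0, 0, 0, 0]
sustained_pulse = [1, 1, 1, 1, 1, 1, 0, 0]

print(simulate_ffl(brief_pulse))
print(simulate_ffl(sustained_pulse))
```

In this toy version the brief pulse never switches Z on, whereas the sustained pulse does so after the delay and Z switches off as soon as X disappears. Whether metazoan loops behave this way is exactly the open question raised above; the sketch only shows what kind of input-output behaviour would have to be measured and compared between species.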
In vertebrates the expression domains of most patterning ‘master regulators’ have not changed significantly since agnathan times, despite obvious signs of morphological evolution. This does not mean that targets or motif dynamics have remained unchanged. We do not have the data to see static (let alone dynamic) differences in network motifs in a comparative manner. Motif deployment and speed will play key roles as developmental systems biologists try to observe these gene-regulatory circuits in action (Fig. 4). Although information about spatial regulation can be obtained from whole embryos, temporal regulation through CRMs is better studied in judiciously chosen single-cell assays. Advances in massively parallel imaging and tracking assays could yield sufficient temporal resolution for modellers to survey a host of functional input–output functions, a question at the cutting edge of genomic systems biology. Such assays will allow us to test CRMs of different species and check whether temporal regulation might have evolved and, if so, whether it can be ascribed to CRMs directly, as opposed to transcription factors or the speed of the activator, repressor or RNA–Pol II HECs occupying them. Such speed differences in homologous network motifs will provide valuable insights into the causes of evolutionary heterochrony if they are focused on those genes deemed relevant in a dialogue between development and palaeontology. When evolutionary biologists have inferred the composition of ancestral CRM sequences and transcription factors, we can use mathematical tools of systems analysis to formulate the dynamics of their action, capturing both deterministic and stochastic effects53. The dynamics at each gene network node are based on a few molecular complexes sitting on, and sliding along, two molecules of DNA (in a diploid organism), probably in non-equilibrium states.
They are therefore subject to significant stochastic, mesoscale effects, and it is rather surprising to see any deterministic behaviour at all in patterning over developmental and evolutionary timescales. Determinism causes robust morphological outcomes and enables us to establish anatomical homologies in the first place, so surely well-confined stochasticity needs to be the source of change. If the highly conserved regulatory apparatus that acts on a fundamentally stochastic background is expected to be central for spatio-temporal regulation, it is surprising to see considerable differences in the size of embryos (and morphogenetic fields). Noise tolerance53 for a key regulator is likely to differ depending on whether it is expressed by a thousand cells or a single cell, and may also depend on the organism’s ploidy status. Bigger morphogenetic fields might be expected to afford less accuracy, but do they? When were key noise constraints laid down in history, and how big were the morphogenetic fields in the embryos of ancestors of major phylogenetic groups? Until we can measure the noise of homologous network nodes in a variety of species, these systems questions will remain unanswered. Interspecific swaps of network components, assayed in live single cells or organisms, will help us find the basis of evolutionary novelty. Differential ontogenetic speed is a likely source of experimental difficulty, as indicated by recent transgenic studies combining Hox-gene CRMs and promoters from different species54. Recent advances in massively parallel synthetic DNA chemistry55, currently used to construct microorganismal genomes from scratch, will one day allow us to synthesize thousands of CRMs belonging to ancestral genomes directly. This is not Jurassic Park (yet), although advances in genome engineering may one day let us test a plethora of such ancestral CRMs simultaneously and functionally in living cells and organisms.
Such assays will be necessary to assign evolutionary fitness functions to specific CRMs and transcription-factor binding sites within them56, and to help us formulate intragenomic constraints, such as competition of different CRMs for RNA–Pol II factories, which are currently below the radar of genomic systems biology and its models.

Stochasticity in evolution

By establishing common measurement and analysis platforms, palaeontology, developmental biology, gene regulation and systems biology can each make unique contributions towards explaining the history of life (Fig. 5). Palaeontologists can reconstruct phylogenetic trees, determine structural and molecular ‘ground states’ at key tree nodes, and discover evolutionary directionalities. This allows them to work out what needs to be explained mechanistically. Developmental biologists, armed with their genetic toolkits, can identify the smallest units of information within development as lineages whose alterations lead to the observed anatomical changes, and refocus the historian’s attention onto the most revealing anatomical details. Making homologous lineages the reference point for studies of gene regulation will allow us to identify shared regulatory network nodes and motifs49 and to measure their dynamics in vivo and in vitro (Fig. 4). With the help of synthetic DNA chemistry55, inferred historical genomic information can be tested functionally for its effects on gene regulation in live cells and organisms. Systems-biology studies of network sensitivity and noise propagation57 can then help to formulate the dynamics of deterministic and stochastic components53. The latter will identify ‘hotspots’ of systems’ flexibility that are relevant to morphological evolution. Direct experimental manipulation of such hotspots might then enable us to ‘replay’ key evolutionary processes within the lifetime of single experimental organisms.
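The sketch below gives a feel for why the noise seen at a network node could depend on whether a regulator is read out in one cell or across many. It is a naive baseline only. It assumes that each cell makes and degrades transcripts as an independent birth-death process, that cells do not communicate, and that the relevant readout is the field average. None of these assumptions comes from the studies cited here, and the function names and rate values are hypothetical.

```python
import random
import statistics


def transcript_count(rng, k_make=20.0, k_decay=1.0, t_end=5.0):
    """Gillespie simulation of one cell: transcripts are made at rate
    k_make and each transcript decays at rate k_decay. Returns the
    transcript count at time t_end, starting from zero."""
    t, n = 0.0, 0
    while True:
        total_rate = k_make + k_decay * n
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > t_end:
            return n
        # Choose the event type in proportion to its rate.
        if rng.random() * total_rate < k_make:
            n += 1
        else:
            n -= 1


def field_noise(n_cells, n_embryos=100, seed=0):
    """Coefficient of variation (standard deviation / mean) of the
    field-averaged transcript count across simulated embryos."""
    rng = random.Random(seed)
    field_means = [
        statistics.fmean([transcript_count(rng) for _ in range(n_cells)])
        for _ in range(n_embryos)
    ]
    return statistics.stdev(field_means) / statistics.fmean(field_means)


# Compare a single-cell readout with a 25-cell field (hypothetical sizes).
for cells in (1, 25):
    print(f"{cells} cell(s): noise (CV) = {field_noise(cells):.3f}")
```

Under these assumptions the field average is several times less noisy than a single cell, because independent fluctuations partly cancel. Real morphogenetic fields need not behave this way: coupling between cells, gradients across the field, and ploidy could all change the picture, which is why the questions above call for measurement in a variety of species.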
There might be some initial disappointment that nature neither constructed its regulatory circuits with an engineer’s intelligence nor used Occam’s razor, whereas we must use both to describe it. Historians can help us decipher when and how new information was introduced, and into which old (and imperfect) information networks. As information networks are corridors of constraints, they depend on the states of their predecessors, subject to modification by stochastic forces. By placing the robustness and fragility12 of regulatory systems in a historical context, palaeontologists can identify phases of ontogenetic ‘experimentation’ on a larger phylogenetic timescale, at the base of the major metazoan radiations, and identify the key players and the (changing) rules in deep time. The links between evolution and systems biology are tenuous at the moment because of limitations in what we can measure. An elegant recent study about tooth patterning that links the mathematical modelling of signalling pathways with the comparative analysis of evolutionary diversity points in such a new direction58. Only a few metazoan systems are known in which noise is harnessed to establish phenotypic diversity (for example, in Drosophila photoreceptor choice59) or to capacitate evolutionary change (such as Hsp90)2. This shall not discourage us. Powerful tools will be available to assess the effect of a few regulatory molecules in non-equilibrium states60 on whole-organism structure and evolution. We can now see more than Darwin could ever have imagined. Comparing species genetically helps us see similarities and differences on each organizational level. The analytical tools are almost in place to integrate data sets into an edifice of human knowledge that transcends our inherited intuitive myopia. More genome detectives should engage with the historians of life in an effort to connect the dots and enter the quest for the historical outlines of the elephant, rich and strange. 
■ Georgy Koentges is experimental co-director of the Warwick Systems Biology Centre, University of Warwick, Coventry CV4 7AL, UK. e-mail: [email protected]

1. Rumi, Mawlana Jalaluddin The Elephant in the Dark Room (13th cent.) in Mathnawi-yi ma’nawi (Masnavi) III 1259–1268 (ed. Nicholson, R. A.) (London 1925–1940).
2. Rutherford, S. L. & Lindquist, S. Nature 396, 336–342 (1998).
3. Bridgham, J. T., Carroll, S. M. & Thornton, J. W. Science 312, 97–101 (2006).
4. Wray, G. A. et al. Mol. Biol. Evol. 20, 1377–1419 (2003).
5. Pearl, J. Causality: Models, Reasoning, and Inference (Cambridge Univ. Press, 2000).
6. Guido, N. J. et al. Nature 439, 856–860 (2006).
7. Raff, R. A. Nature Rev. Genet. 8, 911–920 (2007).
8. Janvier, P. Early Vertebrates (Oxford Univ. Press, 1996).
9. Witmer, L. M. in Functional Morphology in Vertebrate Paleontology (ed. Thomason, J. J.) 19–33 (Cambridge Univ. Press, New York, 1995).
10. Trinajstic, K., Marshall, C., Long, J. & Bifield, K. Biol. Lett. 3, 197–200 (2007).
11. Donoghue, P. C. et al. Nature 442, 680–683 (2006).
12. Stelling, J., Sauer, U., Szallasi, Z., Doyle, F. J. III & Doyle, J. Cell 118, 675–685 (2004).
13. Martinez-Arias, A. & Lawrence, P. A. Nature 313, 639–642 (1985).
14. Lumsden, A. & Keynes, R. Nature 337, 424–428 (1989).
15. Larsen, C. W., Zeltser, L. M. & Lumsden, A. J. Neurosci. 21, 4699–4711 (2001).
16. Koentges, G. & Lumsden, A. Development 122, 3229–3242 (1996).
17. Matsuoka, T. et al. Nature 436, 347–355 (2005).
18. Gu, H., Marth, J. D., Orban, P. C., Mossmann, H. & Rajewsky, K. Science 265, 103–106 (1994).
19. Huang, R., Zhi, Q., Patel, K., Wilting, J. & Christ, B. Development 127, 3789–3794 (2000).
20. Balciunas, D. et al. PLoS Genet. 2, e169 (2006).
21. Ogino, H., McConnell, W. B. & Grainger, R. M. Nature Protocols 1, 1703–1710 (2006).
22. Lois, C., Hong, E. J., Pease, S., Brown, E. J. & Baltimore, D. Science 295, 868–872 (2002).
23. Davidson, E. H. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution (Academic Press/Elsevier, San Diego, 2006).
24. Kirschner, M. & Gerhart, J. Proc. Natl Acad. Sci. USA 95, 8420–8427 (1998).
25. Abouheif, E. & Wray, G. A. Science 297, 249–252 (2002).
26. Freitas, R., Zhang, G. & Cohn, M. J. Nature 442, 1033–1037 (2006).
27. Carr, J. L., Shashikant, C. S., Bailey, W. J. & Ruddle, F. H. J. Exp. Zool. 280, 73–85 (1998).
28. Keys, D. N. et al. Science 283, 532–534 (1999).
29. Prud’homme, B., Gompel, N. & Carroll, S. B. Proc. Natl Acad. Sci. USA 104 (suppl. 1), 8605–8612 (2007).
30. Ronshaugen, M., McGinnis, N. & McGinnis, W. Nature 415, 914–917 (2002).
31. Tietjen, I. et al. Neuron 38, 161–175 (2003).
32. Abzhanov, A. et al. Nature 442, 563–567 (2006).
33. Hafner, M. et al. Methods 44, 3–12 (2008).
34. Friedman, N. Science 303, 799–805 (2004).
35. Pournara, I. & Wernisch, L. Bioinformatics 20, 2934–2942 (2004).
36. Hansen, A., Ott, S. & Koentges, G. Genome Inform. 15, 141–150 (2004).
37. Stark, A. et al. Nature 450, 219–232 (2007).
38. Istrail, S. & Davidson, E. H. Proc. Natl Acad. Sci. USA 102, 4954–4959 (2005).
39. Tsong, A. E., Tuch, B. B., Li, H. & Johnson, A. D. Nature 443, 415–420 (2006).
40. Faro-Trindade, I. & Cook, P. R. Biochem. Soc. Trans. 34, 1133–1137 (2006).
41. Maniatis, T. et al. Cold Spring Harb. Symp. Quant. Biol. 63, 609–620 (1998).
42. Arnosti, D. N. & Kulkarni, M. M. J. Cell. Biochem. 94, 890–898 (2005).
43. Smith, J., Theodoris, C. & Davidson, E. H. Science 318, 794–797 (2007).
44. Kmita, M. & Duboule, D. Science 301, 331–333 (2003).
45. Damle, S., Hanser, B., Davidson, E. H. & Fraser, S. E. Dev. Biol. 299, 543–550 (2006).
46. Roure, A. et al. PLoS One 2, e916 (2007).
47. Pennacchio, L. A. et al. Nature 444, 499–502 (2006).
48. Chen, J. C., Ramachandran, R. & Goldhamer, D. J. Dev. Biol. 245, 213–223 (2002).
49. Alon, U. An Introduction to Systems Biology: Design Principles of Biological Circuits (Chapman & Hall, 2006).
50. Akam, M. Curr. Biol. 8, R676–R678 (1998).
51. Weatherbee, S. D. et al. Curr. Biol. 9, 109–115 (1999).
52. Gehring, W. J. Zoology 104, 171–183 (2001).
53. Blake, W. J., Kærn, M., Cantor, C. R. & Collins, J. J. Nature 422, 633–637 (2003).
54. Beckers, J., Gérard, M. & Duboule, D. Dev. Biol. 180, 543–553 (1996).
55. Tian, J. et al. Nature 432, 1050–1054 (2004).
56. Hittinger, C. T. & Carroll, S. B. Nature 449, 677–681 (2007).
57. Raser, J. M. & O’Shea, E. K. Science 309, 2010–2013 (2005).
58. Kavanagh, K. D., Evans, A. R. & Jernvall, J. Nature 449, 427–432 (2007).
59. Wernet, M. F. et al. Nature 440, 174–180 (2006).
60. Nicodemi, M. & Prisco, A. Phys. Rev. Lett. 98, 108104 (2007).

Acknowledgements I thank Keith Vance, Sascha Ott and Per Ahlberg for discussions. I gratefully acknowledge advice from Sheila Canby (British Museum) about Persian miniatures and Rumi. Benoît Junod, curator at the Aga Khan Trust for Culture in Geneva, provided the photograph of the elephant. I apologize to my colleagues for not being able to cite all relevant publications. Work in my laboratory is funded by the Wellcome Trust, the Human Frontier Science Program, the ARC, ConquerChiari and the BBSRC.