Vol 451|7 February 2008
Evolution of anatomy and gene control
Georgy Koentges
Evo-devo meets systems biology.
We’ve probably all heard
the story of the six wise
men left in a darkened
room with an elephant and
asked to say what it was.
Each felt a different piece
of the elephant, compared
it to something else and
interpreted the whole as
a simple extension of the part he described.
There was no common observation ground,
so nobody recognized that each description
just captured some significant aspects of the
same object1.
The delightful Persian miniature of a composite elephant at the Aga Khan Trust for Culture in Geneva (Fig. 1) conveys such a concept.
We understand that living things, such as the
elephant, are complex and, fortunately, there
is a bit more light around these days. Palaeontology, developmental biology, gene regulation and systems biology gather around the
beast and use their tools to illuminate its parts.
But we have to figure out how the parts relate
to each other, and how we can communicate
about them appropriately.
Since Darwin we know that we must explain
the elephant not only in mechanistic terms (of
mutation, selection and adaptation on the
population level) but also in historical terms,
as ‘descent with modification’, evolution in
phylogeny. Molecular changes hundreds of
millions of years ago constrain the possibility of change here and now. Not everything is
possible, and evolutionary history is as much a
story of constraint as functionality. Leonardo’s
‘flying machines’ didn’t just fail because bodies
of a human size and weight fall under physical
scaling laws limiting how big muscles could
become. The evolutionary history that led to
our present body size also stops us acquiring
wings, either now or any time soon.
We know that we are constrained by genetic
baggage, but the molecular causes have
remained elusive. They vacillate between
being neutral and adaptive at different times
in our phylogenetic history2, so purely functional studies offer little prospect of finding
them. Comparative approaches of molecular
function can reveal them, as an elegant study
on the darwinian evolution of ligand–receptor
interactions has recently illustrated3. Because
phylogeny is a concatenation of developmental
processes in populations, all heritable morphological changes derive from developmental changes in molecular control hierarchies
and networks4. The daunting task of the field
known as evo-devo is to map structural diversity onto the underlying gene-regulatory diversity and dynamics.
Figure 1 | Illuminating the elephant of complexity. This composite elephant is from a Persian painting
from around 1600. (Courtesy of the Aga Khan Trust for Culture, Geneva.)
Developmental regulatory changes have
affected patterning, differentiation and
growth. Patterning describes the highly regulated, three-dimensional self-organization of
groups of embryonic cells towards structures
we can see. Differentiation is the allocation of
cells to particular fates, such as muscles and
bone. Differential growth implies that certain
(molecularly defined) groups of cells grow
more rapidly than other groups within an
organism. Developmental genetics has shown
that these three activities are often linked in
complex ways but are separately controlled
molecularly in space and time.
The embryonic locations of such linkages
are genetically defined cell lineages, where
the molecular actors — the genetic control
networks — must have changed their plots to
create phenotypic diversity. Historians of life
are interested in the specific succession of character changes as they happened over evolutionary time. By joining forces with mechanistic
disciplines, they can learn how to read visible
characters as epiphenomena of the most information-rich units within biological structures,
and apply this knowledge to understand fossil
anatomy. Such information-carrying units are
not accessible through intuition: only genetic
experiments allow us to see the elephant from
the inside. Comparative genetic analysis tells us
that the elephant consists of many parts that are
also used by other organisms for both similar
and different purposes, and that the differences
between the parts’ connections contain valuable information.
HORIZONS
NATURE|Vol 451|7 February 2008
These ‘atoms’ of biological information are
hard to measure. Each scale of organization
requires different descriptors, and it is difficult to conceptualize how single-molecule
dynamics on two strings of DNA (in a diploid
organism) can cause major structural changes
over historical timescales. Systems biology is
starting to make it easier for those speaking
the languages of DNA and mathematics to
interact, under the auspices of massively parallel measurement platforms, comparative
genomics, graphical models5 and dynamic
systems theory6.
Here I will outline how introducing historical information into the mechanistic fields of
developmental biology, gene regulation and
systems biology can stimulate useful new
dialogues and exchange of expertise. These
disciplines can teach — and constrain — each
other about where and how to look at the
unique features of living information-carriers
and build common observation platforms. By
applying genomic width, mechanistic precision and historical depth, such approaches
will help us describe our ideas in less-intuitive
but mathematically sound ways, in a language
that machines can process. This may enable us
to slowly make out the immensely rich historical contours of the elephant of complexity as it
emerges from the darkness of time.
Charting metazoan history
Palaeontology is equipped with powerful
statistical tools to reconstruct phylogenies,
and a sophisticated armoury of non-invasive
structural investigation techniques to trace the
succession of structural changes as they happened during history. Phylogenetics allows us
to reconstruct trees of ancestral relationships
using the rules of parsimony (favouring the
fewest evolutionary changes) and synapomorphy (by using unique character states shared
by two groups assumed to be inherited from a
common ancestor)7. Other characters, which
are not used for phylogeny reconstruction,
can then be charted onto these trees, revealing
the direction of changes across vast historical
timescales (Fig. 2).
Phylogenies put extant species in the appropriate historical context and reveal which
characters studied in live organisms are really
‘ancestral’ or have significance for a larger taxonomic group. Thus insights from molecular
and developmental biology rely on a foundation established by morphological and palaeontological studies.
Fossil studies are in turn starting to benefit from knowledge of the molecular players
behind the characters. Palaeontology reveals
combinations of fossilized characters that are
no longer seen in extant organisms, but that
shed light on the evolution of development
and complexity. For example, the evolutionary emergence of the vertebrate skeleton was
recently redefined in a profound and non-intuitive way8 (A in Fig. 2). An anatomist comparing living sharks with bony fishes would
conclude that a dermal bone cover is a feature
unique to bony fishes. A palaeontologist, in
contrast, discovers that the common ancestors of sharks and bony fishes had extensive,
bony body armour (B in Fig. 2). In this light,
the absence of such armour in sharks must be
reinterpreted as a secondary loss (C in Fig. 2),
with its presence in bony fish being the retention of an ancestral state.
Figure 2 | Character evolution and gene expression in fossils. Soft-tissue and molecular
characteristics of fossils can be inferred from their respective phylogenetic position by using
the ‘extant phylogenetic bracket’9.
[Diagram not reproduced. Panels pair observations with inferences on trees of lamprey, osteostracans, placoderms, shark and bony fish. Presence of bone in extant taxa alone: bone evolved at A. Inclusion of fossil morphologies: bone evolved at B, bone retained in most taxa crownwards of B, bone lost at C; leads to dating of character acquisitions and polarity. Shared embryonic and soft-tissue characters (expression of patterning genes such as Hox genes in branchial arches, cell population boundaries, muscle patterns), with the labels ‘Bone evolved at D’ and ‘Extant phylogenetic bracket’: ground state for all extant and fossil taxa descending from stem group D.]
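The reinterpretation of shark anatomy described above can be illustrated with a parsimony count. The following is a minimal sketch, not the method of ref. 8: it applies Fitch's small-parsimony algorithm to one binary character (dermal bone) on two fixed trees. The tree topologies are simplified from the taxon order in Fig. 2, and the 0/1 character coding is an assumption made for illustration.

```python
def fitch(tree, states):
    """Return (candidate state set at the root, minimum number of changes).

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping each leaf name to its observed character state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The two subtrees agree on at least one state: no change is needed here.
        return shared, left_cost + right_cost
    # No shared state: one change must have happened below this node.
    return left_set | right_set, left_cost + right_cost + 1


# Assumed, simplified topologies (illustrative only).
extant_tree = ("lamprey", ("shark", "bony_fish"))
fossil_tree = ("lamprey",
               ("osteostracans", ("placoderms", ("shark", "bony_fish"))))

# Assumed coding: 1 = dermal bone present, 0 = absent.
bone = {"lamprey": 0, "osteostracans": 1, "placoderms": 1,
        "shark": 0, "bony_fish": 1}

_, extant_changes = fitch(extant_tree, bone)
_, fossil_changes = fitch(fossil_tree, bone)
print(extant_changes, fossil_changes)  # prints: 1 2
```

With extant taxa alone, a single change suffices, which invites the reading that bone is a novelty of bony fishes. Once the fossil taxa are included, two changes are required, and a gain at B followed by a loss in sharks at C is one assignment consistent with that count. The count by itself does not place the changes on branches; that would need a second, top-down pass.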
This purely palaeontological discovery has
dramatic consequences for any line of enquiry
that a developmental biologist who studies the
molecular evolution of skeletogenesis might
wish to initiate. A simple genetic comparison
of skeletogenesis between extant sharks and
bony fishes would not be meaningful, at least
in terms of understanding the sequence of
historical events that led towards having (or
not having) a skeleton, as secondary bone loss
might not involve the same molecules responsible for acquiring skeletons in the first place.
Historical information is thus indispensable
for refocusing the efforts of researchers who
could otherwise go astray by trying to find
molecular explanations for evolutionary
processes that have not taken place but were
incorrectly inferred from incomplete extant
character distributions.
Studies of character evolution are currently
flourishing, thanks to new tools and discoveries. Beyond this, historians of life have a more
fundamental agenda. They like to assign structural, cellular and molecular ‘ground states’ —
combinations of old and new characters — to
particular nodes of the phylogenetic tree. This
is initially a sorting and classification exercise
within robust phylogenies, but the idea behind
it is more profound: mapping observable,
phenotypic range onto an underlying map of
inferred molecular diversity. A useful inference
tool for this purpose is the ‘extant phylogenetic
bracket’9 (Fig. 2, bottom). If a group of extinct
forms can be ‘bracketed’ phylogenetically by
two extant forms, the similarities between these
extant forms are likely to be a common heritage of the entire monophyletic group (including the bracketed fossils) and can be used to
define the direction of molecular changes
among these fossils. Although fossils cannot
be genotyped, their genotypic composition
can be inferred from their precise position in
phylogenetic trees with respect to extant forms
that can be genotyped.
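The bracket logic can be written down in a few lines. The sketch below is an illustration under stated assumptions rather than an established tool: trees are nested tuples, the topology is simplified from Fig. 2, and the character names and values are placeholders, not data. The function names `smallest_clade` and `bracket_inference` are hypothetical.

```python
def leaves(tree):
    """List the leaf names of a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return [name for sub in tree for name in leaves(sub)]


def smallest_clade(tree, a, b):
    """Return the smallest subtree containing both a and b, or None."""
    if isinstance(tree, str):
        return None
    for sub in tree:
        found = smallest_clade(sub, a, b)
        if found is not None:
            return found
    names = leaves(tree)
    return tree if a in names and b in names else None


def bracket_inference(tree, extant_a, extant_b, observed, fossils):
    """Assign character states shared by two extant taxa to bracketed fossils."""
    clade = smallest_clade(tree, extant_a, extant_b)
    if clade is None:
        return {}
    shared = {char: state
              for char, state in observed[extant_a].items()
              if observed[extant_b].get(char) == state}
    return {name: dict(shared) for name in leaves(clade) if name in fossils}


# Assumed topology and placeholder observations (illustrative only).
tree = ("lamprey",
        ("osteostracans", ("placoderms", ("shark", "bony_fish"))))
observed = {
    "lamprey": {"hox_in_branchial_arches": True, "paired_fins": False},
    "shark": {"hox_in_branchial_arches": True, "paired_fins": True},
}
fossils = {"osteostracans", "placoderms"}

inferred = bracket_inference(tree, "lamprey", "shark", observed, fossils)
print(inferred)
```

Only the character on which both extant taxa agree is passed on to the bracketed fossils; the character on which they differ is left undetermined, which is the conservative behaviour the bracket criterion implies.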
This bracket criterion is applicable to any
soft parts or molecular characters and can, for
example, be used to infer cell-lineage fate maps
or ancestral gene-regulatory networks in fossil
forms. This powerful method has rarely been
used because no databases or data formats yet
exist that allow phylogenetic information to
be collated in such a way as to allow all comparative genomic data to be assigned to specific
phylogenetic nodes. Ideally, we could go into
such a database, pick a particular phylogenetic
tree position and obtain all the information
inferred from phylogenetic bracket comparisons of entire genomes. This could then be correlated with morphologies, if such a database
contained images of all the precious objects
locked away in collection drawers. This would
also institutionalize efforts to reconstruct
ancient genomes and provide an appropriate
comparative reference point for experimental
analysis of the gene-regulatory circuitry.
Figure 3 | Tissue preservation in fossils. a, Exquisitely preserved histology of muscle fibres with
Z-banding and b, motor-axon endplates (arrows) in a 380-million-year-old placoderm.
Scale bar: 20 μm. (From ref. 10, courtesy of the Royal Society.)
Surprisingly, fossils can help in charting the
evolution of gene controls affecting tissues.
Some fossils recently discovered exhibit
pristine tissue preservation down to the submicrometre level, offering rich histological and
cellular information about organisms that died
hundreds of millions of years ago. For example, on some fossil placoderms — members
of a wholly extinct group of fishes akin to the
common ancestors of sharks and bony fishes
— even the neuronal processes of motor-axon
endplates on muscle-fibre Z-bands are still
visible10 (Fig. 3, arrows).
Palaeontologists have recently used synchrotron X-ray phase-contrast tomographic
microscopy and image analysis to study three-dimensional anatomy and histology non-invasively11. The richness of structural data that can
thus be extracted from fossils and assigned to
phylogenetic nodes is remarkable, and can,
in some cases, match the subcellular scale of
the analysis of mutant extant organisms. Fossils have the added benefit of a temporal perspective, as they potentially allow the precise
dating of the underlying molecular pathways
responsible for histogenetic processes and their
evolution.
If palaeontologists are to ‘read’ these historical phenomena appropriately, they will have
to interact with developmental cell biologists
well versed in the molecular pathways involved.
Pinning the components of molecular pathways
onto phylogenies is no trivial exercise, owing
to the pleiotropic nature of gene and pathway
action. The modularity of genetic traits or pathways encourages us to associate histological
changes with regulatory phenomena accessible
through comparative genomics.
New databases that integrate images along
with phylogenetic and molecular information would enable new synergies across disciplines. One would be able to investigate
evolutionary novelties and convergent trends
in a multidimensional search space. Structural
or molecular bottlenecks of robustness and
system fragility12 that govern cellular state
transitions in more than half a billion years of
evolving metazoan ontogenies could become
obvious, if those in a dimly lit room chose to
join forces and build better lamps to illuminate
their precious elephant.
Single cells in lineages
Cell lineage is the cause of functionally specialized cells in all multicellular organisms.
It implies that molecular decisions made in
a (mother) cell at a particular position in the
developmental lineage are inherited by daughter cells, despite the diluting effects of cell division on gene-regulatory components.
Mother cells of particular lineages make
molecular choices that affect three key phenomena: patterning, differentiation and
growth (Fig. 4). Patterning information is laid
down as ‘address codes’ of positional identity
in the embryo, where each cell’s (and its daughters’) responses are set in a collective manner
and affect a wide variety of behaviours leading
to three-dimensional shapes and connectivities
(green in Fig. 4). Cell differentiation commits
cells to a particular fate, irrespective of their
positional information (yellow and purple in
Fig. 4). Finally, the growth of particular lineages
affects the elaboration of particular descending
structures in size and shape through cell divisions.
Patterning, differentiation and growth
become apparent at different times, are laid
down in the embryo along different spatiotemporal axes, and are implemented by different genetic regulators, often through complex
combinatorial coding schemes. Sometimes
these coding schemes have direct anatomical read-outs that palaeontologists can trace
through history. At other times these key
molecular units are transitory scaffolds that are
removed after morphogenesis, in which case
their presence and significance can be revealed
only by genetic lineage analysis.
In the past three decades, developmental
genetics has repeatedly shown how our intuitive preconceptions about lineages and their
distributions in adults can mislead us, whether
we look at segments in insects13 or the segmental14 and non-segmental15 structures in the
vertebrate brain, head16 and neck17. We do not
know in advance which anatomical characters
will reveal the most about the underlying cell
lineages. Mapping them genetically and historically, across evolution, represents a formidable
intellectual and technical challenge.
Such genetic-fate mapping can extract
informative morphological details that form
the basis for a comparative analysis. When
experimental embryology using cell transplantations reached its technical limits, genetic-fate
mapping in transgenic organisms was made
possible by using recombinase enzymes18
that can perform permanent genomic modifications in particular embryonic (and adult)
lineages, rendered visible by genetic reporter
activities17 (Fig. 4). This incredibly powerful
tool allows us to link the earliest genetic lineage
decisions directly with final morphology.
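The logic of recombinase-based fate mapping is that a transient molecular decision leaves a permanent, heritable mark. A toy model, assuming a hypothetical `Cell` record and a driver gene that is on in only one mother cell, might look like this:

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    name: str
    expresses_driver: bool = False  # is the recombinase-driving gene on here?
    daughters: list = field(default_factory=list)


def label_lineage(cell, inherited=False):
    """Return {cell name: reporter on?} for a cell and all its descendants.

    Recombination is modelled as permanent: once it has happened in a
    mother cell, every descendant carries the active reporter, whether
    or not the driver gene is still expressed.
    """
    recombined = inherited or cell.expresses_driver
    labels = {cell.name: recombined}
    for daughter in cell.daughters:
        labels.update(label_lineage(daughter, recombined))
    return labels


# Hypothetical lineage: the driver is on in mother cell A only.
mother_a = Cell("A", expresses_driver=True,
                daughters=[Cell("A1"), Cell("A2")])
mother_b = Cell("B", daughters=[Cell("B1"), Cell("B2")])
zygote = Cell("zygote", daughters=[mother_a, mother_b])

labels = label_lineage(zygote)
labelled = sorted(name for name, on in labels.items() if on)
print(labelled)  # prints: ['A', 'A1', 'A2']
```

The daughters A1 and A2 never express the driver themselves, yet they are labelled, which is what lets an early lineage decision be read out in final morphology.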
Curiously, this technique has only rarely
been used to address evolutionary questions.
Evolutionary change in embryonic development can occur in either of two ways: if there
are changes to the signals that a spatially and
temporally invariant lineage in the embryo
receives; or if the spatial extent and identity
of a given lineage itself changes. Genetic lineage labelling provides an excellent way of
discriminating between these possibilities.
To perform such comparisons, homologous
genes and their regulatory regions (which are
indicative of particular homologous lineages)
need to be used experimentally in a variety of
phylogenetically relevant taxa. These must be
of sufficient age and phylogenetic distribution
that statements of homology can hold for all
organisms concerned (blue in Fig. 4).
The current resolution of attempts to label
primary embryonic lineages barely goes beyond
the tissue types recognized by classical embryologists. However, as every single descending
cell of a given lineage is labelled, we can make
strong inferences if these cells end up at interesting places and display interesting behaviours,
with ‘class’ properties becoming measurable.
Such work can reveal that patterning — for
example in connectivity between structural
elements — remains constant, whereas the patterns of differentiation and growth change16,17,19.
However, fate-mapping more subtle molecular
subdivisions in embryos, which can be studied
in a sufficiently diverse number of taxa, is currently beyond us.
Efficient transgenesis has been the most arduous obstacle to date, at least in vertebrates. The
latest advances in transposon20, meganuclease21
and lentiviral22 transgenesis promise to
broaden the range of species that are amenable
to genetic labelling with single-cell precision. It is in single cells belonging to genetically defined homologous lineages that we
need to study homologous gene networks
and regulatory components and their evolution (Fig. 4).
[Figure 4, upper panel, not reproduced: comparative expression and genomics of patterning genes (patterning gene A) across lamprey, osteostracans, placoderms, shark and mouse.]
Evolution of gene regulation
The central players of evolutionary change
are likely to be elements of the gene-regulatory machinery, transcription factors and
their cognate genomic binding regions,
which are clustered in ‘cis-regulatory’
modules (CRMs) and promoters23. Little
evidence has so far emerged for a role of
chromatin or small RNAs in evolutionary changes of morphology, but this may
be an artefact of observational bias. Ultimately, major morphological changes can
be viewed as epiphenomena of dynamic
changes in RNA–Pol II holoenzyme complexes (HECs) acting on regulatory gene
nodes in key morphogenetic circuits
(Fig. 5). This idea has been refined over
the years in cogent discussions of ‘evolvability’24, but there are few specific examples
in metazoans where we can assign major
structural changes to specific gene-regulatory causes.
Evolutionary ‘tinkering’ has occurred on
many regulatory levels by the acquisition
or loss of network components or changed
functionalities. During the evolution of the
wing gene regulatory network in various
ant species, for example, the loss of key network components has occurred independently several times25. Existing networks can
be co-opted for new patterning purposes
at different places in the embryo. During
the evolution of fins in early vertebrates,
a pre-existing programme of nested Hox
gene expression in the unpaired fins of
agnathans was co-opted to a new embryonic location (the recently evolved paired
fins) in the early jawed-vertebrate stem
lineage26. Similarly, the evolution of promoters was implicated in the acquisition
of new Hox expression domains in the
mesoderm between agnathans and jawed
vertebrates27.
Regulatory networks can also be redeployed at similar places later in development,
such as the insect appendicular patterning
system that later becomes responsible for
wingspot patterning in butterflies28,29. Occasionally, evolution can occur within the regulatory proteins themselves: the arthropod
Ubx homeobox gene, for example, acquired
a protein domain responsible for the loss of
abdominal legs in insects, in contrast to its
homologue in the crustacean-like ancestor30.
Most of these phenomena were due to genes
being turned on or off, or being present or
absent. Improved transgenesis tools and
comparative genomics will soon allow us to
test regulatory network components functionally in a more quantitative manner.
Figure 4 | Evolution, development and systems biology.
These map the changes in information flow through
biological systems across scales of organization.
Phylogenetically conserved regulatory regions are used to
map pertinent embryonic lineages. Expression profiling
of those lineages allows the investigation and inference of
the regulatory networks underlying their behaviour.
[Diagram not reproduced. Panel labels: patterning gene A; gene A recombinase × ubiquitous promoter, STOP, reporter, giving ubiquitous promoter, reporter (transgenesis); homologous lineage and structures in species 1 and species 2; differential three-dimensional morphology and fate allocations (structural novelties); comparative analysis of gene-regulatory networks using gene and CRM orthologies (gene loss, new FFL, new arcs, regulatory loss); molecular causes of evolutionary continuity and change in gene regulation.]
The main technological problems are
currently being solved, clearing the way for
truly genomic systems analysis of morphological evolution that can go beyond examples of ephemeral beauty. Methods now
exist for the transcriptome-wide profiling
of single cells in particular lineages at specific places in embryos, using a combination
of laser-capture microdissection, single-cell
cDNA synthesis and microarray analysis31.
In combination with genetic lineage labelling, these methods allow us to determine
the minimal regulatory toolkit used by lineages at key morphogenetic stages.
Although comparative cDNA measurements have provided fundamental insights
into the evolution of beaks in Darwin’s
finches32, for example, current array-based
platforms are rather inflexible and costly.
They are likely to be replaced by second-generation massively parallel sequencing
technology33, which opens up transcriptome
analysis to new species: entire cDNA libraries
could be sequenced expeditiously and digital
information about differential RNA abundance in homologous single cells obtained,
irrespective of their species origin.
These experimental platforms turn
a previously insurmountable task into
a computational mapping exercise that
can take advantage of our ever-growing
pool of sequence-homology information.
Probabilistic methods5,34 can then use this
information to infer network similarities
(and differences) in homologous lineages
— something already in our reach. To avoid
bias for genes and pathways well-known to
biologists, we need the independence of
statistical network analysis. Automated network inference35 needs to be constrained by
additional information, such as the presence of statistically significant transcription-factor binding sites on CRMs36.
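Once networks have been inferred for homologous lineages in two species, comparing them is a mapping exercise over orthologous genes. The sketch below uses a plain set comparison, not the probabilistic methods cited above, and every gene name, edge and orthology assignment in it is hypothetical.

```python
def compare_networks(edges_1, edges_2, orthologs):
    """Compare two regulatory networks through a gene-orthology map.

    edges_1, edges_2 -- iterables of (regulator, target) pairs
    orthologs        -- dict mapping species-1 genes to species-2 genes
    """
    mapped, no_ortholog = set(), set()
    for regulator, target in edges_1:
        if regulator in orthologs and target in orthologs:
            mapped.add((orthologs[regulator], orthologs[target]))
        else:
            # At least one partner has no counterpart (for example, gene loss).
            no_ortholog.add((regulator, target))
    edges_2 = set(edges_2)
    return {
        "conserved": mapped & edges_2,
        "absent_in_2": mapped - edges_2,   # candidate regulatory loss
        "only_in_2": edges_2 - mapped,     # candidate new arcs
        "no_ortholog": no_ortholog,
    }


# Hypothetical networks for one homologous lineage in two species.
edges_sp1 = [("sp1_A", "sp1_B"), ("sp1_B", "sp1_C"), ("sp1_A", "sp1_D")]
edges_sp2 = [("sp2_A", "sp2_B"), ("sp2_A", "sp2_C")]
orthologs = {"sp1_A": "sp2_A", "sp1_B": "sp2_B", "sp1_C": "sp2_C"}

result = compare_networks(edges_sp1, edges_sp2, orthologs)
for category, edges in result.items():
    print(category, sorted(edges))
```

The four categories correspond loosely to the outcomes sketched in Fig. 4: conserved arcs, regulatory loss, new arcs and gene loss. A real analysis would also have to weigh the uncertainty of each inferred edge, which this sketch ignores.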
The speed of CRM discovery by comparative genomics has been accelerating ever
since a larger number of phylogenetically
interesting taxa were sequenced in their
entirety37. CRMs are being detected with
ever-increasing coverage, and dramatic
improvements in the detection of true
CRMs with a conservation score below 75%
are still to be made.
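The text does not define its conservation score. One common choice, assumed here purely for illustration, is percentage identity within a sliding window over a pairwise alignment. The sequences below are invented.

```python
def window_identity(aligned_a, aligned_b, window):
    """Percentage identity in each sliding window of a pairwise alignment.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Gap columns never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for start in range(len(aligned_a) - window + 1):
        pairs = zip(aligned_a[start:start + window],
                    aligned_b[start:start + window])
        matches = sum(1 for x, y in pairs if x == y and x != "-")
        scores.append(100.0 * matches / window)
    return scores


# Invented aligned sequences from two species.
seq_1 = "ACGTACGTAC"
seq_2 = "ACGTAGCAAC"

scores = window_identity(seq_1, seq_2, window=5)
below_threshold = [i for i, s in enumerate(scores) if s < 75.0]
print(scores)           # prints: [100.0, 80.0, 60.0, 40.0, 40.0, 40.0]
print(below_threshold)  # prints: [2, 3, 4, 5]
```

Windows that fall under the threshold are the ones a purely conservation-based screen would discard, which is why functional CRMs in that range are hard to detect.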
CRMs determine the place and timing of gene action38. Current estimates for
CRM numbers in vertebrates are in the few
thousands, but this is probably a significant
underestimate (G. K. and S. Ott, unpublished data). In CRMs, genes measure transcriptional inputs in ways we do not yet
understand. CRMs are likely to act as logic
functions39, coincidence detectors, filters,
gradient sensors and resistors, all of which
ultimately influence the kinetics of activators, repressors, SRB/mediator complexes
and Pol II-HECs, grouped in ‘factories’40.
Because HECs have not changed much
structurally, the major structural source of
evolutionary novelty is likely to be found in the
composition and action of the CRMs themselves. To begin with, these need to be classified bioinformatically. We also need to develop
more39 statistical methods that can take into
account the phylogenetic conservation of the
order of binding sites within CRMs, to highlight the core architecture of ‘enhanceosome
complexes’. For now, it is not clear whether
‘enhanceosomes’41 (where the order of binding
sites matters due to synergistic, orientation-specific transcription-factor binding) or ‘billboard enhancers’42 (where the order of transcription-factor binding sites matters little) are the
more frequent; this would shed light on our ideas
of evolutionary novelty, as temporal changes
are known to affect spatial expression patterns.
The most famous example is the structural
co-linearity of Hox genes and their temporal
and axial deployment in vertebrates43,44.
Figure 5 | Gene regulation as a driver of morphological change. The mechanistic core of gene control
in evolving embryos and structures is shown.
[Diagram not reproduced. Top: cis-regulatory module (CRM) functional architecture, with distal enhancer, enhancer, transcription factors, SRB/mediator, core promoter, RNA–Pol II-holoenzyme complex and signalling pathways. Below, three levels pair gene regulation with evolution; asterisks mark systems-biology approaches.]
1. Genomic information flow. Gene regulation: new regulators and genomic binding sites (automated network inference*). Evolution: co-option of genes into old network motifs; evolution of feed-forward loops and feed-back loops.
2. Spatial patterns. Gene regulation: transcriptional sensors, filters for morphogens and signalling pathways, logic functions (Gaussian, boolean functions*). Evolution: size and structure of morphogenetic fields; shape changes of their descending structures.
3. Temporal patterns. Gene regulation: changes in binding affinities and abundance, Pol II promoter docking and escape, noise propagation and reduction (ODE/SDE modelling, statistical mechanics*). Evolution: speed of ontogeny, larval stage timing, heterochrony.
*Systems biology
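One way to ask whether binding-site order is conserved between orthologous CRMs is to compare the longest run of sites that occur in the same order with the number of sites the two CRMs share. The sketch below is only a starting point under that assumption: the score is hypothetical, the site labels are placeholders, and a usable method would need a null model for chance order conservation.

```python
from collections import Counter


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def order_conservation(sites_1, sites_2):
    """Fraction of shared binding sites that also occur in the same order.

    Values near 1 fit an enhanceosome-like CRM (order preserved); low
    values with the same site content fit a billboard-like CRM.
    """
    shared = sum((Counter(sites_1) & Counter(sites_2)).values())
    if shared == 0:
        return 0.0
    return lcs_length(sites_1, sites_2) / shared


# Placeholder site labels along a CRM in three species.
crm_species_1 = ["A", "B", "C", "D"]
crm_species_2 = ["A", "B", "C", "D"]   # same sites, same order
crm_species_3 = ["D", "C", "B", "A"]   # same sites, order reversed

print(order_conservation(crm_species_1, crm_species_2))  # prints: 1.0
print(order_conservation(crm_species_1, crm_species_3))  # prints: 0.25
```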
Any functional classification needs to
capture spatial and temporal aspects of gene
activation in relation to these CRMs. This is
tedious and technically difficult to do as it
requires the combinatorial cloning of CRMs
and slow transgenic assays in cells or organisms. Fast transgenesis protocols are now
available for invertebrates such as arthropods,
sea-urchins45 and sea-squirts46. Recent studies
on vertebrates suggest that only a fraction of
‘ultra-conserved’ CRMs are active and absolutely required for the animal to survive47. As
CRMs were assayed at only a few embryonic
time points, the absence of regulatory information cannot be construed as a lack of function.
It is not clear when, and in which cell types,
organ-system specific CRMs are expected to be
active, so inferences from such studies should
be treated with great caution. We have no idea
what compensatory mechanisms will kick in to
obviate the regulatory bottlenecks when CRMs
are experimentally removed. Even the genomic
removal of expected key CRMs belonging to
MyoD, a master regulator of myogenesis, had
only a moderate effect on the timing of gene
expression48, suggesting that the CRMs might
be interacting in complex ways. Some CRMs
might modulate the activities of others, so
their effects might not be apparent unless their
activity is assayed in a combinatorial fashion
over time. We need to develop combinatorial
strategies involving more subtle loss- and gain-of-function testing of CRMs if we are to reveal
such regulatory logic comprehensively and
comparatively.
Genomic systems biology
Transcriptional regulatory networks can be
subdivided according to the way information flows49. By analysing recurring network
motifs of gene regulation among living species in a phylogenetic framework (functional
testing has not been done for metazoans), we
can compare their ‘computational function’
and trace their evolution. How did certain
transcriptional inputs at one point of the network lead to certain outputs (numbers of RNA
molecules) at another point, and how did this
change? In insects, certain upstream ‘selector genes’ not only activate batteries of intermediate targets but also act directly on most
downstream genes that encode structural components such as photoreceptors or pigmentation enzymes, much as the chief executive of a
company talks to the secretary50.
This constitutes a ‘feed-forward loop’. In
many cases, such as the Ubx- or Pax6-dependent51,52 networks that underlie wing and eye
development and evolution, the communication between chief executive and secretary was
the starting point, with the feed-forward loop
developing later. When did this happen, how
often and why? Studies of prokaryotes have
suggested that feed-forward loops have specific
computational functions49, but these need to
be explored functionally in metazoans and in
a comparative manner.
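In graph terms, a feed-forward loop is a triple in which a regulator X controls Y, Y controls Z, and X also controls Z directly. A minimal sketch for finding such triples, and for contrasting a network before and after the loop arises, is given below; the gene names and both toy networks are hypothetical.

```python
def feed_forward_loops(edges):
    """Return sorted (x, y, z) triples with x -> y, y -> z and x -> z."""
    targets = {}
    for regulator, target in edges:
        targets.setdefault(regulator, set()).add(target)
    loops = []
    for x, x_targets in targets.items():
        for y in x_targets:
            if y == x:
                continue  # ignore self-regulation
            for z in targets.get(y, ()):
                if z != x and z != y and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)


# Hypothetical ancestral state: the selector acts directly on a structural gene.
ancestral = [("selector", "structural")]

# Hypothetical derived state: an intermediate regulator has been added.
derived = ancestral + [("selector", "intermediate"),
                       ("intermediate", "structural")]

print(feed_forward_loops(ancestral))  # prints: []
print(feed_forward_loops(derived))    # prints: [('selector', 'intermediate', 'structural')]
```

Counting motifs in this way says when a loop is present, not what it computes; the functional question raised in the text still requires experiments.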
Until this happens, talk about ‘co-option’ or
‘homology of regulatory cassettes’ is premature. Expectations about ‘master regulators’ (of
processes such as gastrulation, neural-crest
formation, skeletogenesis or eye formation) may
have to be reassessed; the chief executive might
leave and a new one join while the company
remains unchanged, or the other way round.
In vertebrates the expression domains of most
patterning ‘master regulators’ have not changed
significantly since agnathan times, despite
obvious signs of morphological evolution.
This does not mean that targets or motif
dynamics have remained unchanged. We
do not have the data to see static (let alone
dynamic) differences in network motifs in a
comparative manner. Motif deployment and
speed will play key roles as developmental
systems biologists try to observe these gene-regulatory circuits in action (Fig. 4).
Although information about spatial regulation can be obtained from whole embryos,
temporal regulation through CRMs is better
studied in judiciously chosen single-cell assays.
Advances in massively parallel imaging and
tracking assays could yield sufficient temporal resolution for modellers to survey a host of
functional input–output functions, a question
at the cutting edge of genomic systems biology.
Such assays will allow us to test CRMs of different species and check whether temporal regulation might have evolved and, if so, whether it
can be ascribed to CRMs directly, as opposed to
transcription factors or the speed of the activator, repressor or RNA–Pol II HECs occupying
them. Such speed differences in homologous
network motifs will provide valuable insights
into the causes of evolutionary heterochrony
if they are focused on those genes deemed relevant in a dialogue between development and
palaeontology.
When evolutionary biologists have inferred
the composition of ancestral CRM sequences
and transcription factors, we can use mathematical tools of systems analysis to formulate
the dynamics of their action, capturing both
deterministic and stochastic effects53. The
dynamics at each gene network node are based
on a few molecular complexes sitting on, and
sliding along, two molecules of DNA (in a diploid organism), probably in non-equilibrium
states. They are therefore subject to significant
stochastic, mesoscale effects, and it is rather
surprising to see any deterministic behaviour
at all in patterning over developmental and
evolutionary timescales.
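The stochastic character of a single network node can be made concrete with a standard exact simulation. The sketch below assumes a two-state (on/off) promoter on each of two alleles, transcription from active alleles and first-order RNA decay, simulated with Gillespie's algorithm. The model and all rate constants are illustrative assumptions, not measurements or the formulation of ref. 53.

```python
import random


def gillespie_two_alleles(k_on, k_off, k_tx, k_deg, t_end, seed=0):
    """Simulate RNA counts from two independently switching alleles.

    Returns a list of (time, active alleles, RNA molecules) tuples.
    """
    rng = random.Random(seed)
    t, active, rna = 0.0, 0, 0
    trajectory = [(t, active, rna)]
    while True:
        rates = [
            k_on * (2 - active),  # an inactive allele switches on
            k_off * active,       # an active allele switches off
            k_tx * active,        # one RNA is made from an active allele
            k_deg * rna,          # one RNA is degraded
        ]
        total = sum(rates)
        if total == 0:
            break
        t += rng.expovariate(total)  # waiting time to the next event
        if t > t_end:
            break
        event = rng.choices(range(4), weights=rates)[0]
        if event == 0:
            active += 1
        elif event == 1:
            active -= 1
        elif event == 2:
            rna += 1
        else:
            rna -= 1
        trajectory.append((t, active, rna))
    return trajectory


def time_average_rna(trajectory, t_end):
    """Time-weighted mean RNA count over [0, t_end]."""
    total = 0.0
    for (t0, _, rna), (t1, _, _) in zip(trajectory, trajectory[1:]):
        total += rna * (t1 - t0)
    last_t, _, last_rna = trajectory[-1]
    total += last_rna * (t_end - last_t)
    return total / t_end


trajectory = gillespie_two_alleles(k_on=1.0, k_off=1.0, k_tx=10.0,
                                   k_deg=1.0, t_end=500.0, seed=1)
print(round(time_average_rna(trajectory, 500.0), 2))
```

For this model the deterministic steady-state mean is 2 · k_on/(k_on + k_off) · k_tx/k_deg, which is 10 molecules for the rates chosen here. Individual trajectories fluctuate widely around that value, which is the mesoscale noise the text refers to.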
Determinism causes robust morphological
outcomes and enables us to establish anatomical homologies in the first place, so surely well-confined stochasticity needs to be the source
of change. If the highly conserved regulatory
apparatus that acts on a fundamentally stochastic background is expected to be central for
spatio-temporal regulation, it is surprising
to see considerable differences in the size of
embryos (and morphogenetic fields). Noise
tolerance53 for key regulators is likely to differ,
depending on whether they are expressed by a thousand cells or a single cell, and on the
organism’s ploidy status. Bigger morphogenetic
fields might be expected to afford less accuracy,
but do they? When were key noise constraints
laid down in history, and how big were the morphogenetic fields in the embryos of ancestors of
major phylogenetic groups?
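One way to frame the question about field size is a naive baseline that the essay itself does not propose: assume every cell in a field expresses a regulator independently, with the same cell-to-cell variability, and that the tissue reads out the field average. The sketch below uses a hypothetical function `cv_of_field_mean` and illustrative Gaussian expression levels to estimate how the coefficient of variation of that average changes with the number of cells. Under these assumptions it falls roughly as one over the square root of the cell number. Real fields need not behave this way, because neighbouring cells are coupled and positional information must be read locally.

```python
import random
import statistics


def cv_of_field_mean(n_cells, cv_single=0.3, mean_level=100.0,
                     trials=2000, seed=1):
    """Coefficient of variation of the field-averaged expression level.

    Assumes independent, identically distributed Gaussian expression
    per cell (an illustrative simplification, not a measured model).
    """
    rng = random.Random(seed)
    sd = cv_single * mean_level
    field_means = []
    for _ in range(trials):
        total = sum(rng.gauss(mean_level, sd) for _ in range(n_cells))
        field_means.append(total / n_cells)
    return statistics.pstdev(field_means) / statistics.mean(field_means)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, "cells: CV of field mean =", cv_of_field_mean(n))
```

Measured noise in homologous nodes that departs from this baseline would point to coupling, feedback or other constraints specific to a lineage.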
Until we can measure the noise of homologous network nodes in a variety of species, these
systems questions will remain unanswered.
Interspecific swaps of network components,
assayed in live single cells or organisms, will
help us find the basis of evolutionary novelty.
Differential ontogenetic speed is a likely source
of experimental difficulty, as indicated by recent
transgenic studies combining Hox-gene CRMs
and promoters from different species54.
Recent advances in massively parallel synthetic DNA chemistry55, currently used to
construct microorganismal genomes from
scratch, will one day allow us to synthesize
thousands of CRMs belonging to ancestral
genomes directly. This is not Jurassic Park
(yet), although advances in genome engineering may one day let us test a plethora of such
ancestral CRMs simultaneously and functionally in living cells and organisms. Such assays
will be necessary to assign evolutionary fitness
functions to specific CRMs and transcription-factor binding sites within them56, and to help
us formulate intragenomic constraints, such as
competition of different CRMs for RNA–Pol II
factories, which are currently below the radar
of genomic systems biology and its models.
Stochasticity in evolution
By establishing common measurement and
analysis platforms, palaeontology, developmental biology, gene regulation and systems
biology can each make unique contributions
towards explaining the history of life (Fig. 5).
Palaeontologists can reconstruct phylogenetic
trees, determine structural and molecular
‘ground states’ at key tree nodes, and discover
evolutionary directionalities. This allows them
to work out what needs to be explained mechanistically. Developmental biologists, armed
with their genetic toolkits, can identify the
smallest units of information within development as lineages whose alterations lead to the
observed anatomical changes, and refocus the
historian’s attention onto the most revealing
anatomical details.
Making homologous lineages the reference
point for studies of gene regulation will allow
us to identify shared regulatory network nodes
and motifs49 and to measure their dynamics
in vivo and in vitro (Fig. 4). With the help of
synthetic DNA chemistry55, inferred historical
genomic information can be tested functionally for its effects on gene regulation in live cells
and organisms. Systems-biology studies of network sensitivity and noise propagation57 can
then help to formulate the dynamics of deterministic and stochastic components53. The latter will identify ‘hotspots’ of systems’ flexibility
that are relevant to morphological evolution.
Direct experimental manipulation of such
hotspots might then enable us to ‘replay’ key
evolutionary processes within the lifetime
of single experimental organisms.
There might be some initial disappointment
that nature neither constructed its regulatory
circuits with an engineer’s intelligence nor used
Occam’s razor, whereas we must use both to
describe it. Historians can help us decipher
when and how new information was introduced, and into which old (and imperfect)
information networks. As information networks are corridors of constraints, they depend
on the states of their predecessors, subject to
modification by stochastic forces. By placing
the robustness and fragility12 of regulatory systems in a historical context, palaeontologists
can identify phases of ontogenetic ‘experimentation’ on a larger phylogenetic timescale,
at the base of the major metazoan radiations,
and identify the key players and the (changing)
rules in deep time.
The links between evolution and systems
biology are tenuous at the moment because
of limitations in what we can measure. An
elegant recent study of tooth patterning
that links the mathematical modelling of signalling pathways with the comparative analysis
of evolutionary diversity points in a new
direction58. Only a few metazoan systems are
known in which noise is harnessed to establish phenotypic diversity (for example, in Drosophila photoreceptor choice59) or to capacitate
evolutionary change (such as Hsp90)2. This
should not discourage us. Powerful tools will be
available to assess the effect of a few regulatory molecules in non-equilibrium states60 on
whole-organism structure and evolution.
We can now see more than Darwin could
ever have imagined. Comparing species genetically helps us see similarities and differences on
each organizational level. The analytical tools
are almost in place to integrate data sets into an
edifice of human knowledge that transcends
our inherited intuitive myopia. More genome
detectives should engage with the historians of
life in an effort to connect the dots and enter
the quest for the historical outlines of the elephant, rich and strange.
■
Georgy Koentges is experimental co-director of
the Warwick Systems Biology Centre, University
of Warwick, Coventry CV4 7AL, UK.
1. Rumi, Mawlana Jalaluddin The Elephant in the Dark
Room (13th cent.) in Mathnawi-yi ma’nawi (Masnavi) III
1259–1268 (ed. Nicholson, R. A.) (London 1925–1940).
2. Rutherford, S. L. & Lindquist, S. Nature 396, 336–342 (1998).
3. Bridgham, J. T., Carroll, S. M. & Thornton, J. W. Science 312,
97–101 (2006).
4. Wray, G. A. et al. Mol. Biol. Evol. 20, 1377–1419 (2003).
5. Pearl, J. Causality: Models, Reasoning, and Inference
(Cambridge Univ. Press, 2000).
6. Guido, N. J. et al. Nature 439, 856–860 (2006).
7. Raff, R. A. Nature Rev. Genet. 8, 911–920 (2007).
8. Janvier, P. Early Vertebrates (Oxford Univ. Press, 1996).
9. Witmer, L. M. in Functional Morphology in Vertebrate
Paleontology (ed. Thomason, J. J.) 19–33 (Cambridge Univ.
Press, New York, 1995).
10. Trinajstic, K., Marshall, C., Long, J. & Bifield, K. Biol. Lett. 3,
197–200 (2007).
11. Donoghue, P. C. et al. Nature 442, 680–683 (2006).
12. Stelling, J., Sauer, U., Szallasi, Z., Doyle, F. J. III & Doyle, J.
Cell 118, 675–685 (2004).
13. Martinez-Arias, A. & Lawrence, P. A. Nature 313, 639–642
(1985).
14. Lumsden, A. & Keynes, R. Nature 337, 424–428 (1989).
15. Larsen, C. W., Zeltser, L. M. & Lumsden, A. J. Neurosci. 21,
4699–4711 (2001).
16. Koentges, G. & Lumsden, A. Development 122, 3229–3242
(1996).
17. Matsuoka, T. et al. Nature 436, 347–355 (2005).
18. Gu, H., Marth, J. D., Orban, P. C., Mossmann, H. &
Rajewsky, K. Science 265, 103–106 (1994).
19. Huang, R., Zhi, Q., Patel, K., Wilting, J. & Christ, B.
Development 127, 3789–3794 (2000).
20. Balciunas, D. et al. PLoS Genet. 2, e169 (2006).
21. Ogino, H., McConnell, W. B. & Grainger, R. M. Nature
Protocols 1, 1703–1710 (2006).
22. Lois, C., Hong, E. J., Pease, S., Brown, E. J. & Baltimore, D.
Science 295, 868–872 (2002).
23. Davidson, E. H. The Regulatory Genome: Gene Regulatory
Networks in Development and Evolution (Academic Press/
Elsevier, San Diego, 2006).
24. Kirschner, M. & Gerhart, J. Proc. Natl Acad. Sci. USA 95,
8420–8427 (1998).
25. Abouheif, E. & Wray, G. A. Science 297, 249–252 (2002).
26. Freitas, R., Zhang, G. & Cohn, M. J. Nature 442, 1033–1037
(2006).
27. Carr, J. L., Shashikant, C. S., Bailey, W. J. & Ruddle, F. H.
J. Exp. Zool. 280, 73–85 (1998).
28. Keys, D. N. et al. Science 283, 532–534 (1999).
29. Prud’homme, B., Gompel, N. & Carroll, S. B. Proc. Natl Acad.
Sci. USA 104 (suppl. 1), 8605–8612 (2007).
30. Ronshaugen, M., McGinnis, N. & McGinnis, W. Nature 415,
914–917 (2002).
31. Tietjen, I. et al. Neuron 38, 161–175 (2003).
32. Abzhanov, A. et al. Nature 442, 563–567 (2006).
33. Hafner, M. et al. Methods 44, 3–12 (2008).
34. Friedman, N. Science 303, 799–805 (2004).
35. Pournara, I. & Wernisch, L. Bioinformatics 20, 2934–2942
(2004).
36. Hansen, A., Ott, S. & Koentges, G. Genome Inform. 15,
141–150 (2004).
37. Stark, A. et al. Nature 450, 219–232 (2007).
38. Istrail, S. & Davidson, E. H. Proc. Natl Acad. Sci. USA 102,
4954–4959 (2005).
39. Tsong, A. E., Tuch, B. B., Li, H. & Johnson, A. D. Nature 443,
415–420 (2006).
40. Faro-Trindade, I. & Cook, P. R. Biochem. Soc. Trans. 34,
1133–1137 (2006).
41. Maniatis, T. et al. Cold Spring Harb. Symp. Quant. Biol. 63,
609–620 (1998).
42. Arnosti, D. N. & Kulkarni, M. M. J. Cell. Biochem. 94,
890–898 (2005).
43. Smith, J., Theodoris, C. & Davidson, E. H. Science 318,
794–797 (2007).
44. Kmita, M. & Duboule, D. Science 301, 331–333 (2003).
45. Damle, S., Hanser, B., Davidson, E. H. & Fraser, S. E.
Dev. Biol. 299, 543–550 (2006).
46. Roure, A. et al. PLoS One 2, e916 (2007).
47. Pennacchio, L. A. et al. Nature 444, 499–502 (2006).
48. Chen, J. C., Ramachandran, R. & Goldhamer, D. J. Dev. Biol.
245, 213–223 (2002).
49. Alon, U. An Introduction to Systems Biology: Design Principles
of Biological Circuits (Chapman & Hall, 2006).
50. Akam, M. Curr. Biol. 8, R676–R678 (1998).
51. Weatherbee, S. D. et al. Curr. Biol. 9, 109–115 (1999).
52. Gehring, W. J. Zoology 104, 171–183 (2001).
53. Blake, W. J., Kærn, M., Cantor, C. R. & Collins, J. J. Nature
422, 633–637 (2003).
54. Beckers, J., Gérard, M. & Duboule, D. Dev. Biol. 180,
543–553 (1996).
55. Tian, J. et al. Nature 432, 1050–1054 (2004).
56. Hittinger, C. T. & Carroll, S. B. Nature 449, 677–681 (2007).
57. Raser, J. M. & O’Shea, E. K. Science 309, 2010–2013 (2005).
58. Kavanagh, K. D., Evans, A. R. & Jernvall, J. Nature 449,
427–432 (2007).
59. Wernet, M. F. et al. Nature 440, 174–180 (2006).
60. Nicodemi, M. & Prisco, A. Phys. Rev. Lett. 98, 108104 (2007).
Acknowledgements I thank Keith Vance, Sascha
Ott and Per Ahlberg for discussions. I gratefully
acknowledge advice from Sheila Canby (British
Museum) about Persian miniatures and Rumi. Benoît
Junod, curator at the Aga Khan Trust for Culture in
Geneva, provided the photograph of the elephant. I
apologize to my colleagues for not being able to cite all
relevant publications. Work in my laboratory is funded
by the Wellcome Trust, the Human Frontier Science
Program, the ARC, ConquerChiari and the BBSRC.