Molecular Plant Taxonomy (2020) (015-045)


Chapter 1

Plant Taxonomy: A Historical Perspective, Current Challenges, and Perspectives
Germinal Rouhan and Myriam Gaudeul

Abstract
Taxonomy is the science that explores, describes, names, and classifies all organisms. In this introductory
chapter, we highlight the major historical steps in the elaboration of this science, which provides baseline
data for all fields of biology and plays a vital role for society but is also an independent, complex, and sound
hypothesis-driven scientific discipline.
In the first part, we underline that plant taxonomy is one of the earliest scientific disciplines: it emerged
thousands of years ago, even before the important contributions of the Greeks and Romans (e.g.,
Theophrastus, Pliny the Elder, and Dioscorides). In the fifteenth–sixteenth centuries, plant taxonomy benefited
from the Great Navigations, the invention of the printing press, the creation of botanic gardens, and the use
of the drying technique to preserve plant specimens. In parallel with the growing body of
morpho-anatomical data, subsequent major steps in the history of plant taxonomy include the emergence of the
concept of natural classification, the adoption of the binomial naming system (with the major role of
Linnaeus) and other universal rules for the naming of plants, the formulation of the principle of
subordination of characters, and the advent of evolutionary thought. More recently, the cladistic theory (initiated
by Hennig) and the rapid advances in DNA technologies have made it possible to infer phylogenies and to
propose truly natural, genealogy-based classifications.
In the second part, we emphasize the challenges that plant taxonomy faces today. The still
very incomplete taxonomic knowledge of the worldwide flora (the so-called taxonomic impediment) is
seriously hampering conservation efforts that are especially crucial as biodiversity has entered its sixth
extinction crisis. This impediment appears to be mainly due to insufficient funding, lack of taxonomic
expertise, and lack of communication and coordination. We then review recent initiatives to overcome these
limitations and to anticipate how taxonomy should and could evolve. In particular, the use of molecular data has been
era-splitting for taxonomy and may allow an accelerated pace of species discovery. We examine both
strengths and limitations of such techniques in comparison to morphology-based investigations, we give
broad recommendations on the use of molecular tools for plant taxonomy, and we highlight the need for an
integrative taxonomy based on evidence from multiple sources.

Key words Classification, Floras, DNA, History, Molecular taxonomy, Molecular techniques,
Morpho-anatomical investigations, Plant taxonomy, Species, Taxonomic impediment

Pascale Besse (ed.), Molecular Plant Taxonomy: Methods and Protocols, Methods in Molecular Biology, vol. 2222,
https://doi.org/10.1007/978-1-0716-0997-2_1, © Springer Science+Business Media, LLC, part of Springer Nature 2021

1 Introduction

Adapting the famous aphorism of Theodosius Dobzhansky [1],
could we dare to say that nothing in biology makes sense except
in the light of taxonomy? Maybe yes, considering that most of
biology relies on identified—and so described—species that are
end products of taxonomy. Taxonomic information is obviously
crucial for studies that analyze the distribution of organisms on
Earth, since they need taxonomic names for inventories and
surveys. But names are also needed to report empirical results
from any other biological study dealing with, e.g., biochemistry,
cytology, ecology, genetics, or physiology: even if working an entire
life on a single species, e.g., Arabidopsis thaliana (L.) Heynh., a
molecular biologist will focus all his/her research on numerous
plants that all represent this species as delimited by taxonomy.
Thus, taxonomy provides names, but it is not only a ‘biodiversity-
naming’ service: it is also a scientific discipline requiring theoretical,
empirical, and epistemological rigor [2]. Names represent scientific
hypotheses on species boundaries, and to put forward such hypoth-
eses involves gathering information from characters of the organ-
isms and adopting a species concept (see Note 1 for an overview of
the main species concepts). Morphology, anatomy, and genetics are
the main sources of characters used in today’s plant taxonomy.
While all these types of characters can bring valuable evidence,
the focus of this book is on the use of nucleic acids (together
with genome size and chromosomes) for a reliable and
efficient taxonomy.
Before discussing how to choose genomic regions to be studied
in order to best deal with particular taxonomic issues (Chapter 2),
this chapter aims to summarize the history of taxonomy and to
highlight that plant molecular taxonomy emerged from an ancient
discipline that has been, and is still, central to other scientific
disciplines and plays a vital role for society. We will also give a
brief overview of the general context in which plant taxonomy
is practiced today and propose some general considerations
about molecular taxonomy.

2 Taxonomy and Taxon: Terminology and Fluctuating Meanings

It was not until 1813 that the Swiss botanist Augustin Pyramus de
Candolle (1778–1841) coined the neologism ‘taxonomy’ from
the Greek τάξις (order) and νόμος (law, rule) and published it for
the first time in his book Théorie élémentaire de la Botanique (‘Ele-
mentary Theory of Botany,’ [3]). He defined this scientific disci-
pline as the ‘theory of the classifications applied to the vegetal
kingdom,’ which he considered as one of the three components
of botany, along with glossology (‘the knowledge of the terms used
to name plant organs’) and phytography (‘the description of plants
in the most useful way for the progress of science’).
Much later, the Global Biodiversity Assessment of the United
Nations Environment Programme (UNEP; [4]) defined taxonomy
as ‘the theory and practice of classifying organisms,’ including the
classification itself but also the delimitation and description of taxa,
their naming, and the rules that govern the scientific nomenclature.
Today, depending on the authors, taxonomy is viewed either as a
synonym for the ‘systematics’ science—also called biosystematics
[5, 6]—including the task of classifying species, or only as a com-
ponent of systematics restricted to the delimitation, description,
and identification of species. This latter meaning of taxonomy
emerged more recently, with the advent of phylogenetics as another
component of systematics that allows classifications based on the
evolutionary relationships among taxa [7].
Thus, it is ironic that taxonomy and systematics, which deal in
particular with classifications and relationships between organisms,
often themselves require clarification of their relative
circumscriptions and meanings before being used [8]. This book will
consider plant taxonomy in the broadest sense, ranging from species
delimitation based on different molecular techniques to population
genomics methods and studies resolving interspecific and
intergeneric hybrids.
Incidentally, it is interesting to note that the word ‘taxon’—
plural: taxa—was invented much later (Lam, in [9]) than ‘taxon-
omy’: a taxon is a theoretical entity intended to replace terms such
as ‘taxonomic group’ or ‘biodiversity unit’ [10], and ‘taxon’ refers
to a group of any rank in the hierarchical classification, e.g., species,
genus, or family.

3 A Historical Perspective to Plant Taxonomy

3.1 One of the Earliest Scientific Disciplines

Delimiting, describing, naming, and classifying organisms are
activities whose origins are obviously much older than the word
‘taxonomy’ (which dates back to the nineteenth century; see above). The
use of oral classification systems likely even predated the invention
of written language ca. 5600 years ago. Then, as for all vernacular
classifications, the precision of the words used to name plants
was notably higher for plants that were used by humans. There was
no attempt to link names and organisms in hierarchical classifications,
since the known plants were all named according to their use: some
were for food, others for medicines, poisons, or materials. As early
as that time, several hundred plants of various kinds
were identified, while relatively few animals were known and
named (basically those that were hunted or feared) [11].
These early classifications, which were exclusively utilitarian,
persisted until the fifteenth–sixteenth centuries, although some major
advances were achieved, mainly by ancient Greeks and Romans.
Paintings in Knossos (1900 BC) suggest that the Greeks early on
considered plants not just as useful but also as beautiful: they
show useful plants like barley, fig, and olive,
but also narcissus, roses, and lilies. The Greek Theophrastus
(372–287 BC), famous as the successor of Aristotle at the head of
the Lyceum, is especially well known as the first botanist and the
author of the first written works on plants. Interested in naming
plants and finding an order in the diversity of plants, he may have
been inspired by Aristotle, who started his Metaphysics with the
sentence: ‘All men by nature desire to know.’ Theophrastus was
indeed the first to provide us with a philosophical overview of
plants, pointing out fundamental questions for the
development of what would later be called taxonomy, such as ‘what
have we got?’ or ‘how do we differentiate between these things?’
He was moreover the first one to discuss relationships among
plants, and to suggest ways to group them not just based on their
usefulness or uses. Thus, in his book Enquiry into Plants, he
described ca. 500 plants—probably representing all known plants
at that time—that he classified as trees, shrubs, undershrubs, and
herbs. He also established a distinction between flowering and
nonflowering plants, between deciduous and evergreen trees, and
between plants that grew in water and those that did not. Even if
80% of the plants included in his works were cultivated, he had
realized that ‘most of the wild kinds have no names, and few know
about them,’ highlighting the need to recognize, describe, and
name plants growing in the wild [12]. Observing and describing
the known plants, he identified many characters that were valuable
for later classifications. For instance, based on his observations of
plants sharing similar inflorescences—later named ‘umbels’—he
understood that, generally, floral morphology could help to cluster
plants into natural groups and, several centuries later, most of these
plants showing umbels were indeed grouped in the family Umbel-
liferae—nowadays Apiaceae.
Theophrastus was so far ahead of his time that his
botanical ideas and concepts were lost for many centuries in
Europe. But his works survived in Persia and Arabia, before being
translated back into Greek and Latin and rediscovered in Europe in
the fifteenth century. During this long Dark Age for botany in
Europe (as for all other natural sciences), the Roman Pliny the
Elder (23–79 AD) and the Greek Dioscorides (~40–90 AD), both in
the first century AD, were nevertheless two important figures.
Although they did not improve the existing knowledge and meth-
ods about the description, naming, or classifications of plants, they
compiled the available knowledge and their written works were
renowned and widely used. The Naturalis Historia of Pliny
(77 AD) was indeed a rich encyclopedia of the natural world,
gathering 20,000 facts and observations reported by other authors,
mostly from Greeks like Theophrastus. At the same time in Greece,
plants were almost only considered and classified in terms of their
medical properties. The major work of Dioscorides, De Materia
Medica (ca. 77 AD), was long the sole source of botanical
information (at that time, botany was only considered in terms of
pharmacology) and was repeatedly copied in Europe until the
fifteenth century. Juliana’s book (the Juliana Anicia Codex, sixth
century; Fig. 1) is the most famous of these copies, well known
because it innovated by adding beautiful and colorful plant
illustrations to the written work of Dioscorides. Some paintings could
be seen as good visual aids to identification, which should be
considered an advance for taxonomy, but others were
fanciful [12]. All those plant books, called ‘herbals’ and used
throughout the Middle Ages by herbalists (who had some knowledge
about remedies extracted from plants), did not bring any
other substantial progress.

3.2 Toward a Scientific Classification of Plants

With the Renaissance, the fifteenth and sixteenth centuries saw the
beginning of the Great Navigations (e.g., C. Columbus discovered
the New World from 1492; Vasco da Gama sailed all around Africa
to India from 1497; F. Magellan’s expedition completed the first
circumnavigation of Earth in 1522), which made intensive and
large-scale naturalist explorations around the world possible: most of
the major territories, except Australia and New Zealand, had been
discovered as early as the middle of the sixteenth century, greatly
increasing the number of plants that were brought back to Europe either
by sailors themselves or by naturalists on board. At that time, herbalists still
played a major role in naming and describing plants, in association
with illustrators who were producing realistic illustrations. But
naming and classifying so many exotic and unknown plants
from the entire world would not have been possible without three
major innovations. Firstly, Gutenberg’s invention of the printing
press with movable type (1450–1455) made written works
on plants widely available in Europe; the first Latin translation of
Theophrastus’ books came out in 1483. Secondly, the first botanic
gardens were created in Italy in the 1540s, reflecting the increasing
public interest in plants and allowing botany to be taught.
Thirdly, in the botanic garden of Pisa, the Italian Luca Ghini
(1490–1556) invented a revolutionary method for preserving, and
so studying, plants: drying and pressing them
for permanent storage in books as ‘hortus siccus’ (dried garden),
today known as ‘herbaria’ or ‘herbarium specimens.’ These
lasting collections of dried plants were, and still are, a keystone
of plant taxonomy and its development: from that time,
any observation and experimental result could be linked to specific
plant specimens available for further identification and for the study of
morphology, geographic distribution, ecology, or any other
features. In short, with herbaria Ghini provided the basis of
reproducibility that is an essential part of the scientific method [13].

Fig. 1 Painting of a Cyclamen plant, taken from Juliana’s book, showing the flowering stems rising from
the upper surface of the rounded corm. According to Dioscorides, those plants were used as purgatives,
antitoxins, skin cleansers, labor inducers, and aphrodisiacs

A student of Ghini, Andrea Cesalpino (1519–1603), was the
first one since the Ancient Greeks to take up the work of
Theophrastus and to discuss it. He argued that plants should be
classified in a more natural and rational way than by solely
utilitarian thinking. Convinced that all plants have to reproduce, he
provided a new classification system primarily based on seeds and
fruits: in De Plantis libri XVI (1583), he described 1500 plants that
he organized into 32 groups such as the Umbelliferae and
Compositae (currently Apiaceae and Asteraceae, respectively). Cesalpino
also contributed to the naming of plants, sometimes
adding adjectives to nouns designating a plant; e.g., he distinguished
Edera spinosa (spiny ivy) from Edera terrestris (creeping ivy). This
could be seen as a prefiguration of the binomial naming system that
was established in the eighteenth century and is still used in
taxonomy. But the science of scientific naming was only starting, and
plants, like other living beings, were usually characterized by
several words forming polynomial Latin names: for instance,
tomato was designated as Solanum caule inermi herbaceo, foliis
pinnatis incisis, which means ‘Solanum with a smooth herbaceous
stem and incised pinnate leaves’ [14] (Fig. 2).
Cesalpino contributed to the emergence of the concept of
natural classification, i.e., a classification reflecting the ‘order of
Nature.’ This latter expression involved different interpretations
and classifications through the history of taxonomy, but a natural
classification was always intended to reflect the relationships among
plants. Because evolutionary thought had not yet developed, it
basically resulted in clustering plants with similar morphological
features. It must be noted that the distinction between artificial
and natural classifications (respectively named ‘systems’ and
‘methods’ at the end of the eighteenth century) is a modern
interpretation of past classifications. Taking advantage of both
technical progress, such as microscopy in the seventeenth century,
and scientific methods inspired by Descartes (1596–1650), several
attempts were made to reach such a natural classification. For
example, Bachmann—also known as Rivin or Rivinus
(1652–1723)—based his classification on the corolla shape in
Introductio ad rem herbariam in 1690. Altogether, the major inter-
est of these classifications is that they triggered investigations on
many morpho-anatomical characters that could be used by later
taxonomists to describe and circumscribe plant species. The British
John Ray (1627–1705) innovated by not relying anymore on a
single characteristic to constitute groups of plants: he suggested
natural groupings ‘from the likeliness and agreement of the princi-
pal parts’ of the plants, based on many characters—mostly relative
to leaves, flowers, and fruits. He documented more than 17,000
worldwide species in Historia Plantarum (1686–1704) and distin-
guished flowering vs. nonflowering plants, and plants with one
cotyledon, which he named ‘monocotyledons,’ vs. plants with
two cotyledons, ‘dicotyledons.’ Ray also played a major role in
the development of plant taxonomy—and more generally of plant
science—by creating the first text-based dichotomous keys that he
used as a means to classify plants [15].

Fig. 2 Herbarium specimen from the Tournefort’s Herbarium (housed at the Paris
national Herbarium, Muséum national d’Histoire naturelle, MNHN) displaying a
label with the hand-written polynomial name ‘Aconitum caeruleum, glabrum,
floribus consolid(ae) regalis’

In contrast to Ray and his method intended to be natural, his
French contemporary Joseph Pitton de Tournefort (1656–1708)
explored, in his Éléments de botanique (1694), the possibility of
classifying plants based on only a few characteristics related to the
corolla of flowers, creating an artificial system. The success of
Tournefort’s system resulted from the ease of identifying groups of
plants based on the number and relative symmetry of the petals of a
flower. Within his system, Tournefort precisely defined 698
entities (‘Institutiones rei herbariae,’ 1700), each being called a genus
(plural: genera). The genus concept was new and contributed to a
better structuring of the classification.

3.3 Naming Plant Names: Major Advances by Linnaeus

In spite of the numerous new ideas and systems produced from the
sixteenth to the middle of the eighteenth century, names of plants still
consisted of polynomial Latin names, i.e., a succession of
descriptors following the generic name. This was a rather long,
complicated, and impractical means of designating plants, and it became
problematic in the context of the Great Explorations, which
allowed the discovery of more and more plants from all over the
world (major explorations with naturalists on board included, e.g.,
the circumnavigation of La Boudeuse under Bougainville from
1766 to 1769, and the travels to the Pacific of J. Cook between
1768 and 1779). To overcome this impediment to the
naming of plants, the Swedish Carolus Linnaeus (1707–1778)
took a critical step forward for the development of taxonomy.
He suggested dissociating the descriptors of the plant from the
name itself, because according to him, the name should only serve
to designate the plant. Therefore, he assigned a ‘trivial name’ to
each plant (more than 6000 plants in Species Plantarum, 1753)
[16] and this name was binomial, only consisting of two words: the
‘genus’ followed by the ‘species,’ e.g., Adiantum capillus-veneris is
a binomen created by Linnaeus that is still known and used as such
to designate the Venus-hair fern. Although there had been some
attempts at binomials as early as Theophrastus (followed by
Cesalpino and a few others), Linnaeus succeeded in popularizing his
system as new, universal (applied to all plants and, later on, even
to animals in Systema Naturae [17]), and long-lasting. Indeed, Species
Plantarum [16] has been a starting point for setting rules in plant
taxonomy. In use from Linnaeus until today, the binomial system,
along with other principles for the naming of plants, was
developed, standardized, synthesized, and formally accepted by
taxonomists in a code of nomenclature, initially called ‘Laws of
botanical nomenclature’ [18] and nowadays called the
International Code of Nomenclature for algae, fungi, and plants (ICN).
The current code is revised every 6 years, when amendments
are adopted at an International Botanical Congress.
Linnaeus also proposed his own artificial classification. With the
goal to describe and classify all plants—and other living beings—
that were ‘put on Earth by the Creator,’ he grouped them based on
the number and arrangement of stamens and pistils within flow-
ers—contrary to Tournefort, who only focused on petals. He called
this classification a ‘sexual system,’ referring to the fundamental
role of flowers in sexual reproduction (Fig. 3). This system included
five hierarchical categories: varieties, species, genera, orders
(equivalent to current families), and classes.

Fig. 3 Linnaeus’s sexual system as drawn by G. D. Ehret for the Hortus Cliffortianus (1735–1748); this
illustration shows the 24 classes of plants that were defined by Linnaeus according to the number and
arrangements of stamens

3.4 The Advent of the Theory of Evolution and Its Decisive Impact on Taxonomy

The end of the eighteenth century was conducive to revolutionary
ideas in France, including new principles to reach the natural
classification. While studying how to arrange plants in space for the
new royal garden of the Trianon at the Palace of Versailles, Bernard
de Jussieu (1699–1777) applied the key principle of subordination
of characters, which was published in 1789 by his nephew
Antoine Laurent de Jussieu (1748–1836) in Genera Plantarum
[19]. Bernard and A. L. de Jussieu stated that a species, genus, or
any other taxon of the hierarchical classification should group
plants showing character constancy within the given taxon, as
opposed to the character variability observed among taxa. Since
not all characters are useful at the same level of the classification, the
principle of subordination led to a character hierarchy: characters
displaying higher variability should be given less weight than more
conserved ones in plant classifications. As a result, B. and A. L. de
Jussieu subordinated the characters of flowers—judged more vari-
able and therefore less suitable at higher levels—to the more con-
served characters of seeds and embryos. It was the first application
of this principle in taxonomy, and it could be interpreted today as a
way to limit homoplasy, though the concept of homoplasy had not
been elaborated yet [20].
Whereas botanical taxonomy had long been preponderant and
faster in its development than its zoological counterpart, the trend
was reversed at the beginning of the nineteenth century, especially
with the application of the principle of subordination of characters
to animals by the French biologists Jean-Baptiste de Lamarck
(1744–1829) and Georges Cuvier (1769–1832). New questions
then arose in the mind of taxonomists, who were not only inter-
ested in naming, describing, and classifying organisms anymore,
but also in elucidating how the observed diversity had been gener-
ated. Early explanatory theories included the theory of the trans-
mutation of species, proposed by Jean-Baptiste de Lamarck in 1809
in his Philosophie zoologique [21]. This was the first theory to
suggest the evolution of species, although it involved several
mistaken assumptions such as the notion of spontaneous
generation. Charles Darwin (1809–1882) published his famous theory
of evolution in On the Origin of Species (1859) [22], and intro-
duced the central concept of descent with modification that later
received extensive support and is still accepted today. This implied
that useful characters in taxonomy, the so-called homologous char-
acters, are those inherited from a common ancestor. Darwin indeed
predicted that ‘our classifications will come to be, as far as they can
be so made, genealogies’ (Darwin 1859, p. 486) [22]. In other
words, since the history of life is unique, only one natural
classification is possible, the one that reflects the phylogeny. This latter
word was, however, not coined by Darwin himself but by Ernst
Haeckel (1834–1919) in 1866, in his Generelle Morphologie der
Organismen [23]. Haeckel is commonly known for the first
illustration of a phylogeny, although Dayrat [24] showed that not
all of Haeckel’s illustrations should be interpreted as real
evolution-based phylogenetic
trees [23] (Fig. 4). However, Darwin did not provide any new
techniques or approaches to reconstruct the phylogeny or assist
practicing taxonomists in their work [25] and, in spite of his major
contributions, plant taxonomists therefore kept applying the
method of classification described by B. and A. L. de Jussieu even
after the onset of the evolutionary thought.

3.5 New Methods and New Sources of Characters for a Modern Taxonomy

In the 1960s, facing the subjectivity of the existing methods to
reconstruct phylogenies, the new concept of numerical taxonomy
proposed an entirely new way of examining relationships among
taxa. Robert Sokal (1926–2012) and Peter Sneath (1923–2011)
started developing this concept in 1963 [26] and elaborated it as
an objective method of classification. The method consisted of a
quantitative analysis of overall similarities between taxa, based on a
characters-by-taxa data matrix (with characters divided into
character states) and resulting in pairwise distances among taxa. But
this method was not based on any evolutionary theory, and the
resulting diagrams could therefore not be reasonably interpreted
in an evolutionary context or as an evolutionary classification.
Nevertheless, this theory flourished for a while, greatly benefiting
from rapid advances in computing.
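As an illustration only, the step from a characters-by-taxa matrix to pairwise distances can be sketched as follows. The taxa, the character-state codes, and the distance measure (the proportion of characters whose states differ) are assumptions made for this example; they are not taken from Sokal and Sneath [26], who discussed a range of similarity coefficients.

```python
from itertools import combinations

# Hypothetical characters-by-taxa matrix: every taxon is scored for the same
# ordered list of characters, each entry being a character-state code.
matrix = {
    "taxon_A": [0, 1, 1, 0, 2],
    "taxon_B": [0, 1, 0, 0, 2],
    "taxon_C": [1, 0, 0, 1, 0],
}


def pairwise_distances(matrix):
    """Return, for each pair of taxa, the proportion of differing characters."""
    distances = {}
    for a, b in combinations(sorted(matrix), 2):
        states_a, states_b = matrix[a], matrix[b]
        if len(states_a) != len(states_b):
            raise ValueError("all taxa must be scored for the same characters")
        mismatches = sum(x != y for x, y in zip(states_a, states_b))
        distances[(a, b)] = mismatches / len(states_a)
    return distances


distances = pairwise_distances(matrix)
for pair, distance in distances.items():
    print(pair, round(distance, 2))
```

Such a distance table measures overall similarity only: nothing in it distinguishes states inherited from a common ancestor from states acquired independently, which is why the resulting diagrams could not be read as evolutionary classifications.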
A crucial change in the way botanists practice taxonomy
occurred with the development of the cladistic theory and recon-
struction of phylogenies—using diagrams called cladograms—to
infer the evolutionary history of taxa. Willi Hennig (1913–1976)
initiated this revolution with his book Grundzüge einer Theorie der
Phylogenetischen Systematik, published in 1950 [27], but his ideas
were much more widely diffused in 1966 with the English transla-
tion entitled Phylogenetic Systematics [28]. The primary principle of
cladism, or cladistics, is not to use the overall similarity among taxa
to reconstruct the phylogeny, since similarity does not necessarily
reflect an actual close evolutionary relationship. Instead, Hennig
based the phylogenetic classification only on shared derived characters,
i.e., character states inherited by taxa from their most recent common
ancestor, as opposed to primitive (ancestral) characters. Every
taxonomic decision, from a species definition to a system of higher
classification, was to be treated as a provisional hypothesis, poten-
tially falsifiable by new data [29]. This new method benefited from
an increasing diversity of sources of characters to be considered,
thanks to the important technological advances accomplished in
the 1940s and 1950s in cytology, ecology, and especially in
genetics.
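The contrast between overall similarity and shared derived characters can be made concrete with a toy example. The sketch below is not a cladistic analysis (there is no tree search); it only counts, for each pair of taxa, either all matching states or the derived states they share. The three taxa, their scores, and the assumption that state 0 is primitive and state 1 derived are hypothetical.

```python
from itertools import combinations

# Hypothetical binary data: 0 = primitive (ancestral) state, 1 = derived state.
# Character polarity is assumed to be known in advance.
characters = {
    "taxon_X": [0, 0, 0, 0, 0, 0],
    "taxon_Y": [1, 1, 0, 0, 0, 0],
    "taxon_Z": [1, 1, 1, 1, 1, 1],
}


def overall_similarity(a, b):
    """Count every character in which the two taxa show the same state."""
    return sum(x == y for x, y in zip(a, b))


def shared_derived(a, b):
    """Count only the characters in which both taxa show the derived state."""
    return sum(x == 1 and y == 1 for x, y in zip(a, b))


def closest_pair(score):
    """Return the pair of taxa with the highest value of the given score."""
    pairs = combinations(sorted(characters), 2)
    return max(pairs, key=lambda p: score(characters[p[0]], characters[p[1]]))


print(closest_pair(overall_similarity))  # ('taxon_X', 'taxon_Y')
print(closest_pair(shared_derived))      # ('taxon_Y', 'taxon_Z')
```

In this example, taxon_X and taxon_Y look most alike overall, but only because they share primitive states; the derived states group taxon_Y with taxon_Z instead.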

Fig. 4 Illustration from ‘Monophyletischer Stammbaum der Organismen’ (Haeckel 1866): plants form one of
the three main branches of the monophyletic genealogical tree of organisms

The discovery of the double-helical structure of the DNA
molecule in 1953 by James Watson and Francis Crick, followed
by the possibility of selectively amplifying targeted fragments of
the genome (the Polymerase Chain Reaction, PCR, was described
by Kary Mullis and colleagues in 1986 [30]), dramatically
changed biology. In particular, the introduction of DNA
sequence data has been era-splitting for plant taxonomy, offering
access to numerous characters and statistical approaches. Thus, at
the turn of the twenty-first century, the use of molecular data and
new tree-building algorithms—with probabilistic approaches—led
the Angiosperm Phylogeny Group (APG) to better circumscribe all
orders and families of flowering plants [31–34]. Similarly, the
Pteridophyte Phylogeny Group reached a consensus classification for
free-sporing vascular plants (ferns and lycophytes) down to the genus
level [35]. Such collaborative initiatives have greatly improved
our understanding of plant classification based on
evolutionary relationships. Many long-standing views of deep-level
relationships were drastically modified at the ordinal level, and to
a lesser extent at the familial level in flowering plants. One of the
most striking changes is the abandonment of the long-recognized
monocot-dicot split, since monocots—class Liliopsida—were
found to be derived from within a basal grade of families that
were traditionally considered as dicots—class Magnoliopsida.
Another outstanding finding resulting from analyses of molecular
data has been that horsetails and ferns together are the closest
relatives to seed plants, necessitating the abandonment of the pre-
vailing view that ferns and horsetails represent paraphyletic succes-
sive grades of increasing complexity in early vascular plant
evolution, which eventually led to the more complex seed plants,
and ultimately to angiosperms [36]. Thus, the more or less intuitive
classifications proposed since the beginning of the twentieth cen-
tury [37–41] have progressively been less used, as a consequence of
the modifications brought to the classification by molecular
results [42].
Taxonomy took advantage of molecular data not only for
improving plant classification or species delineation, but also for
species-level identification with the development of the DNA bar-
coding initiative since the early 2000s. DNA barcoding is based on
the premise that a short standardized DNA sequence can allow
distinguishing individuals of different species because genetic vari-
ation between species is expected to exceed that within species. It
was first promoted by Paul Hebert for animals [43] and later
supported by international alliances of research organizations like
the Consortium for Barcoding of Life (CBOL; http://barcoding.si.edu),
which includes a Plant working group, or the China Plant
Barcoding of Life Group.
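The premise behind DNA barcoding, namely that divergence between species should exceed variation within species, can be illustrated with a minimal sketch. The sequences and species labels below are invented for illustration and are not real barcode data; the uncorrected p-distance is only one of several possible distance measures and is chosen here for simplicity.

```python
from itertools import combinations

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def barcoding_gap(samples):
    """samples: list of (species_label, aligned_sequence) pairs.
    Returns (largest within-species distance, smallest between-species distance)."""
    within, between = [], []
    for (sp1, s1), (sp2, s2) in combinations(samples, 2):
        (within if sp1 == sp2 else between).append(p_distance(s1, s2))
    return max(within), min(between)

# Hypothetical toy alignment (10 bp, invented), two individuals per species.
toy = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACCTGC"),
    ("species_B", "ACGAACCTGC"),
]

max_within, min_between = barcoding_gap(toy)
# A 'barcoding gap' exists when every between-species distance
# exceeds every within-species distance.
has_gap = min_between > max_within
print(max_within, min_between, has_gap)
```

In this toy case the two species are separated by a clear gap; with real data, recently diverged species may show no such gap, which is one of the limitations discussed later in this chapter.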
The long history of plant taxonomic research and its numerous
contributors, both for theoretical concepts and the practical accu-
mulation of knowledge, allowed the development of an indepen-
dent, complex, and sound hypothesis-driven scientific discipline
that explores, describes, documents the distribution of, and classi-
fies taxa. It is clearly not restricted to, e.g., identifying specimens
and establishing species lists, but it nevertheless also provides basic
knowledge that is required to address a wide range of research
questions and serve stakeholders in government agencies and international
biodiversity organizations (for management of agricultural
pests, development of new pharmaceutical compounds, control of
trade in endangered species, management of natural resources, etc.;
[29, 44–48]). However, taxonomy is faced with the enormous
existing plant diversity, and one still unanswered question resides
in the extent of plant diversity: how many species are there on
Earth?

4 Plant Taxonomy Today: Current Challenges, Methods, and Perspectives

4.1 How Many Plant Species Are There?

Linnaeus’ Species Plantarum, published in 1753, was one of the
first key attempts to document the diversity of plants on a global
scale [16]. In this work, Linnaeus recognized more than 6000
species but erroneously concluded that ‘the number of plants in
the whole world is much less than commonly believed, I ascertained
by fairly safe calculation [...] it hardly reaches 10,000’ [16]. Later
on, in 1824, the Swiss A.P. de Candolle, in his Prodromus Systematis
Naturalis Regni Vegetabilis [49], aimed to produce a flora of the
world: he included 58,000 species in seven volumes. Today, we
know that the magnitude of plant diversity is much larger, although
we are uncertain of the exact number of plant species.
There are two questions in estimating the total number of plant
species: the first one is how many species have already been described;
the second one is how many more species are presently unknown to
science.
Our uncertainty about the number of described species is
mostly due to the fact that taxonomists sometimes inadvertently gave
different names to the same species, especially in the past, when
communication between distant scientists was poor. This led to
the existence of multiple names for a single biological entity, a
phenomenon called synonymy. As a consequence, we know that
more than 1,064,908 vascular plant names were published, as
evidenced by the International Plant Names Index (IPNI)
[50, 51], but they would actually represent only 223,000 to
422,000 accepted species—depending on the method of calcula-
tion ([46, 52] and references therein, [53, 54]), with the most
recent estimates of 383,671 [51] and 351,176 according to
The Leipzig Catalogue of Vascular Plants (LCVP) v.1.0.2 by
Freiberg et al. (unpublished). In addition, the disagreement on a
single species concept (see Note 1) among plant taxonomists means
that species counts can easily differ by an order of magnitude or
more when the same data are examined by different botanists
[55]. This leads to a taxonomic inflation, i.e., an increased number
of species in a given group that is not due to an actual discovery of
new species [56–58]. In practice, this can occur when, e.g., differ-
ent botanists do not recognize the same number of species in a
given taxonomic group—the ‘splitters’ vs. the ‘lumpers’—or when
one botanist describes subspecies while another one elevates them
to the rank of species.
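The effect of synonymy and of differing taxonomic opinions on species counts can be made concrete with a small sketch. All names below are invented placeholders (they are not real taxa), and the mapping of each published name to an accepted name is a hypothetical data structure, not the format of any actual database such as IPNI.

```python
# Hypothetical name records: published name -> accepted name.
# Every name here is an invented placeholder, not a real taxon.
published_names = {
    "Planta alba": "Planta alba",                  # accepted name
    "Planta candida": "Planta alba",               # synonym of P. alba
    "Planta nivea": "Planta alba",                 # synonym of P. alba
    "Planta rubra": "Planta rubra",                # accepted name
    "Planta rubra subsp. minor": "Planta rubra",   # kept as a subspecies ('lumper')
}

def count_accepted(names: dict) -> int:
    """Number of distinct accepted species behind a set of published names."""
    return len(set(names.values()))

n_published = len(published_names)
n_accepted = count_accepted(published_names)

# A 'splitter' raises the subspecies to species rank: no new plant has been
# discovered, yet the species count increases (taxonomic inflation).
splitter_view = dict(published_names)
splitter_view["Planta rubra subsp. minor"] = "Planta minor"
n_accepted_splitter = count_accepted(splitter_view)

print(n_published, n_accepted, n_accepted_splitter)
```

The same five published names thus correspond to two or three accepted species depending on the taxonomic treatment, which mirrors, at a very small scale, the gap between published names and accepted species discussed above.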
The estimation of the total number of plant species on Earth is
also obviously hampered by our uncertainty about the extent of the
unknown plant diversity: how many more species are there to
discover? The exploration of plant diversity allows the description
of ca. 2000 new plant species every year [46, 47, 59], although some
of these may turn out to be synonyms once thorough monographic
revisions are carried out. Based on a model of the rates of plant
species description, Joppa et al. [60] estimated that there should
be an increase of 10–20% in the current number of flowering plant
species. This means that, based on the estimate of 352,000
flowering plant species [46], they predicted an actual diversity
of between 390,000 and 420,000 species for this group. Meanwhile,
Mora et al. [61] used higher taxonomy data, i.e., they extrapolated
the global number of plant species based on the strong negative
correlation between the taxonomic rank and the number of higher
taxa—which is better known than the total number of species. As a
result, focusing on land plants, they suggested an expected increase
of 38% in the number of species, from 215,000 in Catalogue of Life
[62] to 298,000 predicted species.
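The arithmetic behind these projections can be checked directly. The short sketch below uses only the figures quoted above; the function and variable names are illustrative choices.

```python
def projected_count(baseline: int, increase_pct: int) -> int:
    """Species count after a given percentage increase (integer arithmetic)."""
    return baseline * (100 + increase_pct) // 100

# Joppa et al. [60]: a 10-20% increase on 352,000 flowering plant species [46].
joppa_low = projected_count(352_000, 10)
joppa_high = projected_count(352_000, 20)

# Mora et al. [61]: increase implied by going from 215,000 to 298,000 species.
mora_increase_pct = round((298_000 - 215_000) / 215_000 * 100, 1)

print(joppa_low, joppa_high, mora_increase_pct)
```

The computed bounds (387,200 and 422,400) are consistent with the rounded range of 390,000 to 420,000 species given in the text, and the increase implied for land plants (about 38.6%) is close to the 38% quoted, which suggests that the published figures are rounded.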
These numbers make clear that our knowledge of plant diver-
sity is still very incomplete and that even estimates of its magnitude
remain highly controversial and speculative, highlighting the need
for more taxonomic studies.

4.2 Current Threats on Plant Diversity, the Taxonomic Impediment, and Some Initiatives to Overcome It

At the Sixth Conference of the Parties to the Convention on
Biological Diversity (CBD) held in 2002, more than 180 countries
adopted the Global Strategy for Plant Conservation (GSPC). It
included 16 specific targets that were to be achieved by 2010,
with the goal to halt the loss of plant diversity [46]. The Strategy
was updated in 2010 (at the Tenth meeting of the Conference of
the Parties) and it is now implemented within the broader frame-
work of the Strategic Plan for Biodiversity 2011–2020. The first
and most fundamental target of the Strategy was initially to com-
plete a ‘widely accessible working list of known plant species, as a
step towards a complete world flora’ [46, 59]. After the completion
of this list in 2010 [53], Target 1 was slightly modified to developing
‘an online flora of all known plants’ (http://www.cbd.int/gspc/targets.shtml;
http://www.worldfloraonline.org/). This target
aims to provide baseline taxonomic information, i.e., a list of the
accepted names for all known plant species, linked to their syno-
nyms but also to biological information such as geographic distri-
bution and basic identification tools. Since species are basic units of
analysis in several areas of biogeography, ecology, and
macroevolution and also the currency for global biodiversity assessments
[63], the lack of such taxonomic information is a critical
bottleneck for research, conservation, and sustainable use of plant
diversity [46], and was called the ‘taxonomic impediment’ at the
Second Conference of the Parties to the CBD (decision II/8). This
is especially critical at a time when biodiversity faces its sixth extinc-
tion crisis: most newly described species occur in hotspots of diver-
sity, often in tropical dense forests, where protected areas are scarce,
the level of habitat destruction (due to anthropic activities) is high,
and the impact of climate change is strong. Newly described species
are also likely to be characterized by locally low abundance and
small geographic ranges, enhancing their risk of extinction
[64]. Therefore, botanists must engage in a race to describe and
name species before they go extinct. This is especially true since
plants still lag far behind many animal groups in contributing to
global conservation planning, despite their essential role in structuring
most ecosystems [65]. Beyond this major conservation concern,
taxonomic discovery has many concrete, beneficial applications, such
as the identification of new wild species adaptable for agriculture,
timber, or fibers; new genes for enhancement of crop productivity; and
new classes of pharmaceuticals. Also, basic taxonomic knowledge is a
prerequisite
to monitor and anticipate the spread of invasive plants, and to
better understand ecosystem services [45, 47].
Several factors limit the efficiency of botanists in documenting
plant diversity. However, recent improvements and future optimis-
tic perspectives must also be underlined, and numerous contribu-
tions have been made to imagine and propose what the ‘Taxonomy
for the twenty-first century’ should and could be (see, e.g., [29, 66],
and the whole Theme Issue of the Philosophical Transactions of the
Royal Society of London, Series B that they coordinated; see also [67–
69]), and what the roles, challenges, and opportunities should be for
the ‘Botanists of the twenty-first century’
(https://unesdoc.unesco.org/ark:/48223/pf0000243791.locale=fr).
First, one limiting factor is the general lack of funding and, in
particular, the lack of resources devoted to the basic field activity of
collecting new material [55, 66, 67, 70]. Field explorations are also
made difficult by practical limitations such as ease of access to
remote areas or safety concerns in some parts of the world that
may be politically unstable [44, 45, 59]. However, we are currently
experiencing a renewed age of exploration and discovery, supported by
several national or international initiatives. This is particularly true
in the United States, where the ‘Planetary Biodiversity Inventories:
Mission to an (Almost) Unknown Planet’ program was launched
in 2003, aiming to complete the world species inventory for some
selected taxa, with individual project awards of ca. US$3 million
over a 5-year duration
(http://nsf.gov/pubs/2006/nsf06500/nsf06500.htm) [55, 71]. Other
leading initiatives like ‘Our Planet
Reviewed’ (http://www.laplaneterevisitee.org/en) contribute substantially
to inventories in hotspots. In Europe, several major museums
and botanical gardens established the Consortium of European
Taxonomic Facilities (CETAF) in 1996, which in turn created the
European Distributed Institute of Taxonomy (EDIT) in 2006, for
a 5-year period, under the European Union Sixth Framework
Programme. This worldwide network of excellence brought
together 29 leading European, but also North American and
Russian, institutions with the goal to increase both the scientific
basis and capacity for biodiversity conservation. Developing
countries also participated in this international effort by developing
similar national or multinational programs, e.g., in Brazil, Mexico,
and Africa [72]. On a global scale, the Global Taxonomic Initiative
(GTI) was launched in 1998 by the Conference of the Parties to the
CBD, and was later related to the GSPC, in order to remove or
reduce the ‘taxonomic impediment.’ In addition to institutional
breakthroughs, modern means of travel have facilitated access to
remote places where many species occur. As a result, although today
botanical expeditions could probably not be as prolific as those
reported during the great naturalists’ explorations of the eighteenth
and nineteenth centuries in terms of new species descriptions (e.g.,
in 1770, Sir Joseph Banks collected specimens representing as many
as 110 new genera and 1300 new species in Australia; White 1772
in [59]), important discoveries occurred in the recent past and
provide evidence for the vitality of contemporary botanists: for
instance, the Malagasy endemic Takhtajania perrieri (Capuron)
Baranova & J.-F.Leroy (Winteraceae) was first collected in 1909
and thought to have gone extinct, but was rediscovered in 1994
[73], i.e., almost 90 years after its first collection. Other examples
suggest that some showy, sometimes abundant plants still remain to
be described, even in geographical areas that are supposed to be
well prospected: a new genus and species of conifer, Wollemia
nobilis W.G.Jones, K.D.Hill & J.M.Allen, was observed in the
1990s only ca. 150 km from Sydney (Australia), and was shown
to belong to a well-known family of charismatic trees (Araucaria-
ceae), including only two other genera [74]. In 2007, Thulin and
collaborators reported the discovery of a conspicuous and domi-
nating tree in the Somali National Regional State (Ogaden) in
eastern Ethiopia [75, 76]. This tree, Acacia fumosa Thulin (Faba-
ceae), covers an area as large as Crete but was hitherto unknown to
science. The location of this species in an African war zone and the
inaccessibility of the area probably explain why it had never been
collected and had remained undescribed until then. To cite a last example of
recent striking botanical discovery, we can mention the description
of a new palm genus and species from Madagascar, Tahina spect-
abilis J.Dransf. & Rakotoarin. [77]. The trees grow to 18 m high
and leaves reach 5 m in diameter, making them the most massive
palms ever found in Madagascar. However, the small census size
(fewer than a hundred individuals), limited habitat, and rare reproduction
events lead to serious conservation concerns for the
species.
A second crucial issue for enhancing our knowledge of plant
diversity is the lack of taxonomic expertise. This is at least partly due
to the lack of credit given to works of descriptive taxonomy (e.g.,
species lists, floras, or monographs) compared to peer-reviewed
publications in high-impact journals [46, 70, 78, 79]. The global
number of species described over time has increased over the past
250 years [60, 80], but this clearly remains insufficient to counteract
the increasing rate of species extinctions, and many species
are at risk of disappearing before being described. Although taxo-
nomists have most likely increased the efficiency of their efforts
since the mid-1700s, the involvement of more people
in the tasks of exploring and describing biodiversity is needed:
in the United States, the NSF’s Partnerships for Enhancing Exper-
tise in Taxonomy (PEET) program allowed the training of new
generations of taxonomists since 1995 [78, 81], and enjoyed much
success. In addition, in some regions (e.g., in Costa Rica or Papua
New Guinea), local people called ‘parataxonomists’ contribute to
specimen collection and species recognition based on rough morphological
criteria, in collaboration with taxonomic experts
[45]. This is also in line with the growing body of ‘citizen scien-
tists,’ who are often amateurs and offer their help to accumulate
data, e.g., on the presence/absence of a given species in a given
region, or the distribution of a morphological character across
space. Because they are usually organized as large networks, they
represent an immense and increasingly important workforce and
make possible some tasks that would otherwise not have been
possible because of, e.g., limited time and funding [80]. However,
sound knowledge and experience of professional taxonomists
remain critical [46, 66, 67, 70, 82] and capacity building in tropical
countries—where the greatest diversity of life is concentrated—
should therefore be a priority [59].
A third identified impediment to our taxonomic knowledge
was—and still is, to a certain extent—the problem of communica-
tion and coordination, of tracing the accumulated publication
records, of deciphering the complex synonymy, and of chasing the
scattered (and sometimes in poor condition) material, especially
type specimens that are housed in herbaria around the world
[59, 66, 67]. Worldwide natural history collections contain
390 million plant specimens [83] and their importance has recently
been made even more prominent by the finding that they house
many new species that remain to be described [84]: researchers
analyzing the time lapse between flowering plant sample collection
and new species recognition estimated that only 16% were
described within 5 years of being collected for the first time, and
that nearly 25% of new species descriptions involved specimens of
more than 50 years old. The median time lag between the earliest
specimen collection and the publication of the new plant species
description was ca. 30 [84] or 32 years [85]. Such a lag time (also
called ‘shelf life’) is longer for herbarium specimens than for all
other taxonomic groups [85]. Natural History Collections thus act
as a reservoir of potential new species. Therefore, although one
limiting step of species discovery may be the capacity to undertake
field work (as suggested above), access and examination of existing
herbarium collections by experts are another bottleneck. This is
however now partly overcome by programs such as the European
SYNTHESYS (from 2004 to 2017, superseded by SYNTHESYS+ in
2019; https://www.synthesys.info/about-synthesys.html) that
provide funded researcher visits to specimens housed by diverse
institutions, or by increased international collaborations and a bet-
ter access to information and specimens, thanks to modern data-
sharing technologies [46–48, 61, 69]. As an example, a major step
was accomplished thanks to funding from the Andrew W. Mellon
Foundation and subsequent institutional commitments to database
and image name-bearing type specimens—on which the species
original descriptions are based—and deposit these data in the cen-
tral repository JSTOR Plant Science [82]. At an even larger scale,
several major herbaria, including the Paris Herbarium, which is
one of the biggest/richest in the world with ca. eight million
specimens [86], achieved large-scale digitization of all their vascular
plant specimens, in order to make them freely available as high-quality
photographs on the web, through both the herbarium database
(https://science.mnhn.fr/) and the platform e-ReColNat
(https://www.recolnat.org/fr/), which gathers images for all natural
history collections from France (and see Note 2). In the United
States, the National Science Foundation (NSF), through its
Advancing Digitization of Biological Collections (ADBC) pro-
gram, developed a strategic plan for a 10-year coordinated effort
to digitize and mobilize images and data associated with all
biological research collections of the country in a freely available
online platform. This will ensure increased accessibility of all valu-
able information and is being made possible by the establishment of
a central National Resource for Digitization of Biological Collec-
tions (called iDigBio for ‘Integrated Digitized Biocollections’;
https://www.idigbio.org/).
For a better diffusion of taxonomic revisions, Godfray [66]
claimed the need for a ‘unitary web-based and modernized taxon-
omy’ (see also [87]). Without opting for such a drastic evolution, a
revision of the International Code of Nomenclature (ICN) has
nevertheless encouraged a shift toward electronic publication:
at the International Botanical Congress held in Melbourne
in July 2011, purely electronic descriptions were judged
valid for the publication of new species (Art. 29), as opposed to the
previous requirement to publish in a traditional, printed publication
[88]. But based on the following 8 years, it must be concluded that
the new applicable rule did not accelerate the rate of plant species
description or participation in biodiversity discovery as was hoped
[89]. Also, whereas the current taxonomic knowledge is mostly
made available in paper format as monographs, floras, and field
guides, many internet taxonomy initiatives exist and catalogue
species names, lists of museum specimens, and identification keys
and/or other biological information. These websites include, e.g.,
IPNI (www.ipni.org), The Plant List (www.theplantlist.org), GBIF
(www.gbif.org), Species 2000/ITIS Catalogue of Life
(www.catalogueoflife.org), Tree of Life (www.tolweb.org), and Encyclopedia
of Life (www.eol.org), to cite only a few (see [55, 66]).

4.3 Molecular Taxonomy and the Need for an Accelerated Pace of Species Discovery

In addition to increased efforts towards exploration in the field,
various initiatives to promote and develop taxonomic expertise,
generalization of collaborative work, and improved access to natural
history collections and literature, major advances in technology
also provide new opportunities to facilitate and accelerate the rate
of species discovery at a time of increasing need to monitor and
manage biodiversity. The goal of accelerating the pace of species
discovery was made especially clear by the promoters of the DNA
barcode initiative [90, 91], but more generally, the use of molecular
tools for taxonomic purposes emerged in the 1990s (or even in the
1970s if allozyme markers are considered) and has quickly become an
area of intense activity.
Today, most recognized species have been delineated and
described based on morphological evidence: in general, they have
been delimited based on one or more qualitative or quantitative
morphological characters that show no—or very little—overlap
with other species [92]. The initial enthusiasm for molecular tax-
onomy most probably came from the additional and complemen-
tary information that it provided. Also, molecular taxonomy
requires an expertise that is nowadays more broadly distributed
than that for thorough morphological investigations, it makes use
of tools that are not specific to a particular group of plants, and it
may appear to lend itself more readily to publication in peer-reviewed
journals than more traditional taxonomic studies. We synthesize,
here, several other characteristics—both strengths and limita-
tions—of molecular taxonomy that one should keep in mind
when initiating taxonomic studies using molecular tools.

4.3.1 Strengths and Limitations of Molecular Taxonomy

First, it must be noted that the resemblance criterion within a
species, on which the morphological approach to species delimitation
is based, has exceptions and can lead to erroneous conclusions.
Before the various reproductive systems of plants were well under-
stood, male and female individuals from a single—e.g., dioecious—
species were sometimes described as two distinct species based on
morphological investigations. For example, in the orchid genus
Catasetum Rich. ex Kunth plants are functionally dioecious (i.e.,
with female and male flowers situated on distinct individuals) and
can morphologically differ so much from each other that taxono-
mists of the nineteenth century assigned individuals of the same
species to different genera (Monachanthus Lindl. and Myanthus
Lindl.) [93]. Other species descriptions incorporated characters
that were in fact due to anther-smut disease caused by the fungus
Microbotryum violaceum (Pers.) G. Deml & Oberw.: anthers of
infected plants are filled with dark-violet fungal spores instead of
yellow pollen [94]. As a result, Silene cardiopetala Franch., for
example, was distinguished from Silene tatarinowii Regel by its
dark anthers but should likely be treated as the same species.
More generally, because the phenotype of a plant is influenced
both by its genotype and by its environment (and by the interaction
between the genotype and the environment, called phenotypic
plasticity), the observations of herbarium specimens collected in
the field may be somewhat misleading. Molecular taxonomy should
avoid this possible bias since it is based on neutral markers that are
in principle independent of environmental conditions. However,
the influence of the environment is mostly true for vegetative
characters and usually less problematic for reproductive characters.
In addition, the use of several morphological characters should
limit the problem since all traits are unlikely to be affected in the
same way [95].
Second, several studies showed that, in comparison to the
traditional morphological criterion for delimiting species, molecu-
lar tools sometimes allow the detection of additional, so-called
‘cryptic’ species that could not be distinguished on morphological
grounds only. This may happen when species emerged in the recent
past, or because of morphological stasis or morphological convergence
[96]. The existence of such cryptic species was reported,
e.g., in temperate or tropical plants ([97, 98] and references
therein; for an animal example, see [99]).
Third, in addition to the primary goal of species delimitation,
the use of genetic tools may allow a better understanding of the
evolutionary processes at work within taxonomically complex groups,
where taxa are sometimes difficult—or even impossible—to delin-
eate. These groups are often characterized by uniparental repro-
duction, e.g., self-fertilization or apomixis, and reticulate
evolution, due to, e.g., hybridization and introgression, which
preclude the delineation of discrete and unambiguous taxonomic
entities. In such cases (e.g., in the genera Sorbus, Epipactis, and
Taraxacum; [100–102], respectively, cited in [103]), principles of
conservation biology suggest that the evolutionary processes that
generate and maintain diversity should themselves be preserved
because they are even more important than the presently observed
taxa [103, 104]. In this perspective, molecular tools can yield very
useful information, usually based on a population sampling.
Fourth, from a practical point of view, a key strength of molecular
taxonomy is that it can be performed on any life stage (even
some that bear no or only a few morphological characters, such as
seeds, seedlings, or fern gametophytes [105, 106]) and almost any
type of material, e.g., leaves, cambium [107–110], bark [111], dry
wood [112], and roots [113, 114]. Therefore, the use of molecular
characters for taxonomic purposes appears especially suitable for
organisms that require years before flowering and/or fully devel-
oping, or when access to some other key—e.g., reproductive—
characters is difficult.
The ubiquitous character of the DNA molecule in living beings
can also become a problem, and care should be taken to only isolate
DNA from the target material and exclude DNA of any other
animal, plant, or fungal organisms living around or in the plant
under study—e.g., parasitic insects, epiphyllous mosses, and endo-
phytic fungi.
Another practical limitation of molecular taxonomy is the cost,
as molecular lab facilities and often rather expensive consumables
are needed. This cost may be especially limiting in developing
countries [115, 116], although it is ever decreasing thanks to the
spread of molecular analyses, which are more and more commonly
employed, and to technological advances that allow cheaper and
less time-consuming analyses (see below).
An important parameter that is shared by ‘traditional’ and
molecular taxonomy studies is sampling strategy and sampling
effort. Taxonomy is based on a comparative approach that requires
the investigation of as many specimens/samples as possible in order
to capture the full extent of natural variation. Therefore, the quality of
taxonomic studies partly relies on a thorough sampling of speci-
mens/samples to be surveyed, and a biased sampling may cause
erroneous conclusions. As an example, Marsilea azorica Launert &
Paiva, which was thought to be a local endemic and critically
endangered species of the Azores archipelago, was recently shown
to be conspecific with an Australian native species that is widely
cultivated and invasive in Florida, Marsilea hirsuta R.Br. [117]:
because the spread of M. hirsuta out of Australia was not docu-
mented when the Marsilea specimens from the Azores were exam-
ined by Launert and Paiva in 1983 [118], the botanists did not
include the Australian taxa into their survey, and erroneously
described the species as new to science.
On more theoretical and conceptual grounds, some claim that,
in comparison with ‘traditional’—typically morphology-based—
taxonomy, the use of molecular tools may avoid bias due to the
subjectivity of a given taxonomist, who could have a priori ideas on
species delimitation. However, the acquisition of a molecular
dataset also implies some more or less subjective choices, e.g., on
the distinction of orthologs vs. paralogs, on defining character
homology when sequences of different lengths must be aligned to
form a square matrix, or on the statistical analysis to carry out after
the data are produced (see, e.g., [95, 119–121]); the latter choice, on
data analysis, is closely related to the adoption of a given species
concept (see Note 1). Also, because our current technological
capacities do not allow the routine inclusion of the whole genome
in taxonomic analyses, choices must be made on the genomic
compartment(s) to survey (nuclear, mitochondrial, or chloroplast),
the molecular technique(s) to use, and the precise, individual
marker(s) to consider (Chapter 2). The choice of a limited number
of markers is required, in practice, although multiple independent
loci might often be necessary to solve the possible disagreement
between gene trees and species trees, and to uncover the common
reticulate evolution—due to horizontal DNA transfer, hybridiza-
tion, and polyploidization events—and incomplete lineage sorting
in plants [55, 95, 121–123]. The extent of genome coverage by
molecular markers is partly dependent on the molecular technique
that is used, and there is often a trade-off between the possibility—
due to time and cost limitations—of surveying numerous markers
and the information content provided by each marker. For exam-
ple, it is usually achievable to include a large number of anonymous
markers based on length polymorphism—such as RAPD or AFLP
markers—but the number of DNA regions that could be
sequenced, representing highly informative data, is much more
limited with the traditional Sanger method. However, rapid
advances in Next-Generation Sequencing (NGS) technologies
have resulted in huge cost reduction and offer incredible new
opportunities for producing billions of base pairs of accurate
DNA sequence data in a few hours [124–126]. Studying whole
chloroplast genomes or multiple nuclear loci might therefore
become routine even in non-model species, and is already beginning
to revolutionize plant molecular taxonomy [127]. The main
bottleneck then probably becomes cleaning up and assembling the sequence
reads to generate usable data, and major improvements in bioinformatics
would be needed to deal with such huge amounts of data
[124, 125].
Another limitation of molecular taxonomy is the possible lack
of genetic divergence when sister-species have very recent origins
because they will share alleles due to recent ancestry and, if repro-
ductive isolation is not complete, to ongoing gene flow, i.e., hybri-
dization. This lack of genetic variation can nevertheless be
accompanied by some level of morphological differentiation, leading
to the exact mirror image of the situation of cryptic taxa (where one
observes genetic but no morphological distinction; see above).
Absent or extremely weak genetic divergence was observed,
e.g., in the young and species-rich neotropical genus Inga Mill.
(Fabaceae) [128], and striking examples of such morphological
diversification but weak genetic variation are also provided by cases
of adaptive radiations, where species rapidly adapt to different
environments—e.g., in the Hawaiian silverswords (Asteraceae)
[129], the Asian genus Rheum L. [130], or the widespread colum-
bine genus Aquilegia L. [131]; for more examples, see [132]. In
such cases of recent diversification, the delimitation of species will
usually be based on allele frequency changes rather than diagnostic
changes [133] and can benefit from recently developed coalescent-
based methods (e.g., [134, 135]). The time required for genetic
divergence to build up after speciation will depend on the mutation
and fixation rates—and the fixation rate depends on the number of
reproductively effective individuals. Because of different fixation
rates between (diploid) nuclear and (haploid) organelle genomes,
studies based on nuclear vs. organelle DNA markers may yield
contrasted results on species limits. Such contrasted results are
also made likely by the horizontal organelle DNA transfers that
occasionally occur, especially among closely related species.
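The effect of ploidy and inheritance on fixation time can be illustrated with a standard neutral approximation that is not taken from this chapter: in an idealized Wright-Fisher population, a new neutral mutation that eventually fixes takes, on average, about twice the number of gene copies in generations. The sketch below assumes hermaphroditic plants and strictly uniparental organelle inheritance; the function names and the population size are hypothetical.

```python
def gene_copies(n_individuals: int, genome: str) -> int:
    """Number of transmitted gene copies in the population (assumed model)."""
    if genome == "nuclear":
        # Diploid, biparentally inherited: two copies per individual.
        return 2 * n_individuals
    if genome == "organelle":
        # Effectively haploid, uniparentally inherited: one copy per individual.
        return n_individuals
    raise ValueError(f"unknown genome type: {genome}")


def mean_fixation_time(copies: int) -> float:
    """Approximate mean time to fixation (in generations) of a new neutral
    mutation, conditional on fixation: about 2 x the number of gene copies."""
    return 2.0 * copies


n = 1000  # hypothetical number of reproductively effective individuals
nuclear_time = mean_fixation_time(gene_copies(n, "nuclear"))
organelle_time = mean_fixation_time(gene_copies(n, "organelle"))
ratio = nuclear_time / organelle_time
print(f"nuclear: {nuclear_time:.0f} generations")
print(f"organelle: {organelle_time:.0f} generations")
print(f"nuclear/organelle ratio: {ratio:.1f}")
```

Under these assumptions, organelle markers are expected to sort into species-specific lineages faster than nuclear markers, which is one reason why the two genomes can suggest different species limits in recently diverged groups.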
Molecular markers can also suffer from homoplasy, i.e., markers
can show similar character states that, however, do not derive from
a common ancestor. In this case, they do not inform on the geneal-
ogy of taxa and, because they do not reflect a shared evolutionary
history, they may be misleading on evolutionary and, as a conse-
quence, on taxonomic relationships. This is especially problematic
for highly variable markers, e.g., microsatellites, and for DNA
sequences that are only composed of four types of monomer
(A, C, G, and T): as a result, a substitution at any one position
has a high probability of being a reversal or a convergence, i.e., of
being homoplasic [95, 121]. It is therefore critical to take this
caveat into account when analyzing and interpreting
molecular data.
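The consequence of having only four character states can be illustrated with the Jukes-Cantor (JC69) substitution model, which is an assumption introduced here and not a method described in this chapter. Under JC69, the observed proportion of differing sites p lags behind the true number of substitutions per site d, because repeated hits at the same position (reversals and convergences) are hidden. The function names are hypothetical.

```python
import math


def jc69_expected_p(d: float) -> float:
    """Expected proportion of differing sites between two sequences
    separated by d substitutions per site, under the JC69 model."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc69_distance(p: float) -> float:
    """Estimate substitutions per site from an observed proportion of
    differing sites p, correcting for hidden multiple hits."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


for d in (0.05, 0.5, 2.0):
    p = jc69_expected_p(d)
    # The gap between d and p is the share of change masked by homoplasy.
    print(f"true d = {d:.2f}  observed p = {p:.3f}  hidden = {d - p:.3f}")
```

The observed proportion never exceeds 0.75 under this model, however large d becomes, so highly diverged sequences carry progressively less usable genealogical signal.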
Another drawback of molecular taxonomy is that name-bearing
type specimens often do not permit DNA analyses because of
nonoptimal drying and storage conditions, resulting in DNA deterioration
([136]; the same limit also obviously applies to most plant
fossils, which do not contain DNA). Consequently, a comparison of
the supposedly new species with known species may not be possible
on a molecular basis, which prevents a rigorous comparative taxonomic
approach. As part of their ‘plea for DNA taxonomy,’ Tautz
et al. [133] proposed to identify neotypes for all known species in
cases of unavailable genetic information from the original types, so
that these neotypes could constitute new reference records for
further studies. However, this proposal received very limited support
(see, e.g., [119, 137]). In addition, recent progress has been
made in the extraction of DNA from herbarium specimens
[138, 139] and the genetic analysis of such material will very likely
benefit from NGS technologies [126]. But so far, given the usually
low-quantity and often degraded DNA that is extracted from
herbarium specimens, the most commonly employed molecular
techniques are microsatellite markers, because their short length
makes amplification more likely than that of longer DNA stretches
(see, e.g., [140]), and organelle DNA sequencing, because the
multiple organelle genome copies per cell provide more abundant
template DNA for PCR than nuclear loci. Most of the published studies report the
successful exploitation of specimens up to ca. 100–150 years old,
with DNA sequences produced usually ca. 500 bp long [141–145].
But the kind of material (e.g., the presence of PCR-inhibiting
substances [146]) and the speed and method of drying appear
more important than the actual age of the sample [141–143,
147], and some botanists managed to obtain DNA sequences
from even older specimens and/or longer DNA regions, e.g.,
Ames & Spooner sequenced ca. 440-bp DNA fragments from
potato material from the early eighteenth century [148], and
Andreasen et al. [147] sequenced 800-bp DNA fragments from a
specimen collected in the late eighteenth century. The successful
use of aged seeds has also been reported [138, 149].

4.3.2 The Definitive Need for an Integrative Taxonomy

The use of molecular data in plant taxonomy has been era-splitting
and highly successful in many instances, but we also highlighted
some limits and cautions to consider when adopting this approach.
Most importantly, a species description solely based on molecular
evidence would obviously seem critically disconnected from the
natural history of the species, i.e., its life-history traits, ecological
requirements, co-occurring species, biotic interactions, etc. As
such, molecular tools may indeed accelerate the rate of species
discovery but would actually be a poor contribution to our knowl-
edge and understanding of plant diversity and evolution. Such a use
of molecular taxonomy could even end up with the exact opposite
of the expected outcome if funders only aim to basically delineate
and count species with no other ambition; indeed, gathering fur-
ther biological information is an essential prerequisite to make a
general use of the taxonomic knowledge, efficiently preserve the
existing diversity, and allow its continued evolution. Botanists have
long realized this and promoted the use of multiple independent
sources of data, and/or the use of several analytical methods on the
same dataset to corroborate the delimitation and provide a thor-
ough and detailed description of species. As early as 1961, Simpson
(p. 71) wrote ‘It is an axiom of modern taxonomy that the variety of
data should be pushed as far as possible to the limits of practicabil-
ity’ [6]. In agreement, Alves and Machado [150] wrote that ‘Tax-
onomy should be based on all available evidence.’ This awareness
gave rise to the advent of what is now called ‘integrative taxonomy,’
where taxonomic hypotheses are cross-validated by several lines of
evidence ([29, 121, 150–155], and many others). As sources of
relevant characters, many fields of biology might contribute to
taxonomic studies: they include morpho-anatomy which takes
advantage of new techniques such as Scanning Electron Micros-
copy (SEM), remotely operable digital microscopy, computer-
assisted tomography, confocal laser microscopy, and automatic
image processing for morphometry [155, 156], cytometry and
cytogenetics (see Chapters 17–19) but also palynology, physiology,
chemistry (production of secondary compounds), breeding rela-
tionships, and ecological niche modelling—we are not aware of
currently available examples in plant taxonomy but for animals, see
[157–159]. Other sources of information will also most probably
be more widely used in the future, such as transcriptomics
[160, 161], metabolomics [162], proteomics [95], and even phe-
nomics—Munck et al. [163] showed, in barley, that the fingerprint
of a near-infrared spectrum from an individual represents a coarse-
grained overview of the whole physiochemical composition of its
phenome, with the phenomic profile resulting from the combined
effects of the entire genome, proteome, and metabolome
[164]. The diversity of approaches involved in modern plant tax-
onomy is consistent with the observations by Joppa et al. [80] that
(a) today’s biologists who describe species are not only contribut-
ing to the field of taxonomy, but also active in other fields/disci-
plines, and (b) most new species are nowadays described by several
authors whereas descriptions by a single author were common
around 1900.
It is also clear that end users of taxonomy such as conservation
planners need an operational, character-based, and cheap way to
discriminate species [91, 115, 150]. This could tend to diminish
the perceived potential of molecular taxonomy, but in this perspec-
tive and in spite of the shortcomings that we have just underlined,
molecular taxonomy obviously has a great role to play. DNA can help
to delimit taxa, and to group specimens among which to find
morphological—or other types of—affinities in further investiga-
tions (see, e.g., [165, 166]). Such clusters of individuals, character-
ized by close genetic relationships, are sometimes referred to as
‘molecular operational taxonomic units’ (MOTU) [167], before
their genuine taxonomic statuses are evaluated by gathering addi-
tional data. Markmann and Tautz [168] called this approach, based
on an initial molecular assessment, the ‘reverse taxonomy’ (see also
[151, 152]). The fruitful link between ‘traditional’ and molecular
taxonomy should be accompanied by an analogous link between
herbarium vouchers, plant samples for DNA extraction, and DNA
extracts [169, 170]. The curation of such collections and the
maintenance of a dynamic link between them will provide a long-
lasting and reliable framework for taxonomic investigations, and
will permit the critical re-evaluation of taxa delimitations at any
time, based on both herbarium and DNA material.
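The grouping of specimens into MOTUs can be sketched as a simple clustering of aligned sequences by pairwise distance. This is only a minimal illustration under stated assumptions: the uncorrected p-distance, the single-linkage rule, the 0.10 threshold, and the voucher names and sequences are all hypothetical choices, not the procedure of any study cited above.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences,
    skipping positions where either sequence has a gap or ambiguity."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def cluster_motus(seqs: dict[str, str], threshold: float) -> list[set[str]]:
    """Single-linkage clustering: two specimens whose distance is at or
    below the threshold end up in the same candidate MOTU."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical aligned sequences, each linked to a herbarium voucher.
specimens = {
    "voucher_A1": "ACGTACGTACGTACGTACGT",
    "voucher_A2": "ACGTACGTACGTACGTACGA",
    "voucher_B1": "ACGTTCGTACCTACGAACGT",
    "voucher_B2": "ACGTTCGTACCTACGAACGA",
}
motus = cluster_motus(specimens, threshold=0.10)
print(motus)
```

Keying each sequence by its voucher identifier mirrors the link between DNA extracts and herbarium specimens advocated above: the resulting clusters are only hypotheses, to be re-examined against morphology and other data from the same vouchers.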
Duminil and Di Michele [95] reviewed studies comparing
species delimitations based on morphological traits and molecular
markers. They found both cases of congruence and incongruence
between the two types of data. As suggested above, cases of incon-
gruence were either due to stronger molecular discrimination
between species—suggesting the existence of cryptic species—or,
on the contrary, to stronger morphological differentiation, due to
processes like local adaptation, phenotypic plasticity, or neutral
morphological polymorphism (e.g., [171]). Conflicting results
often trigger more in-depth studies using as many loci as possible
and, if possible, loci that originate from different genomes, with the
goal to better understand the patterns and processes of plant evo-
lution and diversification (e.g. [172–178]).
Taxonomic circumscriptions are scientific hypotheses, which
are ideally validated by evidence from multiple sources, and molec-
ular methods offer the opportunity to yield high-potential infor-
mation. However, there is not a single, best method to be used in all
plant groups and the molecular taxonomist will have to face multi-
ple questions: before anything, it is necessary to identify the optimal
sampling strategy, the most suited genomic compartment(s) to
examine, the right technique(s) to use, and the adequate method
(s) of statistical analysis to extract the relevant information about
species limits and relationships [120, 121]. In addition to the
complementarity of ‘traditional’ and genetic approaches, molecular
taxonomy itself will often require gathering and comparing patterns
based on several types of data, e.g., nuclear vs. cytoplasmic markers
or markers with different rates of evolution. The goal of this
book is to present the possible alternatives of molecular taxonomy,
their practical implications in the lab, current analytical tools that
are available, and theoretical consequences for data interpretation.
The empirical and analytical approaches used for a molecular taxo-
nomic study, together with the conclusions drawn from the data,
will also obviously depend on the species concept that is adopted
and on the choice of operational criteria to delimit species (see
Note 1).

5 Notes

1. Species concepts and contemporary criteria for species delimitation.
Species delimitation obviously depends on what a species is
and, although the species is often seen as the fundamental unit
of evolution, its definition has long remained highly debated.
The existence of species itself is somewhat controversial,
especially in plants where asexuality, hybridization, and poly-
ploidy may render the definition and delimitation of species
complex and fuzzy. Some argue that species are ‘arbitrary con-
structs of the human mind’ while others claim that they are
objective, discrete entities. Reviewing the available data (both
in plants and animals), Rieseberg et al. [179] showed that
discrete phenotypic clusters exist in most genera (>80%),
although the correspondence of taxonomic species to these
clusters is poor (<60% and not different between plants and
animals). In addition, crossability experiments indicate that as
many as 70% of plant taxonomic species and 75% of plant
phenotypic clusters correspond to reproductively independent
lineages.
The proliferation of alternative species concepts really
started in the 1970s. It gave rise to several decades of debate
and taxonomic instability because many concepts were incompatible
in that they led to the recognition of different species
boundaries and different numbers of species. This was called the
‘species problem.’
Morphological approaches have dominated species delimi-
tation for centuries, starting with the purely typological (i.e.,
essentialist) pre-Darwinian view. But most contemporary biol-
ogists are familiar with the idea that species are groups of
actually or potentially interbreeding natural populations,
which are reproductively isolated from other such groups (the
‘Biological Species Concept’), whether or not they differ in
phenotypic characters that are readily apparent.
However, another, unified species concept has now
emerged. It originated as early as the beginning of the twenti-
eth century (with, e.g., E. B. Poulton), became well established
during the period of the Modern Evolutionary Synthesis (with
the great leaders T. Dobzhansky, E. Mayr, G. G. Simpson, and
S. Wright), and was recently largely promoted by de Queiroz
[180, 181]. This unified concept reconciles previous, at least
partially incompatible species concepts. It considers species as
separately evolving metapopulation lineages and is called the
‘General (metapopulation) Lineage Concept.’ Other proper-
ties of species, which used to be treated as necessary (and
sufficient) properties to recognize a species as such (e.g., repro-
ductive isolation, monophyly; see Table 1), are now only seen as
different lines of evidence, or ‘operational criteria,’ relevant to
assessing lineage separation. The unified species concept is
actually not a new concept, but simply the clear separation of
the theoretical concept from the operational criteria that are
used for the empirical application of this concept.
Operational criteria can be either tree-based or non-tree
based (e.g., direct tests of crossability, indirect estimates of
gene flow, statistical clustering algorithms) [198], and new
methods are still being developed (e.g., analyzing multilocus
genetic data in a coalescent framework). Criteria differ in their
suitability to some particular species (e.g., sexual vs. asexual),
their requirements in terms of type of data and sampling, and
their strengths and limitations. It must also be noted that most
of them will require researchers to make some qualitative judgments
at some point.

Table 1
Some alternative contemporary species concepts/criteria

| Name of the species concept/criterion | Definition of the species | Major contributor(s) | Ref. |
|---|---|---|---|
| Interbreeding species concept [forms the basis for the general (metapopulation) lineage concept] | A group of potentially interbreeding populations | Wright 1940, Mayr 1942, Dobzhansky 1950 | [182–184] |
| Isolation species concept^a [often called the biological species concept] | A group of potentially interbreeding populations that is reproductively isolated from other such groups | Poulton 1904, Mayr 1942, Dobzhansky 1970 | [184–186] |
| Phenetic species concept | A group that forms a phenetic cluster (quantitative difference) | Sokal and Crovello 1970 | [187] |
| Ecological species concept | A group that shares the same niche or adaptive zone | Van Valen 1976 | [188] |
| Evolutionary species concept^a [corresponds closely to the general (metapopulation) lineage concept] | A lineage (i.e., an ancestral-descendant sequence of populations) evolving separately from others and with its own evolutionary role and tendencies | Simpson 1951, Wiley 1978 | [189, 190] |
| Phylogenetic species concept (character diagnosability version) | An irreducible (basal) cluster of organisms, diagnosably distinct from other such clusters, and within which there is a parental pattern of ancestry and descent (fixed qualitative character). The diagnostic character can be from any trait (e.g., morphological or molecular) and of any significance (e.g., a single base pair) | Cracraft 1989 | [191] |
| Phylogenetic species concept (reciprocal monophyly version) | A group that shows monophyly (consisting of an ancestor and all of its descendants, and commonly inferred from the possession of shared derived character states) | Rosen 1979, Donoghue 1985, Mishler 1985 | [192–194] |
| Genealogical species concept | A group that shows monophyly for all (or a consensus of) gene genealogies in the genome | Baum and Shaw 1995 | [195] |
| Genotypic species concept | A group recognizable on the basis of multiple, unlinked, inherited genetic markers. A pair of such genotypic clusters is recognizable if the frequency distribution of genotypes is bimodal or multimodal, and strong heterozygote deficits and linkage disequilibria are evident between the clusters | Mallet 1995 | [196] |
| Cohesion species concept^a | A group that is characterized by cohesion mechanisms, including reproductive isolation, recognition mechanisms, and ecological selection, as well as by genealogical distinctiveness | Templeton 1998 | [197] |

^a Combined species concepts, i.e., concepts using a combination of morphological, ecological, phylogenetic, and reproductive criteria
The commonly observed incompatibility between various
criteria stems from the fact that various properties actually arise
at different stages in the process of speciation: as lineages
diverge, they become distinguishable in terms of quantitative
traits, diagnosable in terms of fixed character states, reproduc-
tively incompatible, they evolve distinct ecologies, they pass
through polyphyletic, paraphyletic, and monophyletic stages,
etc. These changes commonly do not occur at the same time,
and they are not even necessarily expected to occur in a specific
order. De Queiroz [180] describes this transition period, from
one ancestral species to two divergent species, as a ‘grey zone,’
where alternative species definitions can come into conflict. But
as lineages diverge, the number of species criteria satisfied will
increase and allow a highly corroborated hypothesis of lineage
separation and species delimitation.
2. A new world-class research infrastructure, the Distributed
System of Scientific Collections (DiSSCo), is now being built in
Europe. It will work toward the digital unification of all natural
science assets under common curation, access, policies, and
practices, and aims to ensure that the data are easily Findable,
Accessible, Interoperable, and Reusable (FAIR).
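As an illustration of the tree-based operational criteria discussed in Note 1, the following minimal sketch tests whether the samples assigned to a putative species form a monophyletic group on a rooted gene tree. The nested-tuple tree encoding, the function names, and the sample labels are hypothetical, and a real analysis would also need to account for tree uncertainty and for discordance among loci.

```python
def leaf_set(node) -> frozenset:
    """Return the set of tip labels below a node. A tip is a string;
    an internal node is a tuple of child nodes."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def clades(node):
    """Yield the tip set of every clade in a rooted tree."""
    yield leaf_set(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, tips) -> bool:
    """True if some clade of the tree contains exactly the given tips."""
    return frozenset(tips) in set(clades(tree))


# Hypothetical rooted gene tree with samples from two putative species.
gene_tree = ((("sp1_a", "sp1_b"), "sp2_a"), ("sp2_b", "sp2_c"))
sp1 = {"sp1_a", "sp1_b"}
sp2 = {"sp2_a", "sp2_b", "sp2_c"}

sp1_ok = is_monophyletic(gene_tree, sp1)
sp2_ok = is_monophyletic(gene_tree, sp2)
print("sp1 monophyletic:", sp1_ok)
print("sp2 monophyletic:", sp2_ok)
print("reciprocally monophyletic:", sp1_ok and sp2_ok)
```

In this made-up tree, the first group forms a clade while the second does not, which corresponds to the paraphyletic stage that diverging lineages are expected to pass through before reaching reciprocal monophyly.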
