TAXONOMY, METHODS OF

R. I. Vane-Wright

The Natural History Museum, London

I. The Tasks of Taxonomy

II. Building Blocks: Individuals and Characters

III. Special and General Classifications

IV. Differing Philosophies and Methods of Taxonomy

V. General Procedures

VI. From System to Classification

VII. Current Practice: Variations on a Cladistic Theme

VIII. Conclusions

GLOSSARY

clade A complete ancestor–descendant lineage (cf. monophyletic group).

cladistics Production of taxonomic system based on hierarchical patterns of homologous characters, expressed as cladograms.

cladogram A dendrogram expressing estimated cladistic relationships among taxa; a cladogram has no direct connotation of ancestry and the long axis does not connote time.

dendrogram A branching, nonreticulate diagram expressing nested hierarchical relationships or similarities (or both) between entities (e.g., taxa) such that the entities only appear at the tips of terminal branches.

diagnosis A set of attributes sufficient to define, characterize, or identify a given taxonomic group.

distance A measure of dissimilarity between two taxa based (normally) on the number of mismatches within a large set of characters or attributes compared for both taxa.

gap A large or relatively large difference in overall similarity between two taxa.

grade A group based on, for example, the functional level of organization or overall similarity rather than ancestor–descendant relationships.

ground plan (archetype, bauplan) A basic plan or general type, or a hypothetical ancestor.

jizz Characteristic, instantly recognizable appearance of an organism.

monophyletic group A group comprising a given (hypothetical) ancestor and all its descendants; in a restricted sense, within a given set of terminal taxa, all the members of a subset arising from a common ancestor that has given rise to no other member(s) of the whole set.

paraphyletic group A group comprising a given (hypothetical) ancestor and only some of its descendants.

phenetics Production of a taxonomic system based on overall similarity.

phenogram A dendrogram expressing overall similarity (long axis) between terminal taxa and sequentially linked groups of terminal taxa; the nodes do not connote specifiable characters and the long axis does not connote time.

Encyclopedia of Biodiversity, Volume 5. Copyright 2001 by Academic Press. All rights of reproduction in any form reserved.


polyphyletic group A group that does not include the most recent common ancestor of all of its members.

polythetic group A natural taxonomic group in which the terminal taxa are not known to share any universal unique character(s) but are nonetheless united by overall similarity (phenetics) or global parsimony (cladistics).

rank A specified categorical level in the taxonomic hierarchy (e.g., species, genus, family, and class) made coordinate in classification by definition but very frequently not coordinate in the taxonomic system.

similarity A measure of coincidence (matches) among a (large) set of characters compared for any two taxa.

sister groups Two terminal taxa or monophyletic groups that share a common ancestor that has not given rise to any other taxon under consideration.

taxon Any formally named or recognizable group in a taxonomic system (e.g., order, family, and genus), including all the particular terminal taxa (species).
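The glossary entries for similarity (matches) and distance (mismatches) can be read as two sides of one count. The following is a minimal sketch of that reading, assuming each taxon is scored for the same ordered list of discrete characters; the function names and the three character lists are hypothetical, invented purely for illustration, and this is only one of many possible coefficients.

```python
def similarity(a, b):
    """Fraction of characters on which two taxa match."""
    if len(a) != len(b) or not a:
        raise ValueError("taxa must be scored for the same nonempty character set")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def distance(a, b):
    """Fraction of characters on which two taxa mismatch."""
    return 1.0 - similarity(a, b)


# Hypothetical character scores (1 = present, 0 = absent) for three taxa.
taxon_p = [1, 0, 1, 1, 0, 0, 1, 0]
taxon_q = [1, 0, 1, 0, 0, 1, 1, 0]
taxon_r = [0, 1, 0, 0, 1, 1, 0, 1]

print(similarity(taxon_p, taxon_q))  # 0.75
print(distance(taxon_p, taxon_q))    # 0.25
print(distance(taxon_p, taxon_r))    # 1.0
```

In the glossary's terms, the large difference between taxon_r and the other two would count as a gap.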

TAXONOMY plays the central role in the scientific discipline of systematics and, according to some authors, the two are largely or even entirely synonymous. Systematics makes an essential contribution to our understanding of biological diversity, including its origins, distribution, and maintenance. The primary task of taxonomy is systematization: to establish and give an account of biological order among the diversity of organisms. This involves enumerating the kinds of living things that exist and have existed in the past and determining the patterns of difference and connection among them. By giving expression to these patterns through naming inclusive sets, subordinate sets and least included entities (e.g., higher classes, genera, species, subspecies, and varieties), taxonomists have produced the general biological classification: a categorical arrangement of named, diagnosable groups and components (taxa) that can be used to refer to all known living and extinct organisms.

Although effective taxonomy predates Darwinism, systematics is now closely linked to the theory of evolution. Descent with modification gives justification to the general form of classification adopted: hierarchical, nonoverlapping sets rather than fuzzy sets or periodic tables. The fundamental methods of taxonomic systematization are few. Ultimately, there may prove to be only two: overall similarity methods (grouping by degree of genetic or phenetic similarity) and hierarchical methods (grouping by inclusive phylogenetic or cladistic relationship). The theoretical bases of taxonomy, however, are various. This has led to a proliferation of approaches, often with variants. To understand alternative methods of taxonomy it is necessary to appreciate the philosophical differences between these approaches, and much of this article is devoted to exploring these differences rather than describing techniques. The methods of cladistics, currently the dominant approach, are the subject of a separate review.

Taxonomy is a highly controversial subject, and the issues are inextricably bound up with philosophical disputes which have endured for centuries. The problems are so important that no biologist can totally avoid facing them.
Michael T. Ghiselin (The Triumph of the Darwinian Method, 1969, p. 79)

No one would think much of a chemist who confused water and benzene “because they look alike.”
Arthur J. Cain (Animal Species and Their Evolution, 1954, p. 11)

I. THE TASKS OF TAXONOMY

Taxonomists perform five main functions: discrimination (discovery or primary recognition of taxa, also entailing formal description and diagnosis), comparison (assessing similarities, differences, and relationships among taxa), classification (production of summary schemes that encapsulate current knowledge of taxa and their main interrelationships), symbolization (application of names to taxa and classes: technical nomenclature), and identification (secondary recognition of taxa: matching unidentified material to the established system).

This article focuses on different approaches to primary discrimination and comparison of taxa (systematization) and the ways in which this knowledge can be expressed as summary schemes (classification). The procedures of taxonomy as an information retrieval system, which includes making checklists, catalogs, and databases, identification (naming specimens), and technical nomenclature (application and regulation of names: naming taxa), are not dealt with here. Nor, explicitly, are the means of gathering taxonomic data, e.g., from anatomy, karyology, biochemistry, and nucleic acid chemistry, and their corresponding subdisciplines (morphotaxonomy, cytotaxonomy, chemotaxonomy, molecular systematics, etc.). Although such subdisciplines are often described as taxonomic methods, in the wider context of systematics and taxonomy as a whole they are subsidiary techniques. Much the same methods and problems of interpretation apply, whatever the source of empirical data. This also applies to experimental taxonomy (the investigation of taxa based on predictions from alternative systems in an attempt to gather data bearing directly on particular taxonomic problems) and biometrics (quantitative comparison of related taxa).

Disagreements regarding taxonomic methods can lead to major differences in classification. Such discrepancies range from disputes over the validity of species, subspecies, and infrasubspecifics at one end of the scale to the extreme opposite where composition of even the most inclusive categories (domains and their component kingdoms) remains uncertain. By classifying the classifiers it may be possible to identify some of the fundamental reasons, but not all taxonomic disagreements are due to method alone because historical precedent and subjectivity still intervene. Moreover, in practice a great deal of constructive taxonomic work is done with little reference to philosophy or explicit method, being achieved by pragmatic intervention, notably the extension or modification of existing parts of the system (e.g., by description of new species and establishment of synonymy).

II. BUILDING BLOCKS: INDIVIDUALS AND CHARACTERS

Taxonomy is an empirical activity. On the basis of characters derived from sensory data, individual organisms or life cycles can be discriminated and then divided among or gathered into groups, and groups within groups, and so on to produce a taxonomic system. This can be done either top-down (divisive methods, as in Linnean divisions) or bottom-up (agglomerative methods, as in most numerical taxonomy). Although individual organisms and characters are the basic elements in this process, their definitions are not straightforward.

A. Individual Organisms

Organisms multiply, typically from spores or fertilized eggs, to reproduce the corresponding parental stage through a cycle of growth and differentiation. At any point along such an ontogenetic pathway, a particular organism can be referred to as an individual: fertilized egg, embryo, larva, subadult, adult, or postreproductive adult. However, some organisms multiply by splitting or propagation, rendering the parent–offspring distinction uncertain. Thus, land plants can spread by “runners” so that what appears to be a field of separate individuals can be identical clones of one original plant. In the case of social organisms (e.g., ants and termites) in which very large numbers of workers, guards, or other castes may be produced that never have an opportunity to reproduce, the entire colony has some of the properties of an individual. These distinctions are important because the reliability of the taxonomic process, other factors being equal, is dependent on the number of individual organisms sampled as well as the number of characters recognized and recorded. Multiple sampling from essentially the same individual organism can distort our views of natural variation or lack of it and lead to erroneous conclusions (Wiens, 1999).

B. Attributes and Characters

Taxonomists compare organisms by means of characters. Characters are abstractions derived from the detectable attributes of individual organisms or social groups (e.g., “large, two pointed prongs on head, color vision, always herd with tails erect”). To be informative, it is obvious that characters observed must not be universal throughout the organisms under investigation. The distribution of characters is the primary interest, and the degree to which they differentiate, coincide, and conflict will largely determine their usefulness for taxonomy.

In general, characters can be divided into two sorts: continuous (e.g., height, from tall to short) and discrete (e.g., paired horns versus no horns). However, this is not absolute but more a matter of scale. Thus, different individuals of a particular kind of fly might bear every possible combination of 1–20 spines on the thorax: Is this continuous or discrete variation? In addition, coding can be arbitrary (how tall is “tall”?), and subdivision can be ambiguous. If we observe different individuals with straight, curved, and forked horns, do we simply regard these as three separate characters, three states of one character (form of horn), or the intersection of two binary alternatives: horns straight/curved and horns simple/forked (with the expectation of being able to distinguish, in theory at least, two sorts of forked horns)? Moreover, there are other possibilities for coding such variables. The decision we make can affect analysis (e.g., by encoding spurious information derived from inapplicable characters). Thus, if an animal lacks horns, and this is coded as three separate pieces of data (not forked, not curved, and not straight), such redundancy can have undesirable effects on analysis (Strong and Lipscomb, 1999).
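The effect of coding choice on the horn example can be made concrete with a small mismatch count. This is a minimal sketch under stated assumptions, not the procedure of Strong and Lipscomb: the four taxa, the two coding schemes, the `INAPPLICABLE` marker, and the `mismatches` helper are all hypothetical names introduced for illustration. Under the first scheme, hornlessness is recorded as three separate absences; under the second, horn form is simply not scored when horns are absent.

```python
INAPPLICABLE = None  # assumed marker for "character cannot be scored"


def mismatches(a, b):
    """Count differing characters, skipping any scored as inapplicable."""
    return sum(
        1
        for x, y in zip(a, b)
        if x is not INAPPLICABLE and y is not INAPPLICABLE and x != y
    )


# Coding 1: three separate presence/absence characters
# (straight horns?, curved horns?, forked horns?).
separate = {
    "hornless": (0, 0, 0),
    "straight": (1, 0, 0),
    "curved": (0, 1, 0),
    "forked": (0, 0, 1),
}

# Coding 2: (horns present?, form of horn), with form left
# inapplicable for the hornless animal.
multistate = {
    "hornless": (0, INAPPLICABLE),
    "straight": (1, "straight"),
    "curved": (1, "curved"),
    "forked": (1, "forked"),
}

for name, coding in (("separate", separate), ("multistate", multistate)):
    d_hornless = mismatches(coding["hornless"], coding["straight"])
    d_horned = mismatches(coding["straight"], coding["curved"])
    print(name, d_hornless, d_horned)
```

With the separate coding, the hornless animal differs from each horned form by one character while two horned forms differ from each other by two, so the shared "absences" make the hornless animal look closer to every horned form than the horned forms are to one another. With the second coding, both comparisons yield a single mismatch.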


In theory, unit characters must not only be nonredundant but also homologous and independent (Pimentel and Riggins, 1987). If different parts of the body are functionally interdependent or developmentally linked, it will be misleading to count these attributes as separate characters. This can occur when a single gene has a pleiotropic effect, influencing, for example, the color of one organ and the form of another. At the extreme, it is clear that multiple characters should not be created by logical correlation (e.g., treating both the circumference and the diameter of a circular organ as two unit characters). Some organs (e.g., parts of the male genitalia of insects) may appear so complex that counting them as one character seems unreasonable, but any stopping rule for subdivision may be uncertain, whereas other characters that appear simple may prove to be complex (e.g., a single functional bone formed by fusion during development, representing several characters in comparison to related taxa in which the equivalent bones have not fused).

III. SPECIAL AND GENERAL CLASSIFICATIONS

On the basis of characters held in common, individual organisms can be grouped into a large number of classes, which are of two general kinds. On the one hand, individuals can be grouped in terms of a particular attribute (e.g., green, round, four-legged, marine, planktonic, nocturnal, and pollinating) or by small combinations of attributes to give prescriptions such as “plankton-feeding marine organisms” or “nonflowering, epiphytic semiparasitic plants.” Alternatively, they can be placed into categories of species, genera, families, orders, and so on. The former are regarded as artificial or special classes and are generally defined by reference to given attributes, whereas the latter are viewed, ideally, as natural kinds or natural groups which are discovered but cannot be defined a priori (although they can be diagnosed a posteriori).

Special classes frequently overlap, such as the overlay of plankton feeders, not all of which are marine, and marine organisms, not all of which feed on plankton, to define the special class discussed previously; also, the same organism may recur in many different special classifications (e.g., eagle as flying organism, predator, or nest builder and in the conjunction of all three). In contrast, natural groups typically form nested, nonoverlapping sets in which each kind of organism or group only appears once. Thus, within the general classification of birds, all eagles are included as members of the family Accipitridae, which also includes vultures, buzzards, hawks, and kites.
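The contrast between overlapping special classes and nested natural groups can be stated as a simple property of sets: in a nested, nonoverlapping system, any two groups are either disjoint or one contains the other. The sketch below checks that property. The function name and the toy memberships are assumptions made for illustration only; they loosely follow the eagle example and are not a real classification.

```python
from itertools import combinations


def is_nested_hierarchy(groups):
    """True if every pair of groups is disjoint or one contains the other."""
    return all(
        a <= b or b <= a or a.isdisjoint(b)
        for a, b in combinations(groups.values(), 2)
    )


# Illustrative special classes: the eagle recurs in all three,
# and "flying organisms" and "predators" partially overlap.
special_classes = {
    "flying organisms": {"eagle", "bat", "sparrow"},
    "predators": {"eagle", "lion"},
    "nest builders": {"eagle", "sparrow"},
}

# Illustrative natural groups: each set sits wholly inside the next.
natural_groups = {
    "birds": {"eagle", "vulture", "buzzard", "hawk", "kite", "sparrow"},
    "Accipitridae": {"eagle", "vulture", "buzzard", "hawk", "kite"},
    "eagles": {"eagle"},
}

print(is_nested_hierarchy(special_classes))  # False
print(is_nested_hierarchy(natural_groups))   # True
```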

A. Evolution, Genetic Relationships, and the Natural Hierarchy

Natural groups comprise individuals with many attributes in common, whereas individuals belonging to special classes have relatively few shared characters. In practice, the difference between, for example, “large marine animals” and Cetacea is simply that individuals of the latter natural class have far more in common than those of the former. Even so, the word “natural” has had many connotations in the context of taxonomy, including the essence of things classified (Aristotle), rationality (e.g., as in God’s design), similarity, and explanatory power (Gilmour, 1940). Biology, however, has a unique theory that underpins its totality: the theory of organic evolution.

Ideas about evolution can be divided into a general theory of descent with modification and special theories about the processes affecting that descent (e.g., orthogenesis; natural, group, kin, sexual, and species selection; adaptive radiation; molecular drive; and molecular clocks). Most systematists agree that the general theory of evolution not only provides a compelling justification for seeking one natural, general classification of living organisms but also suggests the basis on which this classification is most securely founded. A hierarchical pattern of exclusive sets and inclusive subsets can reflect the primarily divergent sequence of ancestor–descendant relationships.

Hennig (1966) used the term “tokogenetic” relationships for the reticulate genealogical links that occur between parents and offspring within an evolving, sexually reproducing population. When a whole population splits to form two or more divergent subsystems within each of which tokogenetic relationships are maintained but between which processes of genetic recombination are largely or entirely discontinued, speciation (or phylogenesis) occurs. If we describe two extant taxa as phylogenetically more closely related to each other than to any third taxon (regarding them as sister taxa), we imply that the two do not share tokogenetic relations now but did so in the past within a common ancestor that they do not share with any other living taxon. Such ideas were basic to Hennig’s concept of phylogenetic systematics. However, many organisms reproduce without genetic recombination (sex) and thus lack tokogenetic relationships. Moreover, the great variety of sexual processes suggests that tokogenetic relationships are not only nonuniversal but also may differ fundamentally in the many lineages where they do occur (Margulis and Sagan, 1984).

Species and other taxa grouped to reflect their phylogenetic relationships are expected, by virtue of their historical connections, to share far more characters in common than members of artificially formed groups. This expectation extends to unknown attributes, making phylogenetic classifications highly predictive. However, another type of genetical relationship is increasingly recognized as important in organic evolution: lateral gene transfer. The chimerical nature of lichens as fungal/algal symbionts has long been recognized. Following the work of Lynn Margulis, it is accepted that at least two of the cellular organelles found in all eukaryotes originated through symbiosis of fundamentally separate organisms. Evidence is accumulating from molecular phylogenetics that the genomes of many, perhaps all, major life-forms are chimeras formed from multiple original sources. The extent to which lateral gene transfer undermines current approaches to natural classification is uncertain, but at the domain level at least, a nonreticulate hierarchy based exclusively on divergence appears to be unrealistic (Doolittle, 1999).

IV. DIFFERING PHILOSOPHIES AND METHODS OF TAXONOMY

Taxonomic methods have developed over time. The massive edifice of taxonomic classification, involving millions of terminal and higher taxa, on which the study, scientific use, and conservation of biodiversity depends has been built up over centuries. This system has been produced by thousands of different minds using different methods, working with different knowledge, under different influences, and often seeking different goals. Some parts have been revised and reworked repeatedly, others hardly at all. To understand the strengths, weaknesses, and limitations of the taxonomic system, it is necessary to appreciate the ways in which systematists of contrasting persuasions have sought order in nature and tried to reflect that order in biological classification. In the sections that follow, the various approaches and methods are reviewed in a historical sequence.

A. Essentialism, Idealism, and Preevolutionary Taxonomy

In his early writings, it is clear that Linnaeus, the founding father of modern taxonomy, was trying to detect the pattern of Creation in the classes and species that he recognized and named. Linnaeus’ system was, superficially at least, very simple. Having divided all life-forms into animals and plants, these kingdoms were then divided successively, on the basis of one or a few defining characters, into a series of smaller units down to the level of the genus. Within each genus a large number of terminal species were usually recognized, and each was given a short, diagnostic description. For each species, he also accepted that namable variations could occur but considered that these did not represent the fundamental plan of God’s work.

It can be argued that Linnaeus started out as an essentialist, belonging to an intellectual tradition founded on Aristotle’s methods of logical division (Linnaeus later modified his views considerably). Aristotle was perhaps more concerned with the classification of our knowledge about living things, or even the generation of knowledge through the process of classification, than he was with the classification of organisms as such. The young Linnaeus was trying to discover or reveal deep knowledge: the natural order of God’s Creation. Through his search for an external criterion of verity, Linnaeus’ concepts of species and higher taxa were fundamentally removed from those of the nominalists (see Section IV.C).

Essentialism and related but distinct ideas of the early Greek philosophers, notably Plato’s idealism, continue to have influences on both the practice and understanding of taxonomy. According to the typological species concept, every species was thought to have its own idealized plan or design. The task of the taxonomist was then to recognize each of these theoretical designs, and describe, divine, or define the essential features of these “types” so that individual, real organisms could then be assigned to them. Thus, for a Platonist, “taxonomic names are the names, not of organisms, but of concepts” (Ghiselin, 1969). Even today, many taxonomists consider given species as concepts (e.g., fulfilling some ideal as gene pools, potential interbreeding units, or mate recognition systems) rather than empirical entities.

Idealism also had a great influence on concepts regarding major groups of organisms, especially animal phyla, which many morphologists believed could be formulated as a series of fundamentally different body plans (ground plans or Baupläne). The notion of the ground plan was strongly developed by the ideal morphologists, who evidently affected the developing ideas of Willi Hennig (see Section IV.F) in the 1930s and 1940s, even though he later rejected many of their notions. According to Hull (1965), by supposedly ridding taxonomy of its essentialist and idealist burdens, and moving on to an empirical and relativistic framework, Hennig and other modern systematists liberated the science of taxonomy from a philosophical bind that held back progress for more than 2000 years. Hull’s evaluation, however, may have been premature (see Section VIII).

B. Empirical and Inductive Taxonomic Methods

According to Mayr (1969), empiricism should be included as an atheoretical approach embracing the plausible, commonsense view that once sufficient knowledge has been gathered about organisms, a natural system of classification will simply emerge or become self-evident. Although it seems difficult to accord such a process the status of a general method, particular empirical observations can and often do render theoretical disputes irrelevant.

When Vaughan Thompson discovered in 1829 that barnacles develop from a nauplius larva, they were readily transferred from the Mollusca (where they had often been placed) to the Crustacea. Although this change in classification could in retrospect be justified by appeal to arguments about evolution, overall similarity, bauplans, or synapomorphies, to the empiricist this would all appear unnecessary. To raise a brood of insects and find that all the males belong to one genus and all the females to another leads to an instant appreciation of generic synonymy, without any need for appeal to theory.

The most complete expression of empirical classification occurs outside science in the form of folk taxonomy. Because folk taxonomy is well developed even in illiterate cultures, instead of inquiring into method or philosophy, it can only be understood by description and comparison. Berlin (1992) identified seven major features of ethnobiological categorization: recognition is given to the most distinctive local species; classification is based on “affinities that humans observe” (not on cultural significance); systems have a limited hierarchic structure; recognized taxa are distributed among a few mutually exclusive universal ranks, approximately equivalent to kingdom, life-form (e.g., plant), family, genus, species, and variety; in different ethnic systems, taxa at each rank show striking similarities regarding their number of subordinate taxa; taxa at generic and specific levels have an internal structure in which some included members are viewed as “prototypical” and others as reminiscent of other taxa at the same rank, giving rise to a type of fuzzy set classification; and a large majority of the taxa differentiated, most notably at the generic level, correspond to groups recognized in formal taxonomy.

These regularities and correspondences are highly suggestive that there is a natural biological classification that is almost literally self-evident and independent of the observer, and that a significant number of natural elements and groupings can be recognized simply through extensive knowledge and contact with nature. Although the young Linnaeus thought he was discovering the handiwork of God, it seems possible he was also involved in formalizing preexisting European ethnobiological knowledge.

Thus, as empirical knowledge accumulates, a common sense or consensus view of certain issues arises, and in some cases such a view may seem irrefutable. Even so, there are many problems in taxonomy that cannot be decided (e.g., questions of relative rank or the status of paraphyletic groups) without an explicit theoretical framework.

C. Nominalism

One view of taxonomy is that, even though we may desire a general system, there are no independent means for assessment. The only objective realities are individual organisms, and all taxonomic groups are man-made abstractions (“categories of thought” according to Louis Agassiz). Named groups are convenient for collective reference but have no independent basis separate from the human mind, simply being useful pigeonholes for dividing up or handling diversity. Many biologists have held similar views (apparently including Darwin, who once commented on species as a “term . . . arbitrarily given for the sake of convenience to a set of individuals closely resembling each other”; see Mayr, 1963, p. 14; also see Ghiselin, 1969).

By embracing the idea that taxonomic groups are established for convenience, nominalism has links with special classifications. If special classifications such as “trees” or “four-footed land vertebrates” are convenient, then there is no obvious reason why the nominalist should reject them, except insofar as it can be demonstrated that some comparable grouping has more heuristic or predictive power (e.g., trees extended to Tracheophyta and literal Tetrapoda extended to include snakes, birds, whales, and bats); in other words, they are in some sense more convenient. Convenient for whom then becomes the question.

Berlin’s (1992) review suggests that folk taxonomies are not nominalist, which perhaps might have been expected, but seminatural and based on extensive empirical knowledge. Folk systems often have terms for conspicuous natural groups such as Mammalia, but it is not clear to what extent paraphyletic or even polyphyletic taxa are also included. Because folk taxonomies have terms for life-forms, which are in effect grades, this suggests that humans may possess an innate mixed strategy for classification, sometimes grouping on palpable characters, whereas at other times forming groups on the basis of general resemblance, Gestalt, or jizz.

Although Panchen (1992) suggests that there are no extant practitioners of nominalism, the view that species are real while all higher taxonomic groups are artificial is widely expressed (even though quite misguided: contrast the difficulty of even “defining” many species of mice with the ease with which the class Mammalia can be recognized by many consistent features). There are major disagreements regarding the existence of fundamental differences between terminal taxonomic components (e.g., species) and higher groups (e.g., polytypic genera and families), as apparently accepted by Hennig, for example (and probably Mayr), and rejected by Gilmour (1940) and Nelson (1989).

D. Evolutionary Systematics: Grades and Clades

The Darwinian revolution provided a rationale ‘‘exter- nal’’ to the human mind for the basis of taxonomy: Hierarchical relationships among living things are ex- plicable as the result of organic evolution, and taxono- mists should strive to reflect these patterns as the logical basis of a general classification. However, if taxonomists looked to Darwin for a more detailed lead, what did they find? Unfortunately, at various places in the Origin, Darwin appeared to shift between general evolutionary statements (‘‘the natural system is founded on descent with modification’’), phylogenetic assertions (‘‘the ar- rangement of groups within each class . . . must be strictly genealogical in order to be natural’’), and nomi- nalism (‘‘I look at the term species, as one arbitrarily given for the sake of convenience’’). Moreover, the pre- evolutionary taxonomic hierarchy had served the Dar- winians well, presaging a continuing debate to this day: Is the theory of evolution relevant to the pursuit of taxonomy only as justification for seeking a single gen- eral system, or should we build into the taxonomic method theories about the way in which evolution has occurred?

Whatever the precise reason, the immediate impact of Darwinism on method was very limited. Taxonomy remained highly individualistic, and essentialist, nominalist, empiricist, and even creationist views all continued, although often cloaked in evolutionary language. The first move to formulate a general method post-Darwin did not occur until the 1920s, later epitomized by two books by Julian Huxley: The New Systematics (1940) and Evolution: the Modern Synthesis (1942). This emergent approach was developed and defended most notably by George Gaylord Simpson and Ernst Mayr, and it became evolutionary systematics or evolutionary taxonomy.

With respect to Darwin, the evolutionary systematists argued that the natural system (Fig. 1) should reflect descent with modification, to include the processes by which evolutionary change occurs, the measurable degree of modification (anagenesis), and the temporal sequence of divergence (cladogenesis). According to Simpson (1961, p. 107), evolutionary systematics requires that a natural classification should be “consistent with all that can be learned of the phylogeny of the group classified,” but he was emphatic that this does not mean that natural classification be based on phylogenetic relationships alone [a view also clearly stated by Gilmour in The New Systematics (1940)]. This led Simpson to adopt a very broad notion of monophyly as “the derivation of a taxon through one or more lineages . . . from one immediately ancestral taxon of the same or lower rank.”

FIGURE 1 Evolutionary systematization. Horizontal axis, anagenesis; vertical axis, time. In this hypothetical case, the slope of each branch corresponds to the rate of anagenesis, with the percentage values representing the degrees of difference in the three lineages from the ancestral species (A). B is grouped with C because the two are more like their common ancestor than either is to D (even though D shares a common ancestor with C at X that it does not share with B). Note that this method requires a means of measuring similarity or distance (phenetics), estimating phylogenetic relationships (cladistics), and knowledge of ancestors (paleontology) [based on Mayr (1974) and Patterson (1982)].


To be natural, according to Mayr (1969), a classification must have explanatory, predictive, and practical values but also be emendable in the light of new evidence or understanding. Mayr then proposed that the fundamental basis for naturalness is the proportion or number of genes held in common by any two taxa: “If we knew the entire genotype of each organism, it would be possible to undertake a grouping of species that would accurately reflect their ‘natural affinity’” (p. 81).

The empirical approach closest to this new ideal of evolutionary systematics is the use of pairwise distance data representing, in some way, the entire genome (e.g., immunological data and notably DNA–DNA hybridization data used by Sibley and coworkers for bird classification). In addition, complete nucleotide sequences for the entire genomes of various organisms are now becoming available for direct comparison. These approaches, however, founder on some of the fundamental problems of numerical taxonomy. Mayr’s proposal that “genes-in-common” provides the ultimate arbiter of natural classification is nevertheless an important concept because it encapsulates the only well-articulated rival to the phylogenetic nexus idea first suggested by Darwin (“the arrangement of groups . . . must be strictly genealogical in order to be natural”). If lateral gene transfer undermines a strictly hierarchical approach, then the estimation of genes-in-common will certainly increase in importance as the basis of a natural system.

In practice, evolutionary systematics became a syncretistic, all-embracing method that included a regard for the absence of characters as informative and insisted on the primacy of paleontology for revealing phylogenetic sequences (Fig. 1). The method’s most striking characteristic involved conflation of the two methods that soon sought to replace it: a desire to give expression to genetic distances (grades or anagenesis, as reflected in numerical taxonomy) and, at the same time, to ancestor–descendant relationships (clades, or cladogenesis, as reflected in cladistics). Because these two methods stem from fundamentally different philosophies, this led to an inevitable arbitrariness (Fig. 1). Thus, Simpson (1961, p. 107) was happy to write that although “taxonomy is a science . . . its application to classification involves a great deal of human contrivance and . . . there is a leeway for personal taste, even foibles.” This lack of explicitness led to the demise of evolutionary systematics as the leading method in taxonomy. During its development, however, Ernst Mayr in particular made an enormous contribution, especially to ideas on the taxonomy of species, with which he was preoccupied as “the basic unit of classification” (Mayr, 1963, p. 11).

E. Numerical Taxonomy and Operationalism

Numerical taxonomy (or phenetics) emerged in the late 1950s, its origin associated with, among others, Charles Michener, Arthur Cain, and especially Robert Sokal and Peter Sneath. In Sokal and Sneath’s original 1963 manifestation of the Principles of Numerical Taxonomy, any evolutionary approach is avoided in favor of an operational method based on direct comparison of phenotypes. As many characters as possible of the organisms to be compared, both continuous and discontinuous, are measured and counted from operational taxonomic units (OTUs), which can be individuals or samples from conventionally recognized taxa (typically species). On the basis of a matrix of variation in all features across all OTUs, the OTUs are then compared by overall similarity, affinity, or phenetic distance. Such measures can be obtained by transforming the raw matrix (Table I) to give the proportion of all character matches (affinity) or mismatches (distance: Table II) for every pairwise combination of OTUs. The results are displayed by means of a network (Fig. 2a), or OTUs are linked to each other by a clustering algorithm to produce a phenogram (Fig. 2b).
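As a minimal sketch of this transformation, the Python fragment below recomputes the pairwise mismatch proportions from the myoglobin data of Table I. The names `TABLE_I` and `mismatch_distance` are illustrative and do not come from the original treatment. The sketch also assumes that two positions both scored as "not represented" count as a match, which is the reading that reproduces the values given in Table II.

```python
from itertools import combinations

# Table I: amino acids at eight myoglobin positions (0, 1, 2, 3, 4, 5, 10, 11).
# "-" marks a position not represented in the alignment.
TABLE_I = {
    "A": "-GLSDGVL",  # human
    "B": "MELSDQVL",  # alligator
    "C": "-----AVL",  # tuna fish
    "D": "-----TVN",  # Heterodontus shark
}


def mismatch_distance(seq_a, seq_b):
    """Proportion of positions at which two OTUs differ.

    Assumption: a shared "-" is treated as a match, like any other shared state.
    """
    mismatches = sum(x != y for x, y in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


# One distance for each of the six pairwise comparisons of the four OTUs.
distances = {
    (p, q): mismatch_distance(TABLE_I[p], TABLE_I[q])
    for p, q in combinations(TABLE_I, 2)
}

for pair, d in distances.items():
    print(pair, d)
```

Under these assumptions the six values agree with Table II (for example, 0.375 for A against B and 0.25 for C against D). Choosing a different convention for missing positions would change the matrix, which illustrates how much of the apparent objectivity rests on such decisions.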

Although such a procedure appears objective at first, many different ways have been proposed to measure pairwise similarity or dissimilarity, and many different clustering methods have also been devised. Most clustering algorithms employed in taxonomy are sequential, agglomerative, hierarchic, and nonoverlapping (SAHN). Among this class of methods there are subclasses (e.g., single linkage, complete linkage, and average linkage), which in turn have variations (Sneath and Sokal, 1973). The phenograms produced by SAHN algorithms are dendrograms and thus strictly hierarchical and lacking reticulations, appearing similar to a Linnean hierarchy or a cladogram.

TABLE I Amino Acids Found at Eight Particular Positions in the Myoglobin Chain of Four Operational Taxonomic Units (OTUs): (A) Human, (B) Alligator, (C) Tuna fish, and (D) Heterodontus Shark a

        Characters
OTU     0    1    2    3    4    5    10   11
A       -    G    L    S    D    G    V    L
B       M    E    L    S    D    Q    V    L
C       -    -    -    -    -    A    V    L
D       -    -    -    -    -    T    V    N

a Single letters stand for particular amino acids and dashes for positions not represented based on alignment of the four complete myoglobin sequences (from Patterson, 1980, p. 237).

TABLE II Data of Table I Transformed Into a Distance Matrix, Based on the Proportion of Mismatches Summed across All Eight Positions, for All Six Pairwise Comparisons of the Four OTUs a

        A       B       C       D
A       0       .375    .625    .75
B               0       .75     .875
C                       0       .25
D                               0

a These distances are replicated in the branch lengths of an unrooted network (Fig. 2a). Taken as complements (1 − distance) to give measures of similarity, the data can also be used to produce a phenogram (Fig. 2b).

FIGURE 2 Numerical systematization. (a) The values in Table II between the four OTUs A–D represented by a distance network. The scaled lengths of the line segments are such that all the relative values in Table II are satisfied, but the angles of the four branches leading to the terminals have no special meaning. (b) Horizontal axis, linkages; vertical axis, overall similarity. By taking complements (1 − distance) of the values in Table II, the data give a similarity matrix, the basis of a phenogram. With a value of 75% similarity, (C D) are the first OTUs clustered. Comparing A with B, and (C D) with A and B separately, (A B) form the next most similar cluster, linked at 62.5% similarity. The two clusters can then be linked together at 37.5% similarity based on three of the eight attributes occurring in one or both members of the two groups. Other linking procedures could be adopted.
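The simplest SAHN subclass, single linkage, can be sketched in a few lines of Python. The fragment below is only an illustration under stated assumptions: it uses the four-OTU distances of Table II, takes similarity as 1 − distance, and the function name `single_linkage` is hypothetical. With these choices it reproduces the linkage levels quoted for the phenogram of Fig. 2b (75%, 62.5%, and 37.5%); other linkage rules would give a different final level.

```python
from itertools import combinations

# Pairwise distances for OTUs A-D, copied from Table II.
DIST = {
    frozenset("AB"): 0.375, frozenset("AC"): 0.625, frozenset("AD"): 0.75,
    frozenset("BC"): 0.75, frozenset("BD"): 0.875, frozenset("CD"): 0.25,
}


def cluster_gap(c1, c2, dist):
    """Single-linkage rule: the smallest distance between any two members."""
    return min(dist[frozenset([a, b])] for a in c1 for b in c2)


def single_linkage(otus, dist):
    """Sequentially merge the two closest clusters until one remains.

    Returns a list of (merged cluster, similarity level) in merge order,
    with similarity taken as 1 - distance (an assumption of this sketch).
    """
    clusters = [frozenset([o]) for o in otus]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_gap(pair[0], pair[1], dist))
        level = 1 - cluster_gap(c1, c2, dist)
        merged = c1 | c2
        merges.append((merged, level))
        # Non-overlapping: the two old clusters are replaced by their union.
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
    return merges


merges = single_linkage("ABCD", DIST)
for cluster, level in merges:
    print(sorted(cluster), level)
```

Replacing `min` with `max` inside `cluster_gap` would give complete linkage, and an average would give average linkage. The first two merges are unchanged for these data, but the level at which (A B) joins (C D) is not, a small instance of the dependence on method discussed here.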

Initially, it was expected that agreement would be reached on best if not ideal procedures, but this proved an illusion. As early as 1966, Hennig (p. 85) commented that “even the most recent authors present and recommend their own methods [for measuring morphological differences] but never explain why their method deserves preference over those of their predecessors.” Despite these difficulties, the pheneticists pursued the ideas of numerical taxonomy with vigor based on four key hypotheses proposed by Sneath and Sokal that can be viewed as attempts to realize Mayr’s goal of measuring genes-in-common (Panchen, 1992, p. 135). The nexus hypothesis stated that every character is likely to be affected by more than one gene, and every active gene is likely to affect more than one character. The nonspecificity hypothesis suggested that no major or distinct group of genes would exclusively affect one class of attributes (e.g., morphology and behavior); thus, classifications based on one character class or set should not differ from classifications based on another class or set, and there would be no a priori reason to prefer particular data sets to others. The hypothesis of the factor asymptote supposed that as data were added, the information content of the classification would reach an asymptote, whereas the hypothesis of the matches asymptote proposed that as the number of characters sampled increased, the similarity value for any pair of OTUs would tend toward a final, asymptotic value, thus conferring stability on the classificatory system.

Regrettably for phenetics, as discussed by Panchen (1992) and even acknowledged by Sneath and Sokal as early as 1973, these hypotheses, apparently basic to the validity of the methods, are now largely discredited (in particular the all-important nonspecificity hypothesis). More generally, phenetics offers no justification for choosing a hierarchical system unless it is accepted that the general theory of evolution dictates this as the most efficient representation. However, because numerical taxonomy conflates homologous and nonhomologous characters, and neither nodes nor branch lengths can be directly related to hypotheses about particular characters, the origin of characters, or even the degree of genetic change, the notion of overall similarity lacks analytical power.

Moreover, although it was originally expected that its procedures would lead to stability in classification, several factors preclude this (Eldredge and Cracraft, 1980, p. 176). In addition to the technical ambiguities already noted, every new taxon requires complete reanalysis, with the unique attributes and combinations of attributes usually affecting every branch length and often many branching points in the phenogram. Of course, other taxonomic methods, including cladistics, are not invulnerable to change due to new discoveries, but in cladistic analyses the effects are interesting (because they can be related to hypotheses about homology) and cladists never laid serious claim to the idea that stability in the face of new evidence was an important justification for the method.

Cladistics also suffers from its own algorithmic problems, but it is always possible to work out what different procedures are doing with respect to the data and the inferences drawn. With phenetic methods, the nature of the algorithms and the form of the results are not separable. Thus, a basic problem of numerical phenetics is the lack of a clear criterion of choice (such as parsimony, or even a model of evolution) by which one result can be judged against another. Even Mayr’s concept of genes-in-common will not repair the difficulty because of innumerable problems related to the definition of a gene, duplication, alignment, position effects, and so on. The only other external arbiter is the indefinable goal that James Farris called Gilmour-naturalness (“a system of classification is the more natural the more propositions can be made regarding its constituent classes”), enthusiastically embraced by Sneath and Sokal (1973). This led Panchen (1992) to suggest that the original philosophy of numerical phenetics is closer to empiricism (not nominalism, as suggested by Mayr). Given enough observations and an appropriate technique, the numerical taxonomists expected that a single, stable, and predictive general system would simply emerge.

Although numerical taxonomy has been judged wanting, it has had a lasting and positive influence on current taxonomic methods and still flourishes in areas such as bacterial taxonomy, in which the general absence of tokogenetic relationships negates many advantages of a phylogenetic system. The most important contributions of numerical taxonomy have been technical, notably the use of data matrices, precise comparison across all taxa under scrutiny, and the adoption of algorithmic analytical methods. Advances in other methods, notably cladistics, have come not only from vigorous debate but also from the general adoption, post-Hennig, of data matrices and their exploration through numerical algorithms.

F. Phylogenetic Systematics and Cladistics

The fundamental methods of phylogenetic systematics were established by Willi Hennig, who wanted to base systematization and classification directly on the historical branching patterns of the phylogenetic nexus. He realized that it would never be possible to know the course of history precisely, and that the system would always be provisional. His goal was, by means of appropriate methods, to produce a “phylogenetic system” that would “approximate more closely than any other the ideal system [italics added] that reflects the phylogenetic relationships absolutely correctly” (Hennig, 1966, p. 29).

Hennig’s method was “the search for the sister group,” epitomized as the “three-taxon problem.” For any group of three natural taxa, the expectation is that two have a common ancestor not shared by the third. Thus, for the trio shark, tuna, and human, of the three possible combinations of two from three, the one with tuna and human grouped to the exclusion of shark accords with what we know of ancestry (see Table III). According to Hennig’s view, we should then recognize a taxon linking tuna and human as sister groups (Osteichthyes) but not one linking shark and tuna: From a phylogenetic perspective, Pisces are paraphyletic, or a nongroup. Thus, Hennig defined phylogenetic relationships solely in terms of common ancestry (propinquity of descent) and did not include similarity, distance, grade, adaptive zone, or any comparable concept in his assessments.

TABLE III Data from Table I Transformed to Show only Coincidences of Positive Attributes among the Four OTUs a

        Character
Taxa    2    3    4    11
A       *    *    *    *
B       *    *    *    *
C                      *
D

a Characters 2–4 link (A B), and character (11) links (A B C). These data are most efficiently represented by the nested-set pattern (D (C (A B))); compare with the pattern ((D C) (A B)) implied by the phenetic analysis (Fig. 2). In this analysis, four characters (0, 1, 5, 10) in Table I are uninformative because there is no positive coincidence (5), only shared absence (0), both (1), or only shared coincidence (10).

This line of argument caused Mayr and other evolutionary systematists to object very strongly to Hennig’s approach, accusing him of an unjustifiable restriction of the concept of relationship. For this reason, Mayr referred to Hennig and his followers as “cladists” (from clade or cladogenesis) to criticize them for their narrow approach, with the unexpected result that the term was happily adopted by the growing band he sought to reprove.

However, how could clades be recognized in the absence of prior knowledge or without acceptance of sequences simply “read” from the fossil record? Arguably Hennig’s greatest methodological advance was the introduction of relativism to the concept of characters. He recognized three classes with respect to a set of taxa under study: autapomorphies (characters unique to individual terminal taxa), symplesiomorphies (characters present in all the taxa), and synapomorphies (characters shared by subsets of two or more taxa). According to Hennig, only characters of the third class provide evidence of grouping at the specified level of inclusiveness. Thus, a synapomorphy indicative of a natural group of species within the taxa under review (e.g., a genus) becomes a symplesiomorphy when addressing questions of relationship among species within that genus so delimited. Ideally, each and every character would be informative of relationships at just one level in the total hierarchy. Thus, virtually all the characters that might seem to unite shark and tuna to the exclusion of human (e.g., gills and fins) are actually characters that relate to more inclusive levels of the hierarchy (e.g., Chordata and Vertebrata). Observations such as “absence of mammary glands” or “absence of hair” are not characters linking “fish” but the counter conditions of autapomorphies (of the Mammalia).
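The relative nature of these three classes can be illustrated with a small Python sketch applied to the myoglobin data of Table I. It is a deliberately naive reading of the definitions given here: each observed state is labelled purely by how many of the taxa under study possess it, and absences ("-") are ignored. The names `TABLE_I`, `POSITIONS`, and `classify_states` are illustrative assumptions, and no claim is made that this counting rule substitutes for a real analysis of homology.

```python
from collections import defaultdict

POSITIONS = [0, 1, 2, 3, 4, 5, 10, 11]
# Table I; "-" marks a position not represented in the alignment.
TABLE_I = {
    "A": "-GLSDGVL",  # human
    "B": "MELSDQVL",  # alligator
    "C": "-----AVL",  # tuna fish
    "D": "-----TVN",  # Heterodontus shark
}


def classify_states(table, positions):
    """Label each positive state by the number of taxa that share it."""
    n_taxa = len(table)
    classes = {}
    for column, position in enumerate(positions):
        holders = defaultdict(set)
        for taxon, sequence in table.items():
            if sequence[column] != "-":      # absences are not characters
                holders[sequence[column]].add(taxon)
        for state, taxa in holders.items():
            if len(taxa) == 1:
                kind = "autapomorphy"            # unique to one terminal taxon
            elif len(taxa) == n_taxa:
                kind = "symplesiomorphy"         # present in all taxa studied
            else:
                kind = "potential synapomorphy"  # shared by a proper subset
            classes[(position, state)] = (kind, frozenset(taxa))
    return classes


classes = classify_states(TABLE_I, POSITIONS)
synapomorphies = {
    position: taxa
    for (position, state), (kind, taxa) in classes.items()
    if kind == "potential synapomorphy"
}
print(synapomorphies)
```

Only positions 2, 3, 4 (shared by A and B) and 11 (shared by A, B, and C) survive as potential synapomorphies. Removing taxon D from `TABLE_I` would turn position 11 into a symplesiomorphy for the remaining three taxa, which is the relativism described in this paragraph.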

To see this in practice, the data in Table I have been transformed (Table III) to show only potential synapomorphies, i.e., the positive attributes shared by two or three of the four taxa, and thus able to provide evidence of grouping within (A–D). In this simple example there are no conflicting distributions, and there is just one maximally efficient solution. Efficiency means the hierarchical arrangement which captures all (or, where there is conflict, the largest number of) informative characters (putative synapomorphies). In Fig. 3, all 15 possible fully resolved groupings of terminal taxa A–D are presented, with the characters that they can summarize marked at the relevant nodes. Arrangements 1–7, 9, 14, and 15 do not reflect any of the characters. Arrangements 10 and 12 capture one, and 8 and 13 capture three, but only arrangement 11 captures all four. The best cladistic arrangement, on the available data, is therefore (((A B) C) D).
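The tally over the 15 arrangements can be checked with a short exhaustive search. The Python sketch below enumerates every fully resolved rooted arrangement of four taxa and counts, for each, how many of the putative synapomorphies of Table III correspond to one of its groups. The representation of a tree as a set of clades and the names `rooted_trees` and `SYNAPOMORPHIES` are assumptions made for illustration; the enumeration order is arbitrary and does not follow the numbering used in Fig. 3.

```python
from collections import Counter
from itertools import combinations

# Putative synapomorphies from Table III: character -> taxa sharing it.
SYNAPOMORPHIES = {
    "2": frozenset("AB"),
    "3": frozenset("AB"),
    "4": frozenset("AB"),
    "11": frozenset("ABC"),
}


def rooted_trees(taxa):
    """Yield each fully resolved rooted tree on `taxa` as a frozenset of clades."""
    taxa = tuple(taxa)
    if len(taxa) == 1:
        yield frozenset([frozenset(taxa)])
        return
    first, rest = taxa[0], taxa[1:]
    # Keep `first` on the left so every unordered split is generated once;
    # k < len(rest) guarantees the right-hand side is never empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = (first,) + extra
            right = tuple(t for t in rest if t not in extra)
            for left_tree in rooted_trees(left):
                for right_tree in rooted_trees(right):
                    yield left_tree | right_tree | {frozenset(taxa)}


trees = list(rooted_trees("ABCD"))
# A character is "captured" when its set of taxa is a clade of the tree.
scores = [sum(group in tree for group in SYNAPOMORPHIES.values()) for tree in trees]
best_score = max(scores)
best_tree = trees[scores.index(best_score)]

print(len(trees), sorted(Counter(scores).items()))
# -> 15 [(0, 10), (1, 2), (3, 2), (4, 1)]
print(sorted("".join(sorted(clade)) for clade in best_tree if len(clade) > 1))
# -> ['AB', 'ABC', 'ABCD']
```

The counts match the text: ten arrangements capture none of the characters, two capture one, two capture three, and a single arrangement, with groups (A B) and (A B C), captures all four. Exhaustive search of this kind is feasible only for very small numbers of taxa, since the number of arrangements grows extremely quickly.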

Colin Patterson (1980) summarized “the axioms of cladistics” as homologous characters having a hierarchical pattern in nature in which this pattern is efficiently expressed by cladograms, and the nodes connote the homologies shared by the organisms so grouped. The search for the sister group therefore reduces to finding the cladogram that summarizes the potentially homologous characters as parsimoniously as possible. Once this best fit cladogram has been found, the attributes at each node are hypothesized as homologous characters shared by the taxa subtended at that node.

One of the most challenging features of cladistics (or at least “transformed cladistics”; see Section VII) is the link between attributes, grouping, parsimony, characters, and homology. An attribute is accepted as a character if the weight of evidence from all other attributes under scrutiny suggests that it is a homologous feature peculiar to a particular group. In Table IV, two potential synapomorphies (x and y) have been added to the distribution of characters in Table III. Given these additional data, and presupposing independence of the attributes, the most efficient grouping of the four taxa based on positive shared features now changes and highlights a problem with the interpretation of attribute 11: Either it is not relevant as a character within the group and has undergone reversal in taxon D or it is not a character at all (it is nonhomologous). Information about this could be sought, for example, by studying the ontogeny or some more fundamental quality of attribute 11 in A–C, revealing a particular scientific strength of the cladistic method: Conflicts can be resolved by character analysis (Kitching et al., 1998) involving recourse to additional data derived from previously unsampled or unused attributes or by investigation into the homology of the conflicting characters (they are not simply aggregated at face value, as in numerical taxonomy).
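Conflict of this kind can be detected mechanically: two putative synapomorphies can sit on the same cladogram only if the groups they define are nested or do not overlap at all. The Python sketch below applies that pairwise test. Characters 2–4 and 11 are taken from Table III; the distributions of x and y are an assumption, set to taxa C and D because the text states that they conflict with character 11 and favor the grouping ((A B) (C D)). The names `GROUPS` and `compatible` are illustrative.

```python
from itertools import combinations

GROUPS = {
    "2": {"A", "B"},
    "3": {"A", "B"},
    "4": {"A", "B"},
    "11": {"A", "B", "C"},
    "x": {"C", "D"},  # assumed distribution, inferred from the stated conflict
    "y": {"C", "D"},  # assumed distribution, inferred from the stated conflict
}


def compatible(group_1, group_2):
    """Two groups fit one hierarchy if one contains the other or they are disjoint."""
    return group_1 <= group_2 or group_2 <= group_1 or not (group_1 & group_2)


conflicts = [
    (a, b)
    for a, b in combinations(GROUPS, 2)
    if not compatible(GROUPS[a], GROUPS[b])
]
print(conflicts)  # -> [('11', 'x'), ('11', 'y')]
```

Under the stated assumption, setting aside attribute 11 leaves five mutually compatible characters, whereas setting aside x and y leaves only four, so counting alone favors ((A B) (C D)). The test identifies which attributes need re-examination; it does not decide whether attribute 11 reflects reversal or nonhomology, which is the task of character analysis.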

If characters x and y in Table IV were found to be due to pleiotropic effects of a single gene, then rejection of attribute 11 as a character relevant to resolving the relationships within the group would be premature. If, on the other hand, character 11 was a certain color produced, for example, by a pigment shared by taxa A and B, but the same hue was produced by a totally different type of pigment in taxon C, this would be consistent with the idea that attribute 11 as originally formulated (color) was a noncharacter. The similarities and differences between taxa A–C in this regard should then be rescored as two new attributes (e.g., based on chemistry), giving further support to the (A B) grouping and removing conflict with the (C D) grouping. The convergence in color between (A B) and C could then be seen to be in the eye of the human observer. Such convergences are of intense interest for the study of evolution, but once detected they play no part in natural classification.

FIGURE 3 Cladistic systematization. Horizontal axis, convention; nodes, pattern of homologous characters. The putative synapomorphies in Table III plotted on all 15 possible arrangements (as fully resolved cladograms) for four terminal taxa. Numbers at given nodes indicate the shared attributes from Table III; arrangement 11 is the only one that reflects all four attributes, all of which can then be construed as homologous characters (based on Patterson, 1980).

TABLE IV Imaginary Character Matrix for Four Taxa and Six Characters a

Character
Taxa 2 3 4 11 x y
A * * * *
B * * * *
C * *
D *

a Characters 2–4 and 11 correspond to Table III. Additional characters x and y are in conflict with character 11. The most parsimonious solution, assuming that all characters are independent, is to group the four taxa as ((A B) (C D)). This implies that attribute 11 is no longer considered homologous [not “the same” in C as in (A B)] or that it is symplesiomorphous with respect to the whole group (A B C D) and does not appear in taxon D (e.g., in evolutionary terms, this would imply that character 11 had been secondarily lost in taxon D).

Thus, the relativistic nature of cladistics is emphasized. Based on the whole (attribute × taxon) matrix, the cladograms chosen and attributes thereby specified as homologous characters are probability statements: “Common ancestry versus convergence is tested by topographical correspondence [on the cladogram]. The resulting explanation is a statement of maximal likelihood rather than a denotion of lawful relations or processes” (Rieppel, 1988, p. 166; see Hennig, 1966, p. 29).

V. GENERAL PROCEDURES

The main operations involved in taxonomic research are listed as follows in an idealized order. In practice, this order is rarely followed precisely, and certain steps may be omitted in part or even altogether:

1. Individual specimens, or individuals representing selected taxa (ideally at least all immediately subordinate taxa currently recognized within the group under study), including material representing suitable outgroups for comparison, are chosen.

2. A selection of attributes to be scored across all samples of taxa under study is made (in original work this will usually be a mixture of known and novel features).

3. The manifestations of these attributes are systematically recorded in an individual or taxon/attribute data matrix (in the process of doing this, it is normal for other potentially informative features to be recognized and these may be added to the list under stage 2; it is not unusual at this stage for additional study material to also be called for, when variation has become apparent or specimens prove to be incomplete or otherwise unsuitable).

4. Systematization of the data is carried out by cladistic analysis to produce a cladogram based on homologous characters defining each node (or for phenetic data, by an agglomerative numerical technique to produce a phenogram); in cladistic studies this may involve one or more rounds of Hennigian “reciprocal illumination,” notably with respect to testing the coherence of putative homologies that seem to be in conflict (incongruent).

5. Inclusive monophyletic groups inferred from the parsimonious distribution of homologous characters are thus linked together and then named as taxa, to produce a hierarchical classification.

6. The homologous characters are used to help develop diagnoses for the various taxa recognized (as a potential aid for identification, information on characters that are absent may also be included, and keys, tables, or computerized identification programs may also be constructed at this point).

7. Publication of the results will ideally include the provision of necessary descriptions, indications (diagnoses), specifications of types, and formulation of names in accordance with the relevant code (e.g., zoological, botanical, and bacteriological codes) as well as a record of the complete character matrix and specification of the analytical procedures employed (e.g., computer programs and options and models in the case of maximum likelihood analyses).

Cladistic and numerical phenetic methods are essentially the same up to stage 3, but a crucial difference occurs at stage 4, the process of systematization. Stages 5 and 6 also differ, but both methods involve some arbitrariness with respect to where the phylogenetic nexus is to be cut or the phenogram to be divided (see Section VI). At stage 6 there is also some difference in that the very act of cladistic analysis gives the primary characters for diagnosis, whereas the numericist is faced with an additional searching process to enumerate a sufficient set (usually a polythetic set) of characters whereby given members of a specified portion of the phenogram can be recognized in isolation. Fundamental differences in approaches to systematization have already been discussed. The problems inherent in stage 5 are discussed next.

VI. FROM SYSTEM TO CLASSIFICATION

Classification involves translating a systematization scheme into words (or numbers). Based on the methods of logical division, Linnaeus (who established many of the conventions of formal classification) placed all known organisms within a descending series of fully nested hierarchical categories. The major ranks in the Linnean system were kingdom (e.g., plants and animals), class (e.g., Mammalia, Aves, Pisces, and Insecta), order (e.g., Hemiptera, Lepidoptera, Coleoptera, and Diptera), genus (e.g., Culex, Tipula, and Musca), and species [e.g., Musca domestica (housefly) and Musca vomitoria (bluebottle)]. For Linnaeus, systematization and classification were the same. Modern classifications attempt to summarize far more complex schemes. In this brief review, only the major differences between the major twentieth-century methods are considered.

According to the evolutionary method, classification should be viewed as a useful art, meaning that in addition to scientific analysis, human ingenuity is also needed to produce a practical classification. Simpson (1961) recognized three principles: A classification should reflect the most “biologically significant” relationships among the organisms, it should be consistent with the relationships on which it is based, and it should be as stable as possible without contravening the first two principles. These ideas were elaborated to embrace, most notably, grades and clades (anagenesis and cladogenesis), both of which were regarded as significant for classification. This led to the idea that, even if literally everything relevant were known about the connections among the members of a group, “innumerable different classifications could be made consistent with those interrelationships. . . . Selection among those alternatives is decidedly an art” (Simpson, 1961, p. 110).

To appreciate the difficulties resulting from such a view, consider Fig. 4. In cladistic terms, three paraphyletic groups (basal groups in A–C) are delimited, together with just 4 of the 34 subclades depicted. Why were these particular groupings chosen? Such a mysterious art proved difficult to follow (especially when the prescription for creating such trees in the first place is also imprecise). As discussed previously, Ernst Mayr urged that classifications be based on genes-in-common. Because it is a distance method, this transformation of evolutionary systematics encounters the same problems of classification faced by numerical taxonomists. Even so, the recent estimation that man and chimpanzee share 98.4% of their genome in common has led, for example, to the long-overdue abandonment of the paraphyletic or grade family Pongidae in favor of an expanded Hominidae, in contradistinction to the position adopted for decades by the evolutionary systematists, who argued repeatedly that the “large gap” between Homo and the rest of the great apes was reason enough to place them in separate families (Simpson, 1961; Mayr, 1963).

FIGURE 4 Evolutionary classification. The original caption reads: “Three hypothetical phylogenies. . . divided into taxa as shown by the broken lines. A is evidently less arbitrary . . . than B, and B somewhat less than C” (redrawn from Simpson, 1961, Fig. 7).

Sneath and Sokal (1973) proposed that good classifications have three desirable properties: naturalness, ease of manipulation, and practicality for information retrieval. Their concept of naturalness, as already discussed, was based on the ideal of incorporating all codified information about very large numbers of characters. Manipulation related to the practical relationship between a classification and its degree of hierarchical structure (useful for memorizing) and its utility (e.g., for the construction of identification keys). Convenience for information retrieval was also viewed as important, but not if it conflicted with natural classification.

The typical product of systematization in numerical taxonomy is the phenogram. How is such a dendrogram to be turned into a summary classification of named groups? An early proposal was the use of phenon lines (Fig. 5). A phenon line cuts across the phenogram at a particular similarity level. The lines must be straight and are not allowed to “bend up and down according to . . . whim” (Sneath and Sokal, 1973), not only delimiting groups but also determining their rank. Thus, in Fig. 5, if OTUs 1–10 are species, the 80% phenon line could indicate seven subgenera, the 75% line four genera, the 65% line three subfamilies, and so on.
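The effect of a phenon line can be sketched in Python. Because the data behind Fig. 5 are not reproduced here, the example instead uses the small four-OTU phenogram described for Fig. 2b, with linkages at 75%, 62.5%, and 37.5% similarity; the function name `phenons` and the chosen cut levels are hypothetical.

```python
# Phenogram for OTUs A-D as (cluster, similarity level), in order of
# decreasing similarity; levels follow the description of Fig. 2b.
MERGES = [
    (frozenset("CD"), 0.75),
    (frozenset("AB"), 0.625),
    (frozenset("ABCD"), 0.375),
]


def phenons(otus, merges, level):
    """Groups obtained by a straight cut across the phenogram at `level`.

    Only linkages formed at or above the phenon line are kept. `merges`
    must be ordered from highest to lowest similarity so that larger
    clusters replace the smaller ones they contain.
    """
    groups = [frozenset([o]) for o in otus]
    for cluster, similarity in merges:
        if similarity >= level:
            groups = [g for g in groups if not g <= cluster] + [cluster]
    return sorted(sorted(g) for g in groups)


for line in (0.80, 0.70, 0.50, 0.30):
    print(line, phenons("ABCD", MERGES, line))
```

With these levels the 80% line leaves four separate OTUs, the 70% line yields three groups, the 50% line two, and the 30% line one. Nothing in the procedure indicates which of these lines should correspond to which rank, so the choice of level remains with the taxonomist.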

Such a procedure is not objective. The most fundamental problem is simply that the choices of percentage similarity levels and ranks are arbitrary, both within groups and between them. Other procedures, such as McNeill’s method based on the assignation of rank levels to each node, also involve arbitrary limits and adjustments. The difficulty for phenetics is that “there is no necessary structural relationship between the pattern of clusters in hyperspace and the inclusive Linnean hierarchy into which they are converted” (Panchen, 1992, p. 151). If this is so, then there can be no nonarbitrary method for converting a phenogram (or any other distance-based method of systematization, including DNA–DNA hybridization trees) into a ranked summary classification.

FIGURE 5 Phenetic classification. “The formation of phenons from a phenogram” (redrawn from Sneath and Sokal, 1973, Fig. 5.33).
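The dependence of a phenon-line classification on the chosen similarity levels can be made concrete with a minimal sketch. The phenogram below is a hypothetical six-OTU example (not the one in Fig. 5), encoded as nested tuples of the form (similarity, left, right); the function names `leaves` and `phenons` and all similarity values are assumptions introduced purely for illustration.

```python
# Hypothetical phenogram: internal nodes are (similarity, left, right),
# terminal nodes are OTU names. Values are invented for illustration.
PHENOGRAM = (
    0.60,
    (0.78, (0.90, "A", "B"), "C"),
    (0.70, (0.85, "D", "E"), "F"),
)


def leaves(node):
    """Return all OTUs below a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)


def phenons(node, level):
    """Cut the phenogram with a straight phenon line at `level`.

    Every cluster that joins at a similarity >= level becomes one phenon;
    clusters joining below the line are split into their parts.
    """
    if isinstance(node, str):
        return [[node]]
    similarity, left, right = node
    if similarity >= level:
        return [leaves(node)]
    return phenons(left, level) + phenons(right, level)


if __name__ == "__main__":
    for level in (0.80, 0.75, 0.65):
        print(level, phenons(PHENOGRAM, level))
```

In this example, moving the line from 80% to 75% to 65% changes the number of groups from four to three to two. Nothing in the phenogram itself indicates which level should correspond to which rank, which is the arbitrariness discussed above.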


Cladistic classification is based on the tenet that every putative monophyletic group can be named, but polyphyletic and paraphyletic groups should not. In a grand, top-down cladistic classification of all life, all coordinate monophyletic taxa (sister groups) could be placed at the same rank. Alternatively, taxa of the same geological age could be given the same rank (Hennig, 1966). Also, classifications could be formed bottom upwards by linking all terminal sister species starting with the lowest ranking pair or pairs and building up classes and ranks based on sister groups and ascending ranks held coordinate across the entire hierarchy. Such grand visions remain almost wholly impractical due to our very imperfect knowledge. Moreover, such systems would cause massive proliferation of named groups and ranks and be unstable in the face of most newly discovered taxa and every older geological find. In terms of producing a useful summary, cladistic classification thus faces potential problems of impracticality as well as arbitrariness.

In practice, when classifications are based on cladograms an attempt is usually made, within conventional ranks (e.g., order, family, and genus), to give expression to major monophyletic groups by naming inclusive taxa at intermediate levels. However, this usually means abandoning any equation of classificatory rank with clade level and any attempt to give all groups in the cladogram formal recognition. Thus, in a labeled cladogram of the main lineages of the order Lepidoptera (Fig. 6), not all putative monophyletic groups are named, and all terminals (despite being almost completely noncoordinate) are classified at the same superfamily rank except one of the cladistically most inferior groups, the crown group Ditrysia, which appears to retain its subordinal status accorded in earlier, precladistic classifications.

One partial solution to these difficulties is the “sequencing” convention, first proposed by Gary Nelson. This attempts to combine the practicalities of a limited Linnean hierarchy with a listing that enables the entire (cladistic) hierarchy to be recovered. A sequence of taxa named at the same rank indicates that the first is sister to all following at that rank, the next is sister to the remainder, and so on. Taxa of uncertain position can be annotated as incertae sedis and those comprising unresolved polytomies as sedis mutabilis. In presenting these sequences, comprehension is greatly facilitated by indentation. Thus, the cladogram for the major groups of Lepidoptera in Fig. 6 can be converted into a written classification (Table V).
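How a sequenced list allows the nested hierarchy to be recovered can be sketched in a few lines. The sketch assumes one simplified reading of the convention: taxa marked incertae sedis are set aside, a trailing run of taxa marked sedis mutabilis is treated as a single polytomy, and every remaining taxon is sister to everything listed after it. The function names `recover_hierarchy` and `to_newick` are hypothetical; the taxon names are those of the Lepidoptera classification discussed in the text.

```python
def recover_hierarchy(sequence):
    """Rebuild nested sister groups from a sequenced list of same-rank taxa.

    `sequence` is a list of (name, annotation) pairs, where annotation is
    None, "incertae sedis" or "sedis mutabilis". Returns (tree, uncertain):
    `tree` is a nested tuple and `uncertain` lists the incertae sedis taxa.
    """
    placed = [(n, a) for n, a in sequence if a != "incertae sedis"]
    uncertain = [n for n, a in sequence if a == "incertae sedis"]
    if not placed:
        return None, uncertain

    # A trailing run of sedis mutabilis taxa forms one unresolved polytomy.
    start = len(placed)
    while start > 0 and placed[start - 1][1] == "sedis mutabilis":
        start -= 1
    if start < len(placed):
        tail = tuple(n for n, _ in placed[start:])
        tree = tail if len(tail) > 1 else tail[0]
        rest = placed[:start]
    else:
        tree = placed[-1][0]
        rest = placed[:-1]

    # Working backwards, each earlier taxon is sister to all that follow.
    for name, _ in reversed(rest):
        tree = (name, tree)
    return tree, uncertain


def to_newick(tree):
    """Write a nested tuple as a parenthetical (Newick-style) string."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


SUBORDERS = [
    ("Zeugloptera", None), ("Aglossata", None), ("Heterobathmiina", None),
    ("Dacnonypha", None), ("Lophocoronina", "incertae sedis"),
    ("Neopseustina", None), ("Exoporia", None), ("Heteroneura", None),
]
HETERONEURA = [
    (name, "sedis mutabilis")
    for name in ("Incurvariioidea", "Nepticuloidea", "Palaephatoidea",
                 "Tischerioidea", "Ditrysia")
]

if __name__ == "__main__":
    tree, uncertain = recover_hierarchy(SUBORDERS)
    print(to_newick(tree), "| position uncertain:", uncertain)
    print(to_newick(recover_hierarchy(HETERONEURA)[0]))
```

The list of suborders yields a fully pectinate (comb-like) set of nested sister groups, whereas the five infraorders of Heteroneura, all marked sedis mutabilis, yield a single five-way polytomy.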

In conclusion, the idea that classification can simply be equated with systematization is a vestige of preevolutionary taxonomy and should be abandoned (Minelli, 1993, p. 14). In practice, reflecting a Simpsonian view, classifications are made consistent, as far as possible, with current systematization. Named para- and polyphyletic assemblages are rejected whenever strong evidence of nonmonophyly becomes apparent, because such groups are very misleading for comparative biology, biogeography, ecology, and extinction studies (although “dicots,” “fish,” “reptiles,” and “invertebrates” seem impossible to eliminate). Restraint is necessary to avoid unstable changes to the formal classification arising from partial or inconclusive evidence (careful use of consensus trees may be helpful here) and overelaborate nomenclature for the numerous hierarchical levels. With the real prospect of international consensus systems covering all major groups of organisms being developed on the Internet, a balance between stability and universally accessible periodic revision to reflect fundamental advances can be anticipated. Such IT-based bioinformatic systems should also solve most problems of information retrieval, including the ever-present difficulties of alternative classifications, synonymy, and misidentifications (the most pernicious of all classification problems affecting information retrieval).

FIGURE 6 Cladistic classification. Labeled cladogram of main lineages of the order Lepidoptera (slightly modified from Scoble, 1992, Fig. 185). In this classification, two of the major monophyletic groups recognized are not named (X), and the ranks of the terminals are noncoordinate. The dashed line to the Lophocoronoidea indicates that the relationships of this group are uncertain: They may form the sister group of the Exoporia, the Neolepidoptera, or the Myoglossata (Scoble, 1992, p. 203).

TABLE V Sequenced, Tabular Classification of the Lepidoptera, Reflecting Scoble’s Cladogram (Fig. 6)

Order Lepidoptera
    Suborder Zeugloptera
    Suborder Aglossata
    Suborder Heterobathmiina
    Suborder Dacnonypha
    Suborder Lophocoronina incertae sedis
    Suborder Neopseustina
    Suborder Exoporia
        Infraorder “Mnesarchaeoidea”
        Infraorder “Hepialoidea”
    Suborder Heteroneura
        Infraorder “Incurvariioidea” sedis mutabilis
        Infraorder “Nepticuloidea” sedis mutabilis
        Infraorder “Palaephatoidea” sedis mutabilis
        Infraorder “Tischerioidea” sedis mutabilis
        Infraorder Ditrysia sedis mutabilis

VII. CURRENT PRACTICE: VARIATIONS ON A CLADISTIC THEME

Numerical taxonomy was afflicted by a proliferation of clustering algorithms, each tending to give different results for which no logical criterion of choice was available. The vaunted objectivity of phenetics dissolved, leaving the field of systematization open to the apparently more consistent and decisive methods of phylogenetic systematics. However, matters are rarely this simple.

In the 1970s, a partial schism opened between those phylogeneticists who followed directly in Hennig’s footsteps, approaching character transformation from an evolutionary perspective, and the so-called “transformed cladists” (Patterson, 1982), who held that particular theories of evolution were unnecessary for cladistics, taking instead a “taxic approach” to cladistic analysis (Kitching et al., 1998). The cladists embraced computer algorithms, character matrices, and global parsimony, leaving polarization of characters to the algorithms and outgroup choice. For Hennig, the nodes in a cladogram were hypothetical ancestors; for the transformed cladists the nodes indicate the hierarchical pattern of homologous characters. These differences have emerged in debate over congruence versus “total evidence” methods, now treated formally as partitioned versus simultaneous data analysis. Because many cladists claim to have demonstrated the superiority of simultaneous analysis for making cladograms, this debate links in turn to a seemingly fundamental disagreement over whether or not to employ Fisher’s maximum likelihood statistics in phylogeny reconstruction, as first proposed by Edwards and Cavalli-Sforza and later implemented by Joseph Felsenstein as a general method (Huelsenbeck and Crandall, 1997).

The real source of these disagreements seems to be one of different scientific agendas: Those in favor of model-based likelihood methods and partitioned data sets are mainly seeking insights into evolutionary processes, most notably those affecting molecular evolution within different lineages or under different constraints. Those rejecting likelihood models in favor of global parsimony and simultaneous analysis of heterogeneous data sets are primarily concerned with tests of homology and the construction of cladograms (but see Wiens, 1999). The two approaches have different strengths and weaknesses, and both make simplifying assumptions in pursuing these different goals (Kitching et al., 1998, p. 165). Because most work affecting taxonomy requires use of morphological data, or a combination of morphological and molecular data, for which plausible process models cannot be elaborated for simultaneous analysis using likelihood, the significance of likelihood methods for taxonomy per se (rather than understanding phylogenetic processes; Huelsenbeck and Crandall, 1997) remains to be demonstrated.
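The parsimony criterion referred to here can be illustrated with a minimal sketch: each candidate cladogram is scored by the minimum number of character-state changes it requires across the whole matrix, and the shortest tree is preferred. The counting rule used below is Fitch's algorithm for unordered characters, chosen for brevity and not necessarily the procedure used in any particular study cited above. The four-taxon matrix, the two candidate trees, and the function names `fitch_steps` and `tree_length` are hypothetical.

```python
# Hypothetical taxon-by-character matrix (binary, unordered characters).
MATRIX = {
    "A": [1, 1, 0],
    "B": [1, 1, 0],
    "C": [0, 1, 1],
    "D": [0, 0, 1],
}

# Two candidate cladograms as nested pairs of taxon names.
TREE_1 = ((("A", "B"), "C"), "D")
TREE_2 = ((("A", "C"), "B"), "D")


def fitch_steps(tree, states):
    """Return (possible state set at this node, minimum steps below it)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    common = left_set & right_set
    if common:
        # The two branches can agree on a state: no extra change needed.
        return common, left_steps + right_steps
    # The branches disagree: at least one change is required here.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, matrix):
    """Total minimum number of changes over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_chars):
        states = {taxon: row[i] for taxon, row in matrix.items()}
        total += fitch_steps(tree, states)[1]
    return total


if __name__ == "__main__":
    for label, tree in (("tree 1", TREE_1), ("tree 2", TREE_2)):
        print(label, tree_length(tree, MATRIX))
```

For this invented matrix the first tree requires three changes and the second five, so the first would be preferred. All characters are analyzed together and given equal weight here, in the spirit of simultaneous analysis; no model of the evolutionary process is involved.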


However, with respect to the reconstruction of phylogeny there are several other methods, global parsimony and maximum likelihood being only the main contenders. These alternatives include James Lake’s rate-invariant method for molecular data and pairwise distances (e.g., the DNA–DNA hybridization method noted previously). Even within the parsimony techniques generally adopted for cladogram construction, at least one fundamentally different challenge to global parsimony has emerged: three-item statement analysis, first proposed by Gary Nelson and Norman Platnick. Three-item analysis discards any presupposition that characters undergo evolutionary transformation and, by decomposing a taxon-by-attribute matrix into an analysis of all possible three-taxon statements supportable by the data, focuses entirely on evidence for sister group relationships rather than character transformations. Although not widely acclaimed, this method is operational and often gives somewhat different results from those of conventional parsimony methods, thus raising many fundamental issues. For a brief but informative discussion of the current debate, see Kitching et al. (1998, Chapter 9).
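The decomposition step in three-item analysis can be sketched as follows. The sketch assumes binary characters in which state 1 is the informative (shared) state, so that each character in which taxa a and b have state 1 and taxon c has state 0 supports the statement "a and b are more closely related to each other than either is to c". It deliberately omits refinements of the published method, such as the weighting of non-independent statements, and the matrix and the function name `three_item_statements` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical taxon-by-attribute matrix (1 = attribute present).
MATRIX = {
    "A": [1, 1, 0],
    "B": [1, 1, 0],
    "C": [0, 1, 1],
    "D": [0, 0, 1],
}


def three_item_statements(matrix):
    """Count every three-taxon statement ((a, b), c) supported by the data.

    A statement ((a, b), c) is supported once for each character in which
    a and b share state 1 while c has state 0.
    """
    counts = Counter()
    n_chars = len(next(iter(matrix.values())))
    for i in range(n_chars):
        have = sorted(t for t, row in matrix.items() if row[i] == 1)
        lack = sorted(t for t, row in matrix.items() if row[i] == 0)
        for a, b in combinations(have, 2):
            for c in lack:
                counts[((a, b), c)] += 1
    return counts


if __name__ == "__main__":
    for (pair, outsider), n in sorted(three_item_statements(MATRIX).items()):
        print(f"({pair[0]},{pair[1]}){outsider}: supported {n} time(s)")
```

In this example the three characters yield seven statements in total, of which (A,B)D is supported twice. A tree would then be sought that accommodates as many of these statements as possible, with no reference to how any character transformed.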

VIII. CONCLUSIONS

Taxonomy involves the search for a general pattern of order among living and extinct organisms, from which a universal reference system (or classification) is derived. As such, taxonomy is the primary discipline of biodiversity. Fundamental philosophical disagreements about the nature of human knowledge have given rise to several basic methods. All recent approaches agree that systematics is empirical and should be based largely (evolutionary systematics and maximum likelihood molecular systematics) or entirely (numerical taxonomy and cladistics) on the comparison of data in the form of characters observed and abstracted from specimens. The methods differ somewhat with respect to the type of data preferred, and fundamentally with regard to methods of analysis used to reveal pattern, and how the system, once obtained, is translated into a written classification.

Cladistics has become the dominant method of taxonomy (where a preferred method is made explicit). The research program of cladistics is based on the view that the ideal general system should reflect the phylogenetic nexus. However, the true nexus cannot be known. If the precise course of evolution, taking account of all the relevant details of every lineage, is real but unknowable, then the natural system cannot be discovered but only approximated by an indirect process of estimation. Because cladistic classification can only admit monophyletic groups (based on the selection of a particular cladogram), all admissible groups are necessarily defined because they must have at least one unique (presumptively) homologous character (or a unique loss within a more inclusive group that is specified by some other homologous feature). Thus, as noted by both Rieppel and Panchen, the cladistic method is in practice essentialist because the search for the sister group depends on discovery of locally unique and thus defining shared characteristics for each and every group. It has long been understood that although evolution is not bound by parsimony, scientific method admits nothing better than parsimony as the criterion of choice among competing hypotheses. Insofar as the divergent model of evolutionary history which justifies searching for a hierarchical pattern is flawed (Doolittle, 1999), the methods of cladistics may need further modification, notably to deal more adequately with hybrid origins.

In conclusion, we may have to accept the seeming paradox that although the principle to which cladistics aspires is natural (i.e., the groups we seek to recognize exist regardless of human perception; Rieppel, 1988, p. 163), its empirical methods are inescapably essentialist. The basic method of taxonomy may be equivalent to the construction of a universal, never to be experienced directly but derived from the sensory experience of particulars (samples of individuals and characters) by means of an unending iterative sequence of analysis, hypothesis formation, testing, and reanalysis. This is a manifestation of the “two ways of seeing” explored by Rieppel (1988): The world of being owes its form to the processes of becoming, but those evolutionary processes can only be inferred indirectly from the patterns we observe in nature.

Finally, it is notable that current approaches to the measurement of biodiversity that seek to maximize the number of expressible genes held in networks of germplasm banks or functional ecosystems depend on models incorporating information about branching sequences (cladistic relationships) and branch lengths (anagenesis). In practice, the taxonomic hierarchy and the raw data matrix (before removal of homoplasy and autapomorphies) from which it has been derived will often be the best or only available data for use in such models. For a review of this field, see Crozier (1997).
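As a rough illustration of how branching sequence and branch length can be combined in such a model, the sketch below scores a set of taxa by the total length of the branches needed to connect them to the root of a tree. This is one simple measure in the spirit of phylogenetic diversity indices and is offered only as an assumption about what such a model might look like; the tree, the branch lengths, the decision to include the path to the root, and the function name `phylogenetic_diversity` are all hypothetical.

```python
# Hypothetical tree: each node maps to (its parent, length of the branch
# leading to it). The root has no entry. Branch lengths are invented.
PARENT = {
    "A": ("Y", 1),
    "B": ("Y", 1),
    "Y": ("X", 2),
    "C": ("X", 3),
    "X": ("root", 1),
    "D": ("root", 4),
}


def phylogenetic_diversity(parent, taxa):
    """Sum the lengths of all branches on the paths from `taxa` to the root.

    Each branch is counted once, however many of the chosen taxa share it.
    """
    used = set()
    for taxon in taxa:
        node = taxon
        # Climb towards the root until reaching it or an already counted branch.
        while node in parent and node not in used:
            used.add(node)
            node = parent[node][0]
    return sum(parent[node][1] for node in used)


if __name__ == "__main__":
    for subset in (["A", "B"], ["A", "D"], ["A", "B", "C", "D"]):
        print(subset, phylogenetic_diversity(PARENT, subset))
```

With these invented values the two close relatives A and B score 5, whereas the distantly related pair A and D scores 8, so a conservation scheme that could protect only two taxa would capture more of the tree by choosing the latter pair.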

See Also the Following Articles

CLADISTICS • CLADOGENESIS • DIFFERENTIATION • EVOLUTION, THEORY OF • GENES, DESCRIPTION OF • GENETIC DIVERSITY • NOMENCLATURE, SYSTEMS OF • NUCLEIC ACID BIODIVERSITY • PHENOTYPE, A HISTORICAL PERSPECTIVE • PHYLOGENETICS • SPECIATION, THEORIES OF • SPECIES, CONCEPTS OF

Bibliography

Berlin, B. (1992). Ethnobiological Classification: Principles of Categorization of Plants and Animals in Traditional Societies. Princeton Univ. Press, Princeton, NJ.

Crozier, R. H. (1997). Preserving the information content of species: Genetic diversity, phylogeny, and conservation worth. Annu. Rev. Ecol. Syst. 28, 243–268.

Doolittle, W. F. (1999). Phylogenetic classification and the universal tree. Science 284, 2124–2128.

Eldredge, N., and Cracraft, J. (1980). Phylogenetic Patterns and the Evolutionary Process. Columbia Univ. Press, New York.

Ghiselin, M. T. (1969). The Triumph of the Darwinian Method. Univ. of California Press, Berkeley. (1984 reprint by Univ. of Chicago Press, Chicago)

Gilmour, J. S. L. (1940). Taxonomy and philosophy. In The New Systematics (J. Huxley, Ed.), pp. 461–474. Oxford Univ. Press, Oxford.

Hennig, W. (1966). Phylogenetic Systematics. Univ. of Illinois Press, Urbana.

Huelsenbeck, J. P., and Crandall, K. A. (1997). Phylogeny estimation and hypothesis testing using maximum likelihood. Annu. Rev. Ecol. Syst. 28, 437–466.

Hull, D. L. (1965). The effect of essentialism on taxonomy: Two thousand years of stasis. Br. J. Philos. Sci. 15, 314–326; 16, 1–18.

Kitching, I. J., Forey, P. L., Humphries, C. J., and Williams, D. M. (1998). Cladistics, 2nd ed. Oxford Univ. Press, Oxford.

Margulis, L., and Sagan, D. (1984). Evolutionary origins of sex. Oxford Surv. Evol. Biol. 1, 16–47.

Mayr, E. (1963). Animal Species and Evolution. Harvard Univ. Press, Cambridge, MA.

Mayr, E. (1969). Principles of Systematic Zoology. McGraw-Hill, New York.

Mayr, E. (1974). Cladistic analysis or cladistic classification? Z. Zool. Syst. Evol. 12, 94–128.

Minelli, A. (1993). Biological Systematics. The State of the Art. Chapman & Hall, London.

Nelson, G. (1989). Species and taxa: Systematics and evolution. In Speciation and Its Consequences (D. Otte and J. A. Endler, Eds.), pp. 60–81. Sinauer, Sunderland, MA.

Panchen, A. L. (1992). Classification, Evolution, and the Nature of Biology. Cambridge Univ. Press, Cambridge, UK.

Patterson, C. (1980). Cladistics. Biologist 27, 234–240.

Patterson, C. (1982, April). Cladistics and classification. New Scientist 29, 303–306.

Pimentel, R. A., and Riggins, R. (1987). The nature of cladistic data. Cladistics 3, 201–209.

Rieppel, O. C. (1988). Fundamentals of Comparative Biology. Birkhäuser Verlag, Basel.

Scoble, M. (1992). The Lepidoptera. Oxford Univ. Press, Oxford.

Simpson, G. G. (1961). Principles of Animal Taxonomy. Columbia Univ. Press, New York.

Sneath, P. H. A., and Sokal, R. R. (1973). Numerical Taxonomy. The Principles and Practice of Numerical Classification. Freeman, San Francisco.

Strong, E. E., and Lipscomb, D. (1999). Character coding and inapplicable data. Cladistics 15, 327–362.

Wiens, J. J. (1999). Polymorphism in systematics and comparative biology. Annu. Rev. Ecol. Syst. 30, 327–362.