Parsimony, Hierarchy, and Biological Implications

Ronald H. Brady

Introduction

When constructing a cladogram from a data set a researcher will, as a matter of course, select the shortest possible branching diagram that will represent the data, thus minimizing the number of appearances of any character on the cladogram. This “parsimony program” is usually adopted without comment, but when arguments defending or questioning its use are put forward, they almost universally discuss the program in terms of its biological implications — i.e., whether assumptions about the order of nature are involved. It is my purpose, in the following discussion, to demonstrate that the parsimony program as described has no biological implications but is simply a methodological demand of a cladogram. The implications so many writers find here are actually derived from the hierarchical form of the cladogram itself, and the nature of these implications can be properly understood only when their origin is identified.


Hierarchical Order

The notion of hierarchical classification in biology is as old as Aristotle (it may be implied in Plato) and has been standard operating procedure since Linnaeus. The nature of such order is best studied, however, through the interrelation of concepts in general, which through their own inclusions and exclusions evolve into a system of levels polarized between the more inclusive and more exclusive — the general and specific. The polarity mentioned arises from the relation of sameness and difference in conceptual distinctions, values which are never postulated absolutely but only by degrees. A rational distinction between any two entities is also a relation, and implies sameness on a more general level. The chair is not the cat, nor even closely related, but both are extended bodies. By moving upward (toward the more general) we can always discover an inclusive predicate. On the other hand, by moving downward (toward the more specific), we are able to discover the structure of difference — the chair is not a natural object, not an organism … etc. The two directions, toward the more inclusive and the more exclusive, give us the two sets of terms connected with any concept — i.e., the intension and extension.

If we begin with the concept “vertebrate,” for example, we find that it contains the predicates “animal,” “organism,” “extended body” (among others), and that it can be predicated to “mammals,” “marsupials,” “dogs,” “kangaroos,” and so on. Its own predicates are termed its intensions, those qualities comprehended by the term, and the things it may be predicated to, its extensions. The two sets are inversely related — as the one increases the other must decrease. Thus if I desire to classify any specific object — this typewriter, for instance — I must not only define the model of machine but also find means of specifying this individual one, probably by its space-time coordinates. If I succeed, I shall reduce the extension of my classification to one, but the list of intensions will necessarily be very long. Were I to settle for the kind of typewriter, I could stop at the make and model and the list of intensions would shorten, but the extensions would now include every machine made under that model number. For every intension dropped as I move upward I pick up a new group of extensions. Were I to seek the widest possible predicate, say being, I could reduce the intensions to one, but now the extensions would be potentially infinite.
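The inverse relation between intension and extension can be made concrete with a small sketch. The toy hierarchy below is an assumption assembled from the terms used above, and the names PARENT, intensions, and extensions are hypothetical, chosen only for illustration.

```python
# Hypothetical toy hierarchy: each term maps to its immediate superordinate.
PARENT = {
    "organism": "extended body",
    "animal": "organism",
    "vertebrate": "animal",
    "mammal": "vertebrate",
    "marsupial": "mammal",
    "dog": "mammal",
    "kangaroo": "marsupial",
}


def intensions(term):
    """Terms above `term`: the predicates it comprehends."""
    found = []
    while term in PARENT:
        term = PARENT[term]
        found.append(term)
    return found


def extensions(term):
    """Terms below `term`: everything it may be predicated of."""
    return sorted(t for t in PARENT if term in intensions(t))


print(intensions("vertebrate"))  # ['animal', 'organism', 'extended body']
print(extensions("vertebrate"))  # ['dog', 'kangaroo', 'mammal', 'marsupial']

# Walking upward from "kangaroo", the intensions shrink as the extensions grow.
term = "kangaroo"
while True:
    print(term, len(intensions(term)), len(extensions(term)))
    if term not in PARENT:
        break
    term = PARENT[term]
```

In this sketch every step upward drops exactly one intension and gains at least one extension, which is the inverse relation described above. The sketch also shows the uniqueness of position: each term occurs once as a key, so it has exactly one place in the hierarchy.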

It is obvious, at this point, that each term within a hierarchy has a unique position, and can appear but once. If we enter the pattern at any level, perhaps at “vertebrate” again, we shall find, looking upward, only the intensions comprehended by that term, none of which can be identical with it, and looking down, only the extensions, which must also be distinct. It is the same for any term in a hierarchy, including the pole positions of all inclusive intension and individual extension, which are both singular and therefore unique. After all, the rule of coordination demands that the commonality of any two terms found to share a property be abstracted and superordinated over them, representing their sameness as membership in a group defined by that property. All sameness within multiplicity is thereby represented as group membership, and each term in the hierarchy will represent either a group or a terminal point (individual extension), both functions being singular. In this aspect, of course, resides the power of hierarchical order to produce an exact classification.


The Cladistic Hypothesis: Part One

A cladogram is a schema of hierarchical relations which coordinates the data assigned to it in nested synapomorphies — i.e., by subordination of the less general characteristics to the more general ones. The schema is hypothetical only, a conceptual product brought forward by the researcher in order to represent the data in hierarchical form. (It does not neutrally “represent” data but adds the hypothesis of hierarchical order.) It contains positions defined by the super- and subordinate relations, and comes to represent a particular data set when the characters of that set are assigned to the positions of the schema.

Because it is hypothetical, the cladogram may fail to bring the data set assigned to it into hierarchical form, or in other terms, it may fall short of “perfect fit” (congruence). In practice, perfect fit is manifested when each character in the data set appears but once in the cladogram. When a character appears more than once, however, it violates the demand of the hierarchical form that each position be unique. If the data set cannot be perfectly fit, all cladograms will contain such violations, and the researcher will select the best cladogram by a parsimony criterion designed to minimize the violations.
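The notion of fit can be illustrated with a minimal sketch. It assumes binary characters and cladograms written as nested pairs of taxon names, and it uses Fitch's counting procedure, a standard algorithm that is not named in this essay, to find the least number of changes a character requires on a given schema. The taxa, characters, and function names are invented for the example. A character that needs one change appears once; every additional change is counted as a violation, whether it is read as a parallelism or a reversal.

```python
def fitch_steps(tree, derived_taxa):
    """Minimum number of state changes for one binary character on `tree`.

    `tree` is a taxon name or a pair of subtrees; `derived_taxa` is the set
    of taxa that show the character.
    """
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):
            return {1} if node in derived_taxa else {0}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common
        steps += 1  # the two subtrees disagree: one change is required here
        return left | right

    visit(tree)
    return steps


def extra_steps(tree, characters):
    """Appearances beyond the first, summed over all characters."""
    return sum(fitch_steps(tree, taxa) - 1 for taxa in characters.values())


# Invented data set: each character is listed with the taxa that show it.
characters = {
    "c1": {"A", "B"},
    "c2": {"A", "B", "C"},
    "c3": {"A", "C"},
    "c4": {"A", "B"},
}

# Three candidate schemas for the same four taxa.
candidates = [
    ((("A", "B"), "C"), "D"),
    ((("A", "C"), "B"), "D"),
    ((("B", "C"), "A"), "D"),
]

for tree in candidates:
    print(tree, extra_steps(tree, characters))

best = min(candidates, key=lambda tree: extra_steps(tree, characters))
print(best)  # ((('A', 'B'), 'C'), 'D')
```

On this invented data the three schemas carry one, two, and three violations respectively, so the first is selected. No schema reaches perfect fit, because c3 cannot be placed on one branch together with c1 and c4. The count is only meaningful for characters that are present in some taxa and absent in others.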

The formal analysis of the concept of fit or congruence will continue in the next section. For the moment it is necessary to prepare for that analysis by examining the theoretical background of the cladogram in greater detail. I pointed out above that a cladogram is a hypothesis of hierarchy. Individual cladograms are tested by measuring the degree to which they fit the data, but such testing can be done only in the context of two further hypotheses which are not themselves in question during the testing process.

The first of these is the expectation that organic homologies can be ordered hierarchically. Once the decision to use cladograms has been made, the question is not whether the data set can be represented in hierarchical form but which hierarchy best represents it. That the cladist is not asking the former question (whether the data can be so represented) can be seen from the response to incongruent data sets. Since such a set allows no perfect fit it cannot itself possess a hierarchical form, and thus could be a potential falsifier for the hypothesis that the data could be represented hierarchically. The fact that it is not treated in this manner shows that the question to which this falsification could be applied has already been answered. The understanding that hierarchical order can represent biological data is already tacitly contained in the decision to use cladistic analysis (a point that will show far-reaching implications upon further examination).

The second hypothesis that must be in place is the data set itself — that is, the decision to let the data set represent the taxa. It may not do so, of course, for erroneous judgment is certainly possible when identifying characters. The researcher may personally have serious misgivings about any particular data set, and may at any point return to the organisms to check his or her judgment. But when the stage of cladogram construction is reached, the sphere of investigation narrows to the fit between the cladograms and the data; both the hierarchical form and the data set to be so represented must be taken as given or there is no determinate problem. Once this is done, one can test for the best-fitting cladogram.


The Parsimony Program

The problem has now narrowed to the testing of cladistic hypotheses, a procedure completely bounded by a given data set and the hierarchical schemas brought forward to represent it. At this point the source of the data has no more bearing on the problem. The data set, once given and therefore fixed for hierarchical representation, could be derived from linguistics as well as biology, since the problem belongs neither to biology nor to linguistics but to hierarchy theory. The task at hand is the representation of that data set in hierarchical form — the subordination of the less general terms or characters to the more general.

In order to see the dynamic clearly, the reader should imagine that all operations are performed on data sets that represent nothing beyond themselves but have been invented simply for the purpose of providing exercises in hierarchical representation (this approximates the condition that the data set be a given parameter). If any particular data set allows a schema (cladogram) to be perfectly fit to it, the representation in hierarchical form is total. If, on the other hand, an individual data set falls short of this and allows no perfect fit, the representation in hierarchical form will be less than total. That is, since by fit we mean the coordination of all the data in hierarchical form, lack of fit will consist in the appearance of the same character more than once in the schema, which is a violation of the hierarchical form (by the demand of uniqueness discussed above). As these violations accumulate, the fit between the data set and the form must diminish.

When we shift our focus from the schema to the data set, the source of the violations becomes visible. All violations of the form stem from contradictory characters — i.e., characters which place the same entity in two or more mutually exclusive groups. The claim of each character distribution is, of course, that the entity belongs here and nowhere else. Obviously, only one of these claims can be correct. Thus, when we construct the hierarchy we must choose the preferred claim (character distribution) and reject the others. The rejected character distribution will now reappear in the form of the violation above — i.e., the same character will appear in multiple positions (its claim to group definition being rejected). Because the hierarchical form is the only value guiding our choice, the “preferred” claim must be the one that creates the fewest violations of the form, or best fit, since any other choice would carry with it the penalty of departing further from the most complete hierarchical structure to be found within the data. This operation, by which the selection of the least contradictory schema is carried out, is the parsimony program.
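The source of the violations can be shown directly on the data set, before any schema is drawn. The sketch below rests on an assumption the text does not state in this form: each character is treated as a claim that the taxa showing it form a group, and two such claims can both be honored in one hierarchy only if the groups are nested or disjoint. The data and the function names are invented for illustration.

```python
from itertools import combinations


def conflicts(group_a, group_b):
    """True when two groups overlap without one containing the other."""
    a, b = set(group_a), set(group_b)
    return bool(a & b) and not (a <= b or b <= a)


def contradictory_pairs(characters):
    """All pairs of characters whose group claims are mutually exclusive."""
    return [
        (x, y)
        for x, y in combinations(sorted(characters), 2)
        if conflicts(characters[x], characters[y])
    ]


# Invented data set: each character is listed with the taxa that show it.
characters = {
    "c1": {"A", "B"},
    "c2": {"A", "B", "C"},
    "c3": {"A", "C"},
}

print(contradictory_pairs(characters))  # [('c1', 'c3')]
```

Here c1 places taxon A with B and c3 places it with C, so only one of the two claims can define a group. Whichever is rejected must then appear more than once on the schema. An empty result would mean that the data set allows a perfect fit under the stated assumption.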

The conclusions above follow from the purely formal relations that exist between a data set and a hierarchy schema. The choice of the least contradictory schema follows from the demand of best hierarchical representation — no other consideration being necessary. And because it is merely formal, this same demand will reappear whenever hierarchical schemas are fit to data sets, whatever the field of application.


The Cladistic Hypothesis: Part Two

We are now in a position to trace the more interesting biological implications of the cladogram, an exercise that will eventually lead to considerations of evolutionary theory. Any such examination must begin, however, by pointing out that the parsimony program, as analyzed above, cannot in itself give rise to any implications whatever.

This argument is now quite simply made. The parsimony operation is a purely formal one solving for the best fitting hierarchy schema. Given that (1) we intend to represent the data set hierarchically, and (2) we accept a particular data set, the selection of the schema which represents the set with the fewest violations of hierarchical form is merely logical, and we have no choice in the matter. But since what follows from a given by formal identity (logical necessity) must be already contained within that given, any and all biological implications must be derived from the two parameters mentioned. The parsimony operation adds absolutely nothing to what we have already accepted through the demand of hierarchical representation and the acceptance of a data set. The resultant cladogram is merely the explicit display of the relations that follow from these two decisions.

Because this is the case, the merely formal exercise of hierarchy theory by which a schema is fit to a data set becomes a biological investigation through the same decisions, and the cladogram must be interpreted in that light. It is just at this point that confusion may enter, for if the import of hierarchical representation is not recognized, the parsimony program will seem to be responsible for substantive assumptions. Take, for instance, the notion that the program seems to forbid acceptance of parallelisms, reducing them to a minimum.

The proscription of parallelism is indeed carried out by the parsimony program, but does not originate with it. Parallelisms, when displayed on the cladogram, will represent violations of the hierarchical form — i.e., they are, minimally, “independent” developments of “the same” character which become contradictory adjectives when interpreted within a hierarchical schema. By “independent,” schematically, we mean belonging to different branches (for which branches the character in question is not plesiomorphic); but by “the same,” we imply, at least for hierarchical display, a single branch — for each position on the schema is unique, and all defining elements should appear but once. Thus it is the hierarchical form that proscribes parallelism, and the parsimony program simply carries out the proscription by reducing these violations to a minimum.

Understood in this manner, the cladistic hypothesis does not forbid parallel development, but treats it as unlikely because of the demand of hierarchical representation. But the interpretation can be taken further. The decision to represent taxa hierarchically constitutes, of course, a prediction that they can be so represented. This is an empirical claim, although merely one of pattern, and could be considered falsified by excessive violations of the form. Obviously cladists consider the claim successful, or they would abandon the approach. Once we make the claim, however, that the morphological similarity and difference will allow hierarchical coordination, we imply the existence of a cause sufficient to bring it about. And the operation of this cause, assuming ceteris paribus (i.e., that there are no interfering parameters, which assumption is added for the sake of making a prediction at all), will forbid parallelisms. Characters that violate the hierarchy will be primarily indications of errors in character identification, and only secondarily evidence of parallelism (or reversal — the argument being the same), for we need not consider our ceteris paribus assumption falsified until some further evidence turns up — i.e., until other evidence corroborates the character identification or the pattern of violations, thus indicating that an actual interfering parameter is disturbing the hierarchical pattern.

The hierarchical pattern under discussion is a pattern evident in the phenotype and implies that the process of evolution gives rise, within the formation of homologies, to stable discontinuities coordinated hierarchically. The causal implications of the claim involve not only the origin of pattern but the fixation or stabilization of its defining elements. Both points are needed to form an expectation of hierarchy because the assumption that the origin of discontinuities has been hierarchical does not demand that we could now recover that hierarchy.

Let us assume, for example, that discontinuities originate in a hierarchical fashion. Had we a record of origin, the record would evidence hierarchical coordination. But if we also assume that the same record will be recoverable, we must assume as well either that no losses of defining characters take place over time or that if such losses do take place they do not do so at a rate great enough to destroy the pattern. Generalizing, we may say that the rate of phenotype change cannot be greater than the rate by which defining characters are fixed, otherwise the fixation of characters could not record the changes.

Because taxonomists have had some success in recovering hierarchical pattern from the phenotype, it would seem that the expectation that homologies could be so coordinated has turned out to be viable. For the cladist, the hypothesis of hierarchy has become an empirical claim, and the causal implications that follow upon it cannot be looked upon as artifacts of method, but necessary derivations from a presumably successful claim. What this may mean can be seen more clearly when contrasted with another approach.


Pattern

In the beginning of his Animal Species and Evolution, Ernst Mayr (1963) developed his notion of a “biological species concept” in opposition to what he termed the “typological” concept. The latter, he argued, treated species as “a class of objects,” the members of which were included because they “agree with the diagnosis,” and the “essential properties” of that diagnosis were morphological. His own concept, on the other hand, relegated the morphological evidence to a secondary status. Nature does contain lawful discontinuities, but these are reproductive discontinuities and only by reproductive isolation can a species be identified:

The word “species” indicates a relationship, like the word “brother.” Being a brother is not an inherent property of an individual, as hardness is the property of a stone. An individual is a brother only with respect to someone else. A population is a species only with respect to other populations. To be a different species is not a matter of difference but of distinctness.

Since the morphological pattern in itself will not always reflect reproductive isolation, the discontinuities of this pattern cannot be identified as species distinctions without further interpretation.

Mayr admits, however, that the pre-Darwinian, and even the Darwinian, concepts of species were largely morphological: “Even Darwin, who was more responsible than anyone else for the introduction of population thinking into biology, often slipped back into typological thinking, for instance in his discussions on varieties and species.” But this mistake, Mayr implies, contributed to another mistake, for Darwin shared “the general assumption of the period” that either species are permanent and cannot originate from one another, or that they do so originate and therefore are but arbitrary distinctions. Thinking of species distinctions as purely morphological, Darwin saw no reason to modify these assumptions.

The implied argument becomes clear upon looking into the chapter on morphology in the Origin. Homology is discussed there in terms of continuity alone, and this turns out to be sufficient for the purposes of the author. Darwin wanted to establish transformation between the distinct forms found in nature. He imagined this continuity in terms of “infinitely numerous modifications” — that is, as near to perfect continuity, or motion, as he could. His argument explains observed discontinuity in terms of channeled transformation rather than stability, and translates the differences into different channels of continuous change. Since this change may take place at any pace, species may seem stable because the pace is so slow, but no further account of this stability is required, and Darwin never seemed to notice that “infinitely numerous modifications” is not a good model for character transformation.

Having established that Darwin himself did not have an adequate species concept, Mayr now lays the groundwork for his own in pointing out that neither modern taxonomists nor New Guinea primitives can miss the pattern of species differentiation between sympatric populations: “The striking discontinuity between sympatric populations is the basis of the species concept in biology.” But if this is the only situation in which the judgment is perfectly clear and common to all, a criterion over and above the morphology must be necessary to move beyond sympatry. This may be found in reproductive isolation, which in the case of sympatric populations is directly reflected in the morphology, but is often obscured by the same evidence elsewhere. Yet even the typologist tacitly admits the standard: “No matter how different a morphological variant may be, as soon as it is revealed as a member of the same breeding population … he sinks it into synonymy.” There can be, then, little ground for disagreement.

Comparing Rosen (1979) to Mayr, it would seem that Mayr has understated the possibilities of opposition. Rosen writes:

Within the framework of the “biological species” concept, zones of secondary intergradation (hybridization) in nature between recognizably different natural populations have been taken by some taxonomists as prima facie evidence that the two populations represent only a single species (Mayr, 1969, p. 195). In the upper parts of the Rio Lacantun and Rio Salinas drainages along the foothills of the Guatemalan Sierras, there appears to be a zone of intergradation between distinct forms within both Xiphophorus and Heterandria. In cladograms of relationships for the Middle American representatives of each group … the alleged intergradation occurs between forms separated in their respective cladograms by two or three branch points.

One might use the biological species concept, continues the author, to conclude that the forms on the intervening branch points are all variants of one species, but this strategy gives rise to a contradiction.

In the case of Xiphophorus, laboratory evidence has established that one of the cladistically intervening species, X. signum, is reproductively incompatible with its closest relative, which in turn is reproductively compatible in nature with a fish only distantly related to both. This leaves but two choices: either the morphology is misleading or the biological species concept is inapplicable. Rosen chooses the latter:

within the history of any lineage, reproductive compatibility is an attribute of the members of the ancestral species of that lineage, an attribute which is gradually diminished and ultimately lost in its descendants during geographic differentiation. In other words, reproductive compatibility is a primitive attribute for the members of a lineage and has, therefore, no power to specify relationship within a genealogical framework.

This choice has its immediate empirical basis in the fact that Rosen finds a clear pattern in the morphology, which should not be overruled for purely theoretical reasons. After all, Mayr had suggested that in many cases no such pattern could be recovered, and used this claim to argue the necessity of an overriding criterion. Cladists do not seem to experience the difficulties Mayr predicts, and one may wonder just how much clear morphological distinction has been sunk in synonymy by an application of the reproductive criterion. In this case it is obvious that the criterion would demote the morphological pattern to noise, shifting the focus of the taxonomist to an entirely different aspect. If Rosen had been one of the faithful, he would have missed the extension of the cladistic pattern discovered when, in the latter part of his monograph, he compares the morphological pattern to the pattern of geographic distribution within the two groups. The congruence that shows up here is an empirical confirmation of the original pattern, and certainly indicates some sort of causal significance. And this is the point of his approach.

The empirical evidence of pattern has, for Rosen, the position that difference had for Darwin — in fact, the pattern is the difference, but seen as a forest rather than the trees. Transformation — the sort of difference Darwin was looking at — had in itself causal implications, but a unified pattern of transformations has more, and thus if the latter is available we must prefer it to the mere fact of transformation. Mayr recognized this, since the pattern of discontinuity within the morphology of sympatric groups is used to establish his disagreement with Darwin, but he quickly forgets it again when he uses the resulting theory to throw out pattern in other groups. He has not, evidently, given full recognition to his starting point. Rosen, by contrast, is very clear that the hierarchical order of transformations (homologies) is the primary datum that constitutes his starting point and guides his investigation.

The primary datum in the sense used above is an observational result (I term it this since it may combine several “observations”) which becomes the central fact to be explained, and thus the starting point of investigation. Pattern becomes such a datum because pattern is the expression of law, i.e., is the organization of data according to a particular idea or rule. Evolutionary theory, at least for Darwin, seems to have been a response to the investigation of common plans in comparative anatomy. The distinction between any two members of one plan had to be conceived in terms of two different formulations of the same thing (plan), or transformation. As this transformation began to look more and more exact, the necessity apparent in the transformation demanded an explanation, for it was clearly the expression of law. But unfortunately, Darwin responded by fixing the primary datum as transformation, and developing a speculative account of processes by which it might be explained. He never saw the full import of hierarchy because once he had begun to think in terms of process he was no longer looking for an expansion of pattern.

The discovery of hierarchical order within the above transformations does not invalidate any of the original observations but simply expands them by bringing them into a new relationship. We now look at the pattern of homologies writ large, but because it is still the same pattern, we are still dealing with the same law. Obviously, the expansion has been informative, and our grasp of the law expressed through the pattern has become more specific. (We are able to reject, for example, all speculations that do not account for the fixation of defining characters of the hierarchy.) This mode of specification — by an extension of the original pattern — remains entirely within the realm of empirical investigation, increasing our knowledge of the problem to be investigated rather than adding to the list of speculative solutions. The biologist for whom such speculation is the chief pastime of science may fault this approach for a sin of omission, but once we decide that we have a pattern to explain, it would be rather difficult to argue that we need not come to know it better.

Even so, the most recent expansion of the primary datum of phylogenetic theory comes from the observations of vicariance biogeography (Croizat, 1964). Croizat’s notion that the earth and life have evolved together was his theoretical response to a pattern which unified the two. The use of cladograms to investigate that pattern is but an addition of methodology adequate to the task. The additional specifications of causal implication that may eventually result from this step are, for the moment, largely unknown, but the expansion has already enlarged the pattern under examination by a quantum leap.

Upon reflection, the use of speculative mechanisms as a tool of investigation seems to necessitate a certain compromise of observational freedom. The investigator looking for preconceived relations is unlikely to detect new ones. The invisibility of vicariance patterns to a mind fixed upon dispersal is a recent and telling example. Croizat had to return to the study of pattern per se, prior to the formulation of a theory of process, before he could see the relationships (common tracks) which are the basis of vicariance theory. The demand that an observer must be emancipated from speculations of process in order to detect pattern, however, suggests a methodological opposition between two forms of science.


Observation and Theory

Most scientists and philosophers of science will readily admit that “observation is theory-laden,” but the claim has become so ubiquitous that it may not occasion much thought. I mean that it may not lead to any reflection on how observation becomes theory-laden, and therefore on the different forms this loading may take. If such reflection is not forthcoming, however, the opposition discovered above cannot be clearly understood, for it arises from the alternative ways in which theory may be present in observation.

The notion that observation must be innocent of theory is probably a legacy, at least for English speakers, of the work of Sir Francis Bacon, whose version of science might be termed “naive inductionism” — i.e., the assumption that one may gather data neutrally and then find a theory by inductive generalization on that data. Darwin wrote within an intellectual climate that accepted this fiction, and was forced to cast his rhetoric in this form, claiming in his Autobiography that he “worked on true Baconian principles and without any theory collected facts on a wholesale scale.” But as David Hull (1973; for all Darwin quotations see pp. 8-9) has demonstrated, Darwin knew better. He wrote to a friend in 1861:

About thirty years ago there was much talk that geologists ought only to observe and not to theorize; and I well remember someone saying that at this rate a man might as well go into a gravel-pit and count the pebbles and describe the colours. How odd it is that anyone should not see that all observation must be for or against some view if it is to be of any service!

Observation must be guided by something or it becomes a useless ramble. Only textbook writers still seem to miss this lesson, and by now the Baconian investigator is a straw man whom no one defends. Some find, however, that naive inductionism is a useful accusation to aim at others.

When Hull discusses the intellectual background of Darwin’s period, he remarks

that one of the most prevalent confusions in the work of even the best scientists and philosophers was between the temporal order of an actual scientific investigation and the logical order of a reconstruction of scientific method. According to the empiricist epistemology, all knowledge has its foundation in experience. This tenet was mistakenly taken to imply that all scientific investigation has to begin with observation. The true inductive scientist began collecting data indiscriminately, with no preconceived idea, and gradually evolved broader and broader generalizations.

From such an analysis, one might easily mount an accusation of Baconian naiveté against any attempt to begin investigation from observation without a preexistent theory, for the inductivist mistake is made to consist of just this intention. By this analysis, of course, Croizat’s work is either misrepresented above or is a happy, but unrepeatable, accident. We cannot begin without a speculative bias to guide our observations.

But compare Edmund Husserl (1962) on the same subject. Dealing with the “mistake” inherent in the nineteenth-century empiricist epistemology, Husserl writes:

The fundamental defect of the empiricist argument lies in this, that the basic requirement of a return to the “facts themselves” is identified or confused with the requirement that all knowledge shall be grounded in experience.

By experience, Husserl explains in a longer discussion, we mean something essentially different from thought. The mistake, he argues, is not that one supposes that investigation must begin from observation, but that observation is supposed to be something different from theory. In contrast to this, Husserl argues that to be “without bias” should not mean “without theory,” but only “without preconceived theory,” and that the “facts themselves” will contain theory if we have found the correct facts. “Genuine science,” he concludes, is distinguished by “a genuine absence of bias.” We must discover the beginning point of investigation in an act of observation free from prior theoretical speculation.

That Husserl is not a naive inductionist can be seen from his long critique of that position. But that he does not recommend the sort of science followed by Darwin or Mayr, or recommended by Hull, is equally clear. If we examine the opposition in a bit more detail, the theoretical basis of cladistics as an empiricism will become apparent, for it is an example of the approach Husserl is defending.

The “theory-laden” quality of observation is, for Hull, the effect of the theoretical context in which the observation is made. An experiment, for example, is only conceivable on the basis of a preexistent theory, for without a theoretical framework how can we have expectations to test? And if a preconceived theory of process is necessary for the construction of an experiment, is not the same argument generalizable for all empirical observation? Must we not have a reason for looking where we are looking, for recording this fact but ignoring that one? Of course, the answer is yes — but Hull sets up a false dichotomy when he supposes that the only alternatives here are naive inductionism and his formulation of a hypothetico-deductive approach. The dichotomy is maintained through the supposition that thought is not contained in observation, but must be added to it (by a preconceived hypothesis). If we accept this supposition, the only way the investigator may come to any theoretical content is by a speculative addition to observation.

Against this consider the way in which Newton’s principle of inertia is already contained in the pattern of behavior of inertial objects — of which billiard balls form a very clear example. Even the casual observer of the behavior of billiard balls cannot help but notice a certain regularity in this — i.e., it becomes obvious that if we begin with the balls at rest, no movement is seen except in conjunction with a prior movement of another object (ball or cue stick). This constant relationship within a multiplicity of different events is conceptually detected — one must think it in order to notice it, since a relationship is not a sense percept but a concept — yet we have the sense that we have seen it in the behavior of the objects before us. It is indeed a sameness that we notice through observation, even though it is at the same time the appearance of law. When formulated it becomes a specific appearance of the principle of inertia: no change in the state of motion or rest is possible except through the addition of an impressed force. This principle is simply the verbal formulation of the law detected within the pattern of appearances, the concept by which things look “inertial” to us. (There have been many complaints about the verbal formulation, which is only to be expected. Husserl points out that the sort of concept that we find in experience does not fully give itself to verbal definition. We cannot understand Newton’s formulation without knowing the appearances to which it refers, and by which it must be qualified.)

Once a lawful regularity is seen in nature, we are naturally attracted to that observation as a starting point of investigation. Indeed, we would probably not attempt to discover law within appearances at all did not those very appearances continually reveal the imprint of pattern, and therefore governing law.

Of course, only certain observations contain such a starting point, and without these the problem for investigation, if we are to have a problem, must be manufactured by a speculative hypothesis. Thus, in the Darwinian and neo-Darwinian approaches, while the general question may be the origin of observed pattern, the research program does not investigate this directly but confines itself to a problem-solving activity consisting of nothing more than the relation of local hypotheses to the framework hypothesis. The question becomes: “If I assume that my hypothetical framework is true, how do I explain this observation — what local hypothesis should I offer for test?” (For a full analysis of this format, see Michod, 1981.) The question of the research program that Husserl is describing is rather: “If I am grasping this regularity correctly, what observations should reveal more of the pattern?” When these observations are made, they corroborate or call into question, not our local hypothesis, but our framework question: “What is the law that I am seeing?” The problem is not the relations between framework hypothesis and local hypothesis, but the relation between the idea in the mind and the lawfulness observed in nature. We do not care to investigate the structure of our theory, but rather that of the world.

If this empiricist approach is understood correctly (I call it this to differentiate it from the manifestly speculative approach to which I have compared it), the problem becomes: “What is the emergent law?” or, from the point of view of the investigator, “How do I think the relation I have already begun to see?” Because the emergent relation is successively revealed as the pattern is successively revealed, hypothetical extension of the relation must continually be tested by actual extensions — i.e., we test our conceptual extension of the pattern by actual observations. Real progress can be measured only by the extension of pattern in observation. The history of phylogenetic theory, for empiricist science, consists in the movement from the primary datum of transformation (homology within a common plan) to that of the same transformation seen in hierarchical form, and then to the same transformation seen reflected in the transformations of the surface of the earth. Because the problem was always the emergent law, each expansion of pattern is also part of the solution, for it represents a progressive emergence. Obviously, the law is complete only when prediction is complete. We have no difficulty seeing that billiard balls, at least on the level of unaided observation, present no surprises — our grasp of law is complete. When our grasp of phylogenetic law reaches the same development, we shall detect completion through the same clarity. All this takes place without the introduction of speculations of mechanism or process. If this comes as a surprise, it can only be that the reader for whom this is true has conceived a different problem. If we investigate emergent law, we have no use for such speculation. If, on the other hand, we make the problem one of process, we have formulated another question.


Conclusion

The hierarchical pattern detected by cladists cannot be treated as a mere assertion of a point of view — too many workers from too many different theoretical persuasions have recovered it. But once this is realized, the detection of “biological implications” that follow can hardly be accepted as a criticism. We are now in a position, in fact, to see these implications as the central strength of taxonomy.

Let me review the stages of argument by which this discussion has progressed. I began with a consideration of the parsimony program, arguing that, as a purely formal operation, its only biological content was that contained in the demand of hierarchical coordination and the particular data set. The biological implications now pass to the two items mentioned. The acceptance of a particular data set is dependent upon investigations that I do not discuss here, but once it has been accepted it becomes a given for hierarchical coordination, and the remaining biological content rests with the notion of hierarchy. But this idea, which clearly contains causal implications, is itself an empirical claim — that is, we propose that homologies are hierarchical, and we need only recover that order through cladistic treatment or ontogeny to “observe” it in nature. The sense of “observation” here is hardly an immediate one, but it constitutes an empirical claim nevertheless, and could be falsified by a failure to find hierarchical order. At this point, causal implications which some mistakenly attribute to the parsimony program turn out to be the implications inherent in a primary datum: the order of homologies observed in nature. This is an observation that provides a starting point because it implies law, and is thus the ground of taxonomy.

That starting point may be further expanded through the evidence of vicariance biogeography, or discoveries yet to come, and each expansion reemphasizes the approximate nature of any particular stage of recognition. The “same” pattern undergoes transformation in our perception as new evidence is added. The real progress of taxonomic theory, however, now appears to be measured in terms of these pattern transformations, a form of investigation to which cladistics is clearly directed. The efficient application of cladistic methodology in the future may well depend upon how clearly this point is realized by its practitioners, and how firmly the primary datum is conceived. After all, objections of the sort discussed above to the implications of hierarchical pattern take their origin in a consideration of mechanisms rather than pattern and, as such, appear to be an import from a speculative mode of science.


Ronald H. Brady taught in the school of American Studies at Ramapo College in Mahwah, New Jersey. This article was originally published in Advances in Cladistics, vol. 2, Proceedings of the Second Meeting of the Willi Hennig Society, edited by Norman I. Platnick and V. A. Funk, pp. 49-60. New York: Columbia University Press, 1983.


Acknowledgments

I would like to thank Norman Platnick and Donn Rosen for their careful reading, suggestions, and corrections, and Steve Farris and Gareth Nelson for their conversations on the subject.


References

Croizat, L. (1964). Space, Time, Form: The Biological Synthesis. Caracas: published by the author.

Hull, D. (1973). Darwin and His Critics. Cambridge MA: Harvard University Press.

Husserl, E. (1962). Ideas. New York: Collier Books.

Mayr, E. (1963). Animal Species and Evolution. Cambridge MA: Harvard University Press.

Michod, R. E. (1981). “Positive Heuristics in Evolutionary Biology,” British Journal for the Philosophy of Science vol. 32, pp. 1-36.

Rosen, D. E. (1979). “Fishes from the Uplands and Intermontane Basins of Guatemala: Revisionary Studies and Comparative Biogeography,” Bulletin of the American Museum of Natural History vol. 162, pp. 267-376.

 