Inferring Evolutionary Trees with PAUP*

James C. Wilgenbusch1, David Swofford1

1 Florida State University, Tallahassee, Florida
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 6.4
DOI:  10.1002/0471250953.bi0604s00
Online Posting Date:  February, 2003
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


This unit provides a general description of reconstructing evolutionary trees using PAUP* 4.0. The protocol takes users through an example analysis of mitochondrial DNA sequence data using the parsimony and likelihood criteria to infer optimal trees. The protocol also discusses the searching options available in PAUP* and demonstrates how to import non‐NEXUS formats. Finally, it discusses the pros and cons of the “model‐free” and “model‐based” methods used throughout the protocol.


Table of Contents

  • Basic Protocol 1: Using PAUP* to Infer Parsimony Trees from DNA Sequences
  • Alternate Protocol 1: Using PAUP* to Infer a Maximum‐Likelihood Tree from DNA Sequences
  • Support Protocol 1: Using PAUP* to Import Non‐NEXUS Data Files
  • Guidelines for Understanding Results
  • Commentary
  • Figures
  • Tables

Literature Cited

   Bruno, W.J. and Halpern, A.L. 1999. Topological bias and inconsistency of maximum likelihood using wrong models. Mol. Biol. Evol. 16:564‐566.
   Camin, J.H. and Sokal, R.R. 1965. A method for deducing branching sequences in phylogeny. Evolution 19:311‐326.
   Cavalli‐Sforza, L.L. and Edwards, A.W.F. 1967. Phylogenetic analysis: Models and estimation procedures. Am. J. Hum. Genet. 19:233‐257.
   Chang, J.T. 1996. Full reconstruction of Markov models on evolutionary trees: Identifiability and consistency. Math. Biosci. 137:51‐73.
   Cox, D.R. and Hinkley, D.V. 1974. Theoretical Statistics. Chapman and Hall, London.
   Cunningham, C.W., Zhu, H., and Hillis, D.M. 1998. Best‐fit maximum‐likelihood models for phylogenetic inference: Empirical tests with known phylogenies. Evolution 52:978‐987.
   Efron, B. 1998. R.A. Fisher in the 21st century. Stat. Sci. 13:95‐122.
   Farris, J.S. 1970. Methods for computing Wagner trees. Syst. Zool. 19:83‐92.
   Farris, J.S. 1977. Phylogenetic analysis under Dollo's Law. Syst. Zool. 26:77‐88.
   Farris, J.S., Albert, V.A., Källersjö, M., Lipscomb, D., and Kluge, A.G. 1996. Parsimony jackknifing outperforms neighbor‐joining. Cladistics 12:99‐124.
   Felsenstein, J. 1978. Cases in which parsimony and compatibility methods will be positively misleading. Syst. Zool. 27:401‐410.
   Felsenstein, J. 1981. Evolutionary trees from DNA sequences: A maximum likelihood approach. J. Mol. Evol. 17:368‐376.
   Felsenstein, J. 1985. Confidence limits on phylogeny: An approach using the bootstrap. Evolution 39:783‐789.
   Fitch, W.M. 1971. Toward defining the course of evolution: Minimal change for a specific tree topology. Syst. Zool. 20:406‐416.
   Gaut, B.S. and Lewis, P.O. 1995. Success of maximum likelihood phylogeny inference in the four‐taxon case. Mol. Biol. Evol. 12:152‐162.
   Goldman, N., Anderson, J.P., and Rodrigo, A.G. 2000. Likelihood‐based tests of topologies in phylogenetics. Syst. Biol. 49:652‐670.
   Gu, X., Fu, Y.‐X., and Li, W.‐H. 1995. Maximum likelihood estimation of the heterogeneity of substitution rate among nucleotide sites. Mol. Biol. Evol. 12:546‐557.
   Hall, B. 2001. Phylogenetic Trees Made Easy: A How‐to Manual for Molecular Biologists. Sinauer Associates, Sunderland, Mass.
   Hasegawa, M., Kishino, H., and Yano, T. 1985. Dating the human‐ape split by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160‐174.
   Hendy, M.D. and Penny, D.A. 1989. A framework for the quantitative study of evolutionary trees. Syst. Zool. 38:297‐309.
   Hillis, D.M., Bull, J.J., White, M.E., Badgett, M.R., and Molineux, I.J. 1992. Experimental phylogenetics: Generation of a known phylogeny. Science 255:589‐592.
   Hillis, D.M., Huelsenbeck, J.P., and Swofford, D.L. 1994a. Hobgoblin of phylogenetics? Nature 369:363‐364.
   Hillis, D.M., Huelsenbeck, J.P., and Cunningham, C.W. 1994b. Application and accuracy of molecular phylogenies. Science 264:671‐677.
   Hillis, D.M., Mable, B.K., and Moritz, C. 1996. Applications of molecular systematics: The state of the field and a look to the future. In Molecular Systematics, 2nd ed. (D.M. Hillis, C. Moritz, and B.K. Mable, eds.), pp. 515‐543. Sinauer Associates, Sunderland, Mass.
   Huelsenbeck, J.P. and Hillis, D.M. 1993. Success of phylogenetic methods in the four‐taxon case. Syst. Biol. 42:247‐265.
   Huelsenbeck, J.P. 1995. Performance of phylogenetic methods in simulation. Syst. Biol. 44:17‐48.
   Kim, J. 1996. General inconsistency conditions for maximum parsimony: Effects of branch lengths and increasing numbers of taxa. Syst. Biol. 45:363‐374.
   Kishino, H. and Hasegawa, M. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29:170‐179.
   Kishino, H., Miyata, T., and Hasegawa, M. 1990. Maximum likelihood inference of protein phylogeny and the origin of chloroplasts. J. Mol. Evol. 30:151‐160.
   Kluge, A.G. and Farris, J.S. 1969. Quantitative phyletics and the evolution of anurans. Syst. Zool. 18:1‐32.
   Lockhart, P.J., Steel, M.A., Hendy, M.D., and Penny, D. 1994. Recovering evolutionary trees under a more realistic model of sequence evolution. Mol. Biol. Evol. 11:605‐612.
   Maddison, D.R., Swofford, D.L., and Maddison, W.P. 1997. NEXUS: An extensible file format for systematic information. Syst. Biol. 46:590‐621.
   Page, R.D. and Holmes, E.C. 1998. Molecular Evolution: A Phylogenetic Approach. Blackwell Science, Oxford, U.K.
   Penny, D. and Hendy, M.D. 1985. Testing methods of evolutionary tree construction. Cladistics 1:266‐272.
   Posada, D. and Crandall, K.A. 1998. MODELTEST: Testing the model of DNA substitution. Bioinformatics 14:817‐818.
   Rogers, J.S. 1997. On the consistency of maximum likelihood estimation of phylogenetic trees from nucleotide sequences. Syst. Biol. 46:354‐357.
   Sanderson, M.J. and Kim, J. 2000. Parametric phylogenetics? Syst. Biol. 49:817‐829.
   Sankoff, D. 1975. Minimal mutation trees of sequences. SIAM J. Appl. Math. 28:35‐42.
   Shimodaira, H. and Hasegawa, M. 1999. Multiple comparisons of log‐likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16:1114‐1116.
   Steel, M. 1994. Recovering a tree from the leaf colourations it generates under a Markov model. Appl. Math. Lett. 7:19‐23.
   Sullivan, J. and Swofford, D.L. 1997. Are guinea pigs rodents? The utility of models in molecular phylogenetics. J. Mamm. Evol. 4:77‐86.
   Sullivan, J. and Swofford, D.L. 2001. Should we use model‐based methods for phylogenetic inference when we know assumptions about among‐site rate variation and nucleotide substitution pattern are violated? Syst. Biol. 50:723‐729.
   Swofford, D.L. 2002. PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates, Sunderland, Mass.
   Swofford, D.L. and Maddison, W.P. 1987. Reconstructing ancestral character states under Wagner parsimony. Math. Biosci. 87:199‐229.
   Swofford, D.L., Olsen, G.J., Waddell, P.J., and Hillis, D.M. 1996. Phylogenetic inference. In Molecular Systematics, 2nd ed. (D.M. Hillis, C. Moritz, and B.K. Mable, eds.), pp. 407‐514. Sinauer Associates, Sunderland, Mass.
   Swofford, D.L., Waddell, P.J., Huelsenbeck, J.P., Foster, P.G., Lewis, P.O., and Rogers, J.S. 2001. Bias in phylogenetic estimation and its relevance to the choice between parsimony and likelihood methods. Syst. Biol. 50:525‐539.
   Templeton, A.R. 1983. Convergent evolution and non‐parametric inferences from restriction fragment and DNA sequence data. In Statistical Analysis of DNA Sequence Data (B. Weir, ed.), pp. 151‐179. Marcel Dekker, New York.
   Tuffley, C. and Steel, M. 1997. Links between maximum likelihood and maximum parsimony under a simple model of site substitution. Bull. Math. Biol. 59:581‐607.
   Waddell, P.J. and Penny, D. 1996. Evolutionary trees of apes and humans from DNA sequences. In Handbook of Symbolic Evolution (A.J. Lock and C.R. Peters, eds.), pp. 53‐73. Clarendon Press, Oxford, U.K.
   Yang, Z. 1994a. Estimating the pattern of nucleotide substitution. J. Mol. Evol. 39:105‐111.
   Yang, Z. 1994b. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: Approximate methods. J. Mol. Evol. 39:306‐314.
Internet Resources
  PAUP* Web site.
  PAUP* technical forum.
  PAUP* information mailing list.
  PAUP* publisher, Sinauer Associates, Inc. Web site.
Key References
   Hillis et al., 1996. See above.
  A general discussion of issues and controversies pertaining to phylogenetic analyses.
   Page and Holmes, 1998. See above.
  An accessible introduction to phylogenetic theory, terminology, and practice.
   Swofford et al., 1996. See above.
  A detailed description of most methods commonly used in phylogenetic inference.