Beyond Simple Homology Searches: Multiple Sequence Alignments and Phylogenetic Trees

Rebecca A. Zufall

University of Houston, Houston, Texas
Publication Name: Current Protocols Essential Laboratory Techniques
Unit Number: Unit 11.3
DOI: 10.1002/cpet.9
Online Posting Date: May, 2017
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


Phylogenetic trees represent hypotheses about the evolutionary relationships among organisms or among nucleotide or amino acid sequences. Because the best BLAST hit is often not the most closely related sequence, phylogenetic analysis is an essential extension of inquiry into any new protein or gene. In this unit, the reader will first learn how to create a multiple sequence alignment and then how to use that alignment to build a neighbor‐joining phylogeny with the software program MEGA. Finally, the reader will learn how to interpret a phylogeny in light of the research questions. © 2017 by John Wiley & Sons, Inc.
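
The alignment-to-tree workflow summarized above can be illustrated with a minimal sketch in Python. It is not MEGA's implementation and is not taken from the unit itself. The toy alignment, the sequence names, and the function names (`p_distance`, `neighbor_joining`) are hypothetical, chosen only for illustration. Uncorrected p-distances are used for simplicity; in practice a substitution model would usually be applied before the tree is built.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped.
    Assumes at least one ungapped column is shared.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def neighbor_joining(names, dist):
    """Build an unrooted neighbor-joining tree and return it as Newick text.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    nodes = list(names)
    d = dict(dist)
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node: sum of its distances to all others.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair that minimizes the neighbor-joining Q criterion.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        # Branch lengths from the new internal node to i and to j.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        new = f"({i}:{li:.3f},{j}:{lj:.3f})"
        # Distances from the new internal node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    a, b = nodes
    # Connect the last two nodes with the remaining distance.
    return f"({a}:{d[frozenset((a, b))]:.3f},{b});"


# Hypothetical toy alignment (all sequences have the same aligned length).
alignment = {
    "seqA": "ATGCTAGCTA",
    "seqB": "ATGCTAGCTT",
    "seqC": "ATGATAGGTT",
    "seqD": "ATGAT-GGTA",
}
taxa = list(alignment)
distances = {
    frozenset((a, b)): p_distance(alignment[a], alignment[b])
    for a, b in combinations(taxa, 2)
}
print(neighbor_joining(taxa, distances))
```

The printed Newick string can be opened in a tree viewer. Note that the sequence with the smallest pairwise distance to a query is not necessarily joined to it first, because the joining criterion also accounts for each sequence's average distance to all the others.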

Keywords: phylogeny; alignment; neighbor‐joining; homology; MEGA

Table of Contents

  • Overview and Principles
  • Strategic Planning
  • Basic Protocol 1: Creating a Multiple Sequence Alignment
  • Basic Protocol 2: Making a Phylogenetic Tree
  • Commentary
  • Literature Cited
  • Figures

Literature Cited

   Baum, D.A. and Smith, S.D. 2012. Tree Thinking: An Introduction to Phylogenetic Biology. W. H. Freeman, San Francisco.
   Baum, D.A., Smith, S.D., and Donovan, S.S.S. 2005. Evolution: The tree‐thinking challenge. Science 310:979‐980.
   Chang, W.‐J., Zaila, K.E., and Coppola, T.W. 2016. Submitting a sequence to GenBank. Curr. Protoc. Essen. Lab. Tech. 12:11.2.1‐11.2.24. doi: 10.1002/9780470089941.et110.2s12.
   Des Marais, D.L. and Rausher, M.D. 2008. Escape from adaptive conflict after duplication in an anthocyanin pathway gene. Nature 454:762‐765.
   Eisen, J.A. 1998. Phylogenomics: Improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Res. 8:163‐167.
   Felsenstein, J. 2004. Inferring Phylogenies. Sinauer Associates, Inc., Sunderland, Mass.
   Gregory, T.R. 2008. Understanding evolutionary trees. Evo. Edu. Outreach 1:121‐137.
   Hall, B.G. 2011. Phylogenetic Trees Made Easy: A How‐to Manual, 4th Edition. Sinauer Associates, Inc., Sunderland, Mass.
   Hasegawa, M., Kishino, H., and Yano, T. 1985. Dating of the human‐ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160‐174.
   Hayakawa, T., Culleton, R., Otani, H., Horii, T., and Tanabe, K. 2008. Big bang in the evolution of extant malaria parasites. Mol. Biol. Evol. 25:2233‐2239.
   Jukes, T.H. and Cantor, C.R. 1969. Evolution of protein molecules. In Mammalian Protein Metabolism (H.N. Munro, ed.), pp. 21‐132. Academic Press, New York.
   Kumar, S., Stecher, G., and Tamura, K. 2016. MEGA7: Molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 33:1870‐1874.
   Landan, G. and Graur, D. 2009. Characterization of pairwise and multiple sequence alignment errors. Gene 441:141‐147.
   Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J., and Higgins, D.G. 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23:2947‐2948.
   McHardy, A.C. and Rigoutsos, I. 2007. What's in the mix: Phylogenetic classification of metagenome sequence samples. Curr. Opin. Microbiol. 10:499‐503.
   Mu, J., Joy, D.A., Duan, J., Huang, Y., Carlton, J., Walker, J., Barnwell, J., Beerli, P., Charleston, M.A., Pybus, O.G., and Su, X. 2005. Host switch leads to emergence of Plasmodium vivax malaria in humans. Mol. Biol. Evol. 22:1686‐1693.
   Posada, D. and Crandall, K. 1998. MODELTEST: Testing the model of DNA substitution. Bioinformatics 14:817‐818.
   Stover, N.A. and Cavalcanti, A.R. 2017. Using NCBI BLAST. Curr. Protoc. Essen. Lab. Tech. 11:11.1.1‐11.1.35.
   Tamura, K. and Nei, M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512‐526.
   Thompson, J.D., Plewniak, F., and Poch, O. 1999. A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res. 27:2682‐2690.
   Zufall, R. and Rausher, M. 2004. Genetic changes associated with floral adaptation restrict future evolutionary potential. Nature 428:847‐850.
   Zwickl, D.J. and Hillis, D.M. 2002. Increased taxon sampling greatly reduces phylogenetic error. Syst. Biol. 51:588‐598.

Key References

   Baum and Smith, 2012. See above.
  An in‐depth introduction to understanding and interpreting phylogenetic trees. Recommended for those who want a better conceptual understanding of how phylogenetic trees can be applied in biology.
   Hall, 2011. See above.
  A “cookbook” for phylogenetic reconstruction. Recommended for beginning users interested in learning parsimony and maximum likelihood methods, in addition to distance methods. Provides further guidance on using MEGA.
   Felsenstein, 2004. See above.
  A comprehensive guide to phylogenetic methodology and application. Recommended for those who want to delve deeply into the subject of phylogenetic inference.
   Graur, D. 2016. Molecular and Genome Evolution. Sinauer Associates, Inc., Sunderland, Mass.
  Recommended reading for an understanding of the evolutionary biology behind the methods of phylogenetic reconstruction.