Beyond Simple Homology Searches: Multiple Sequence Alignments and Phylogenetic Trees
University of Houston, Houston, Texas
Abstract
Phylogenetic trees represent hypotheses about the evolutionary relationships among organisms or among nucleotide or amino acid sequences. Because the best BLAST hit is often not the most closely related sequence, phylogenetic analysis is an essential extension of inquiry into any new protein or gene. In this unit, the reader will first learn how to create a multiple sequence alignment using ClustalX, then how to use that alignment to build a neighbor-joining phylogeny in the program Geneious, and finally how to interpret the phylogeny in light of the research questions. Curr. Protoc. Essential Lab. Tech. 1:11.3.1-11.3.17. © 2009 by John Wiley & Sons, Inc.
Keywords: phylogeny; alignment; neighbor-joining; homology; ClustalX; Geneious
Figures
- Figure 11.3.1. This phylogeny shows the evolutionary relationships between four extant species. The nodes labeled A, B, and C represent most recent common ancestors: A, common ancestor of all four species; B, common ancestor of human, chimp, and mouse; C, common ancestor of human and chimp.
- Figure 11.3.2. This phylogeny shows the relationship between various orthologs and paralogs of a gene. Prior to the divergence of humans and chimps, this gene underwent a gene-duplication event (indicated by the horizontal bar). That gene-duplication event resulted in two paralogs: A and B. Speciation between humans and chimps resulted in the orthologs human A and chimp A, and the orthologs human B and chimp B.
- Figure 11.3.3. Phylogeny showing the relationships between taxa 1, 2, and 3 as a polytomy, representing unresolved relationships between these taxa.
- Figure 11.3.4. Results of a GenBank query for cytochrome oxidase I sequences from select Plasmodium species. This search returns several full-length mitochondrial sequences. The genes of interest can be extracted from these sequences as described in the text.
- Figure 11.3.5. The Geneious interface. Across the top is the toolbar. The left panel shows folders of the local documents and links to NCBI database searches. The right panel contains a tutorial and Help files. The top panel shows the contents of the selected documents folder. The bottom panel is the sequence/tree viewer. To the right in the bottom panel are a series of options that allow you to change the way you view the sequences. This screenshot shows the alignment of Plasmodium sequences that were imported from ClustalX.
- Figure 11.3.6. Alignment of sequences of the cytochrome oxidase I gene from seven species of the malaria parasite Plasmodium. Sequences were loaded into ClustalX from FASTA-formatted files and aligned using the default parameters. The sequence positions in each column are hypothesized to have positional homology. The first 130 bp of the alignment are shown.
- Figure 11.3.7. Tree-building options in Geneious. To build a phylogeny as described in the text, select HKY as the distance model, neighbor-joining as the tree-building method, and an outgroup (if you have one) from the list of sequences in the alignment.
- Figure 11.3.8. Neighbor-joining phylogeny of seven species of Plasmodium, based on the cytochrome oxidase I gene, shown in the Tree viewer panel of Geneious. You can return to the alignment used to build this phylogeny by clicking on Alignment View at the top of the panel.
- Figure 11.3.9. Tree-building options in Geneious. To bootstrap a phylogeny, select the box labeled Resample tree.
- Figure 11.3.10. The bootstrapped phylogeny of Plasmodium species. Bootstrap support for each set of relationships is shown to the right of the node where two lineages diverge.
- Figure 11.3.11. BLAST search of I. quamoclit DFR*. The results of this BLAST search will be used to select sequences to use in building a phylogeny to assess the function of DFR*.
- Figure 11.3.12. Alignment of DFR sequences. The first three sequences are much longer than the other sequences. These sequences must be trimmed, and the alignment repeated, before using them in a phylogenetic analysis.
- Figure 11.3.13. Alignment of DFR sequences imported into Geneious.
- Figure 11.3.14. Bootstrapped phylogeny of DFR, showing that DFR* is most closely related to the DFR-B of related species. Thus, we can hypothesize that the function of DFR* is most like the function of DFR-B.
- Figure 11.3.15. Sequence alignment with an unrealistic number of gaps.
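The bootstrapping mentioned in the captions for Figures 11.3.9 and 11.3.10 is conventionally based on resampling alignment columns with replacement and rebuilding the tree from each resampled alignment. The Python sketch below illustrates only the resampling step, under stated assumptions: the function name `bootstrap_replicate` and the toy sequences are hypothetical, and nothing here is meant to describe how Geneious implements its Resample tree option.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    rng: a random.Random instance, so that results are reproducible.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    n_cols = lengths.pop()
    # Draw n_cols column indices with replacement. The same draw is applied
    # to every sequence, so each sampled column stays intact across taxa.
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


# Hypothetical toy alignment (illustrative only, not real sequence data).
toy = {
    "taxon1": "ATGCC-TA",
    "taxon2": "ATGCTGTA",
    "taxon3": "ATACT-TA",
}

# One replicate per seed; each replicate has the same taxa and length as the input.
replicates = [bootstrap_replicate(toy, random.Random(seed)) for seed in range(100)]
print(replicates[0])
```

Each replicate would then be passed through the same tree-building step as the original alignment, and the support reported for a grouping is the percentage of replicate trees in which that grouping appears.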
Literature Cited
Baum, D.A., Smith, S.D., and Donovan, S.S.S. 2005. Evolution: The tree-thinking challenge. Science 310:979-980.
Des Marais, D.L. and Rausher, M.D. 2008. Escape from adaptive conflict after duplication in an anthocyanin pathway gene. Nature 454:762-765.
Drummond, A.J., Ashton, B., Cheung, M., Heled, J., Kearse, M., Moir, R., Stones-Havas, S., Thierer, T., and Wilson, A. 2008. Geneious v4.0. http://www.geneious.com/.
Eisen, J.A. 1998. Phylogenomics: Improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Res. 8:163-167.
Felsenstein, J. 2004. Inferring Phylogenies. Sinauer Associates, Inc., Sunderland, Mass.
Hall, B.G. 2004. Phylogenetic Trees Made Easy: A How-To Manual, 2nd Edition. Sinauer Associates, Inc., Sunderland, Mass.
Hall, B.G. 2007. Phylogenetic Trees Made Easy: A How-To Manual, 3rd Edition. Sinauer Associates, Inc., Sunderland, Mass.
Hasegawa, M., Kishino, H., and Yano, T. 1985. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160-174.
Hayakawa, T., Culleton, R., Otani, H., Horii, T., and Tanabe, K. 2008. Big bang in the evolution of extant malaria parasites. Mol. Biol. Evol. 25:2233-2239.
Jukes, T.H. and Cantor, C.R. 1969. Evolution of protein molecules. In Mammalian Protein Metabolism (H.N. Munro, ed.), pp. 21-132. Academic Press, New York.
Landan, G. and Graur, D. 2008. Characterization of pairwise and multiple sequence alignment errors. Gene. Epub June 3, 2008.
Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J., and Higgins, D.G. 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23:2947-2948.
McHardy, A.C. and Rigoutsos, I. 2007. What's in the mix: Phylogenetic classification of metagenome sequence samples. Curr. Opin. Microbiol. 10:499-503.
Mu, J., Joy, D.A., Duan, J., Huang, Y., Carlton, J., Walker, J., Barnwell, J., Beerli, P., Charleston, M.A., Pybus, O.G., and Su, X. 2005. Host switch leads to emergence of Plasmodium vivax malaria in humans. Mol. Biol. Evol. 22:1686-1693.
Posada, D. and Crandall, K. 1998. MODELTEST: Testing the model of DNA substitution. Bioinformatics 14:817-818.
Tamura, K. and Nei, M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512-526.
Thompson, J.D., Plewniak, F., and Poch, O. 1999. A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res. 27:2682-2690.
Zufall, R. and Rausher, M. 2004. Genetic changes associated with floral adaptation restrict future evolutionary potential. Nature 428:847-850.
Zwickl, D.J. and Hillis, D.M. 2002. Increased taxon sampling greatly reduces phylogenetic error. Syst. Biol. 51:588-598.
Key References
Hall, 2007. See Literature Cited.
A cookbook for phylogenetic reconstruction. Recommended for beginning users interested in learning parsimony and maximum likelihood methods, in addition to distance methods. Relies largely on the program MEGA.
Felsenstein, 2004. See Literature Cited.
A comprehensive guide to phylogenetic methodology and application. Recommended for those who want to delve deeply into the subject of phylogenetic inference.
Graur, D. and Li, W. 2000. Fundamentals of Molecular Evolution. Sinauer Associates, Inc., Sunderland, Mass.
Recommended reading for an understanding of the evolutionary biology behind the methods of phylogenetic reconstruction.