Using RAxML to Infer Phylogenies

Alexandros Stamatakis¹

¹ Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 6.14
DOI: 10.1002/0471250953.bi0614s51
Online Posting Date: September 2015

Abstract

Inference of phylogenetic trees under the maximum likelihood (ML) criterion represents a routine task in biological data analysis. In this unit we describe how to plan analyses and use Randomized Accelerated Maximum Likelihood (RAxML) for phylogenetic inferences under ML, how to infer support values using the standard bootstrap procedure as well as other statistical measures, and how to conduct post‐analyses on collections/sets of phylogenetic trees including statistical significance tests and consensus tree methods. We also discuss what measures can be taken and what further analyses can be conducted when relationships in the inferred tree exhibit “low” support. © 2015 by John Wiley & Sons, Inc.

Keywords: phylogenetics; maximum likelihood; bootstrap support; consensus tree methods

     
 

Table of Contents

  • Introduction
  • Strategic Planning
  • Basic Protocol 1: Determining the Tree Search Parameters
  • Alternate Protocol 1: Getting an Approximate ML Tree Quickly
  • Basic Protocol 2: Inferring ML Trees
  • Basic Protocol 3: Inferring Support Values
  • Alternate Protocol 2: Calculating Alternative Support Measures
  • Guidelines for Understanding Results
  • Commentary
  • Figures
     
 

Key Reference
  Stamatakis, A. (2014). RAxML version 8: a tool for phylogenetic analysis and post‐analysis of large phylogenies. Bioinformatics 30(9):1312‐1313.
Internet Resources
  https://github.com/stamatak/standard-RAxML/releases
  Up-to-date RAxML code
  http://sco.h-its.org/exelixis/web/software/raxml/index.html
  RAxML home page with additional information and tutorials
  https://groups.google.com/forum/?hl=en#!forum/raxml
  RAxML Google group for obtaining help
  https://github.com/stamatak/BioinfProtocols
  Data and a transcript of the analyses conducted in this unit