Computing Multiple Sequence/Structure Alignments with the T‐Coffee Package

Cedric Notredame¹

¹ Centre de Regulació Genòmica, Barcelona, Spain
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 3.8
DOI:  10.1002/0471250953.bi0308s29
Online Posting Date:  March, 2010

Abstract

This unit describes how to assemble a multiple sequence alignment using the T‐Coffee package. T‐Coffee is much more flexible than most related methods (e.g., ClustalW) because it can combine many alternative alignments into a single one, based on an estimate of the consistency between these alignments. This strategy is especially useful when one has to choose among the outputs of several alternative methods. Curr. Protoc. Bioinform. 29:3.8.1‐3.8.25. © 2010 by John Wiley & Sons, Inc.

Keywords: sequence alignment; multiple sequence alignment; T‐Coffee
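
The idea of estimating consistency between alternative alignments can be illustrated with a small sketch. This is not T‐Coffee's actual scoring scheme: it is a simplified stand-in that assumes each alignment is a mapping from sequence name to gapped string and scores an alignment by the average fraction of its aligned residue pairs that the other alignments also contain. The function names `aligned_pairs` and `consistency` and the toy sequences are hypothetical, introduced only for illustration.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` maps sequence name -> gapped string ('-' is a gap);
    all strings must have the same length. Residues are identified by
    (sequence name, index in the ungapped sequence).
    """
    names = sorted(alignment)
    length = len(alignment[names[0]])
    position = {name: 0 for name in names}  # next ungapped index per sequence
    pairs = set()
    for col in range(length):
        present = []
        for name in names:
            if alignment[name][col] != "-":
                present.append((name, position[name]))
                position[name] += 1
        # Every two residues in the same column form one aligned pair.
        pairs.update(combinations(present, 2))
    return pairs


def consistency(alignments):
    """Score each alignment by how well the others support its pairs.

    Assumption (illustrative only): the score is the mean, over the
    other alignments, of the fraction of shared aligned pairs.
    """
    pair_sets = [aligned_pairs(a) for a in alignments]
    scores = []
    for i, pairs in enumerate(pair_sets):
        others = [p for j, p in enumerate(pair_sets) if j != i]
        if not pairs or not others:
            scores.append(0.0)
            continue
        shared = [len(pairs & other) / len(pairs) for other in others]
        scores.append(sum(shared) / len(shared))
    return scores


# Three alternative alignments of the same two toy sequences.
aln_a = {"s1": "ACG-T", "s2": "ACGAT"}
aln_b = {"s1": "ACG-T", "s2": "ACGAT"}
aln_c = {"s1": "ACGT-", "s2": "ACGAT"}

scores = consistency([aln_a, aln_b, aln_c])
print(scores)
```

In this toy case the two identical alignments support each other fully and receive the higher score, so a user deciding among the three outputs would have a reason to prefer them over the third.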

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Computing Multiple Sequence Alignments
  • Basic Protocol 2: Profile Alignments from Large Data Sets: Aligning Alignments
  • Support Protocol 1: Reformatting Sequences, Alignments, Structures, and Libraries
  • Support Protocol 2: Evaluating the Local Score of an Alignment
  • Support Protocol 3: Generating and Using T‐Coffee Libraries
  • Alternate Protocol 1: Combining and Comparing Alignments
  • Basic Protocol 3: Combining Sequences and Structures
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
  • Tables
     
 


Literature Cited

   Armougom, F., Moretti, S., Poirot, O., Audic, S., Dumas, P., Schaeli, B., Keduas, V., and Notredame, C. 2006. Expresso: Automatic incorporation of structural information in multiple sequence alignments using 3D‐Coffee. Nucleic Acids Res. 34:W604‐W608.
   Bairoch, A., Bucher, P., and Hofmann, K. 1997. The PROSITE database, its status in 1997. Nucleic Acids Res. 25:217‐221.
   Do, C.B., Mahabhashyam, M.S., Brudno, M., and Batzoglou, S. 2005. ProbCons: Probabilistic consistency‐based multiple sequence alignment. Genome Res. 15:330‐340.
   Duret, L. and Abdeddaim, S. 2000. Multiple alignments for structural, functional or phylogenetic analyses of homologous sequences. In Bioinformatics: Sequence, Structure and Databanks (D. Higgins and W. Taylor, eds.) pp. 51‐76. Oxford University Press, Oxford.
   Jones, D.T. 1999. Protein secondary structure prediction based on position‐specific scoring matrices. J. Mol. Biol. 292:195‐202.
   Katoh, K., Misawa, K., Kuma, K., and Miyata, T. 2002. MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059‐3066.
   Lassmann, T. and Sonnhammer, E.L. 2002. Quality assessment of multiple alignment programs. FEBS Lett. 529:126‐130.
   Moretti, S., Armougom, F., Wallace, I.M., Higgins, D.G., Jongeneel, C.V., and Notredame, C. 2007. The M‐Coffee Web server: A meta‐method for computing multiple sequence alignments by combining alternative alignment methods. Nucleic Acids Res. 35:W645‐W648.
   Morgenstern, B., Frech, K., Dress, A., and Werner, T. 1998. DIALIGN: Finding local similarities by multiple sequence alignment. Bioinformatics 14:290‐294.
   Mulder, N.J., Apweiler, R., Attwood, T.K., Bairoch, A., Barrell, D., Bateman, A., Binns, D., Biswas, M., Bradley, P., Bork, P., Bucher, P., Copley, R.R., Courcelle, E., Das, U., Durbin, R., Falquet, L., Fleischmann, W., Griffiths‐Jones, S., Haft, D., Harte, N., Hulo, N., Kahn, D., Kanapin, A., Krestyaninova, M., Lopez, R., Letunic, I., Lonsdale, D., Silventoinen, V., Orchard, S.E., Pagni, M., Peyruc, D., Ponting, C.P., Selengut, J.D., Servant, F., Sigrist, C.J., Vaughan, R., and Zdobnov, E.M. 2003. The InterPro database: 2003 brings increased coverage and new features. Nucleic Acids Res. 31:315‐318.
   Ng, P.C. and Henikoff, S. 2002. Accounting for human polymorphisms predicted to affect protein function. Genome Res. 12:436‐446.
   Notredame, C. 2002. Recent progress in multiple sequence alignment: A survey. Pharmacogenomics 3:131‐144.
   Notredame, C. and Abergel, C. 2003. Using multiple alignment methods to assess the quality of genomic data analysis. In Bioinformatics and Genomes (M. Andrade, ed.) pp. 150‐175. Springer Verlag, New York.
   Notredame, C., Higgins, D.G., and Heringa, J. 2000. T‐Coffee: A novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302:205‐217.
   Phillips, A., Janies, D., and Wheeler, W. 2000. Multiple sequence alignment in phylogenetic analysis. Mol. Phylogenet. Evol. 16:317‐330.
   Ramensky, V., Bork, P., and Sunyaev, S. 2002. Human non‐synonymous SNPs: Server and survey. Nucleic Acids Res. 30:3894‐3900.
   Rausch, T., Emde, A.‐K., Weese, D., Döring, A., Notredame, C., and Reinert, K. 2008. Segment‐based multiple sequence alignment. Bioinformatics 24:i187‐i192.
   Taylor, W.R. and Orengo, C.A. 1989. Protein structure alignment. J. Mol. Biol. 208:1‐22.
   Thompson, J.D., Koehl, P., Ripp, R., and Poch, O. 2005. BAliBASE 3.0: Latest developments of the multiple sequence alignment benchmark. Proteins 61:127‐136.
   Wallace, I.M., O'Sullivan, O., Higgins, D.G., and Notredame, C. 2006. M‐Coffee: Combining multiple sequence alignment methods with T‐Coffee. Nucleic Acids Res. 34:1692‐1699.
Key References
   Kemena, C. and Notredame, C. 2009. Upcoming challenges for multiple sequence alignment methods in the high‐throughput era. Bioinformatics 25:2455‐2465.
  A recent review of methodological developments in template‐based alignment methods.
   Moretti, S., Wilm, A., Higgins, D.G., Xenarios, I., and Notredame, C. 2008. R‐Coffee: A Web server for accurately aligning noncoding RNA sequences. Nucleic Acids Res. 36:W10‐W13.
  A paper describing a Web server running a version of T‐Coffee able to align RNA.
   Notredame et al. See above.
  The original paper describing the T‐Coffee algorithm and the one that should be cited as a reference for T‐Coffee.
   O'Sullivan, O., Suhre, K., Abergel, C., Higgins, D.G., and Notredame, C. 2004. 3DCoffee: Combining protein sequences and structures within multiple sequence alignments. J. Mol. Biol. 340:385‐395.
  A paper describing the combination of sequences and structures.
   Taylor and Orengo. See above.
  First description of SAP structure‐structure alignment method used by T‐Coffee.
   Wilm, A., Higgins, D.G., and Notredame, C. 2008. R‐Coffee: A method for multiple alignment of non‐coding RNA. Nucleic Acids Res. 36:e52.
  A paper describing a novel algorithm in T‐Coffee for the alignment of RNA sequences.
Internet Resources
  http://www.tcoffee.org
  T‐Coffee home page.