Phylogenomic Inference of Protein Molecular Function

Nandini Krishnamurthy¹, Kimmen Sjölander¹

¹ University of California, Berkeley, California
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 6.9
DOI: 10.1002/0471250953.bi0609s11
Online Posting Date: October, 2005

Abstract

With the explosion in sequence data, accurate prediction of protein function has become vital for prioritizing experimental investigation. Computationally efficient methods have made homology-based function prediction feasible in high-throughput mode, but the approach is not without its dangers. Biological processes such as gene duplication, domain shuffling, and speciation produce families of related genes whose products can have vastly different molecular functions. Standard sequence-comparison approaches may not discriminate effectively among these candidate homologs, leading to errors in database annotations. This unit describes phylogenomic approaches that reduce the error rate in function prediction. Phylogenomic inference of protein molecular function consists of a series of subtasks. First, a cluster of homologs is identified; next, a multiple sequence alignment and a phylogenetic tree are constructed. Finally, the tree is overlaid with experimental data gathered for members of the family, so that changes in biochemical function can be traced along the evolutionary tree.
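
The final subtask, overlaying experimental data on the tree, can be illustrated with a minimal sketch. It is not the method implemented by any of the tools named in this unit; the toy tree, the sequence names, the annotations, and the function names (`leaves`, `path_to_leaf`, `infer_function`) are all hypothetical. The sketch assumes a simple rule: an uncharacterized sequence receives a function only when every experimentally characterized member of its smallest annotated enclosing clade agrees.

```python
from typing import Dict, List, Optional, Tuple, Union

# A tree is either a leaf name (str) or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> List[str]:
    """Return all leaf names under this clade."""
    if isinstance(tree, str):
        return [tree]
    names: List[str] = []
    for child in tree:
        names.extend(leaves(child))
    return names


def path_to_leaf(tree: Tree, target: str) -> Optional[List[Tree]]:
    """Return the clades from the root down to the target leaf, or None."""
    if isinstance(tree, str):
        return [tree] if tree == target else None
    for child in tree:
        sub = path_to_leaf(child, target)
        if sub is not None:
            return [tree] + sub
    return None


def infer_function(tree: Tree, annotations: Dict[str, str], query: str) -> Optional[str]:
    """Transfer a function to `query` from its smallest annotated clade.

    Returns None when no annotation is available or when the annotated
    members of that clade disagree (illustrative rule, an assumption).
    """
    path = path_to_leaf(tree, query)
    if path is None:
        raise KeyError(query)
    # Walk outward from the smallest enclosing clade toward the root.
    for clade in reversed(path[:-1]):
        functions = {
            annotations[name]
            for name in leaves(clade)
            if name in annotations and name != query
        }
        if len(functions) == 1:
            return functions.pop()
        if len(functions) > 1:
            return None  # conflicting evidence: do not transfer
    return None


# Hypothetical family: two subfamilies produced by a duplication, plus an outlier.
TOY_TREE: Tree = ((("A1", "A2"), "A3"), (("B1", "B2"), "B3"), "X")
TOY_ANNOTATIONS = {"A1": "kinase", "B1": "phosphatase", "B3": "phosphatase"}

for name in ("A2", "A3", "B2", "X"):
    print(name, infer_function(TOY_TREE, TOY_ANNOTATIONS, name))
```

In this toy family, members of each subfamily inherit the function of their characterized relatives, while the outlier `X` is left unannotated because its only enclosing clade mixes two functions. This is the kind of distinction that a top-hit sequence comparison can miss.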

Keywords: Evolution; Homolog; Ortholog; Paralog; Function prediction; Phylogenomic; Subfamily; Phylogenetic

Table of Contents

  • Basic Protocol 1: Identifying Homologs and Constructing a Multiple Sequence Alignment Using FlowerPower and MUSCLE
  • Basic Protocol 2: Multiple Sequence Alignment Analysis and Editing Using Belvu
  • Support Protocol 1: Downloading and Installing the Belvu Software
  • Basic Protocol 3: Constructing a Phylogenetic Tree Using Bete
  • Basic Protocol 4: Phylogenomic Inference of Molecular Function Using TreeNotator
  • Commentary
  • Literature Cited
  • Figures

Key References

  Bork and Koonin, 1998. See above.
    The authors identify common problems associated with function prediction by homology and present ways to avoid these errors.
  Eisen, 1998. See above.
    Jonathan Eisen's cogent presentation of the raison d'être behind phylogenomic analysis for improving prediction of gene function.
  Sjölander, 2004. See above.
    A detailed view of the challenges in phylogenomic analysis, with a description of new methods for key tasks in a phylogenomic pipeline.

Internet Resources

  The Berkeley Phylogenomics Group (BPG) resources Web site provides a variety of user-friendly tools for phylogenomic inference of protein molecular function, along with a description of each available tool.
