OrthoMCL is an algorithm for grouping proteins into ortholog groups based on their sequence similarity. OrthoMCL‐DB is a public database that allows users to browse and view ortholog groups that were pre‐computed using the OrthoMCL algorithm. Version 4 of this database contained 116,536 ortholog groups clustered from 1,270,853 proteins obtained from 88 eukaryotic genomes, 16 archaeal genomes, and 34 bacterial genomes. Future versions of OrthoMCL‐DB will include more proteomes as more genomes are sequenced. Here, we describe two ways to group your proteins of interest into ortholog clusters using the OrthoMCL system. First, the OrthoMCL‐DB Web site provides a tool for uploading a set of protein sequences, typically representing a proteome, and mapping them to existing groups in OrthoMCL‐DB. Alternatively, if you have proteins from a set of genomes that need to be grouped, you can download, install, and run the stand‐alone OrthoMCL software. Curr. Protoc. Bioinform. 35:6.12.1‐6.12.19. © 2011 by John Wiley & Sons, Inc.

Keywords: OrthoMCL; ortholog groups; paralog; proteome; Markov clustering; reciprocal best hits; MCL

Table of Contents

  • Introduction
  • Strategic Planning
  • Basic Protocol 1: Assign a Proteome to OrthoMCL‐DB Groups
  • Basic Protocol 2: Create Ortholog Groups from Your Proteomes Using the OrthoMCL Software
  • Support Protocol 1: Downloading, Installing, and Configuring the OrthoMCL Programs
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
  • Tables

