Using CATH‐Gene3D to Analyze the Sequence, Structure, and Function of Proteins

Ian Sillitoe¹, Tony Lewis¹, Christine Orengo¹

¹ University College London, London
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 1.28
DOI: 10.1002/0471250953.bi0128s50
Online Posting Date: June, 2015

The CATH database is a classification of protein structures found in the Protein Data Bank (PDB). Protein structures are divided ("chopped") into individual structural domains, and these domains are grouped into superfamilies when there is sufficient evidence that they have diverged from a common ancestor. A sister resource, Gene3D, extends this information by scanning sequence profiles of the CATH domain superfamilies against many millions of known proteins to identify related sequences. Thus, the combined CATH‐Gene3D resource provides confident predictions of the likely structural fold, domain organisation, and evolutionary relatives of these proteins. In addition, the resource incorporates annotations from a large number of external databases, including known enzyme active sites, GO molecular functions, physical interactions, and mutations. This unit details how to access and understand the information contained in the CATH‐Gene3D Web pages, the downloadable data files, and the remotely accessible Web services. © 2015 by John Wiley & Sons, Inc.

Keywords: protein structure; protein domain; protein classification; functional family; superfamily

Table of Contents

  • Introduction
  • Basic Protocol 1: Searching CATH with a New Protein Sequence
  • Alternate Protocol 1: Access CATH Sequence Scan Remotely
  • Basic Protocol 2: Search CATH with a New Protein Structure
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
