Using GFS to Identify Encoding Genomic Loci from Protein Mass Spectral Data

Mark R. Holmes¹, Morgan C. Giddings²

¹ Department of Microbiology and Immunology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina; ² Joint Department of Biomedical Engineering, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina and North Carolina State University, Raleigh, North Carolina
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 13.9
DOI: 10.1002/0471250953.bi1309s21
Online Posting Date: March 2008

Abstract

Genome‐based peptide fingerprint scanning (GFS) maps several types of protein mass spectral (MS) data directly to the genomic loci that may have encoded the protein. The approach can be used either for protein identification or for proteogenomic mapping, that is, gene finding and annotation based on proteomic data. The program takes as input one or more mass spectrometry files from peptide mass fingerprinting (PMF) and/or tandem MS (MS/MS), together with one or more sequences to search against; its output is the coordinates of any matches found. This unit describes the use of GFS and the subsequent analysis of results. Curr. Protoc. Bioinform. 21:13.9.1‐13.9.20. © 2008 by John Wiley & Sons, Inc.

Keywords: mass spectrometry; protein identification; proteogenomics
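
The mapping summarized in the abstract, from observed peptide masses to genomic coordinates, can be illustrated with a minimal Python sketch. It is not the GFS implementation, and it does not reproduce GFS scoring, file formats, or options. It assumes the standard genetic code, an idealized tryptic digest (cleavage after every K or R, no missed cleavages, no proline rule), unmodified residues, and observed masses given as neutral monoisotopic peptide masses in Da. The function names (translate, tryptic_peptides, scan) and the 0.02 Da tolerance are hypothetical choices made for illustration only.

```python
from itertools import product

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide (N-terminal H plus C-terminal OH)

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(dna):
    """Reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna, frame):
    """Translate one reading frame; unknown codons become 'X'."""
    return "".join(CODON.get(dna[i:i + 3], "X")
                   for i in range(frame, len(dna) - 2, 3))


def tryptic_peptides(protein):
    """Yield (start_index, peptide), cutting after K/R and at stop codons."""
    start = 0
    for i, aa in enumerate(protein):
        if aa == "*":
            if i > start:
                yield start, protein[start:i]
            start = i + 1
        elif aa in "KR":
            yield start, protein[start:i + 1]
            start = i + 1
    if start < len(protein):
        yield start, protein[start:]


def scan(genome, observed_masses, tol=0.02):
    """Return (strand, start, end, peptide, mass) for each mass match.

    Coordinates are 0-based, end-exclusive, on the forward strand.
    """
    n = len(genome)
    hits = []
    for strand, seq in (("+", genome), ("-", revcomp(genome))):
        for frame in range(3):
            for aa_start, pep in tryptic_peptides(translate(seq, frame)):
                if any(r not in RESIDUE_MASS for r in pep):
                    continue  # skip peptides containing ambiguous residues
                mass = sum(RESIDUE_MASS[r] for r in pep) + WATER
                lo = frame + 3 * aa_start
                hi = lo + 3 * len(pep)
                if strand == "-":
                    lo, hi = n - hi, n - lo  # map back to forward coordinates
                if any(abs(m - mass) <= tol for m in observed_masses):
                    hits.append((strand, lo, hi, pep, round(mass, 4)))
    return hits
```

Calling scan(genome, masses) with a genome string and a list of observed masses returns every six‐frame tryptic peptide whose theoretical mass falls within the tolerance, along with the coordinates of the stretch of genome that encodes it. A single mass match is weak evidence on its own; this sketch only shows the coordinate bookkeeping, not how matches should be ranked or assessed.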

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Using GFS on a Local Machine with PMF and Optional MS/MS Data
  • Basic Protocol 2: Using GFS on a Local Machine with Shotgun MS Data
  • Alternate Protocol 1: Using the GFS Website
  • Support Protocol 1: Obtaining and Installing GFS on a Local Machine
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
  • Tables
     
 

Internet Resources
   http://gfs.unc.edu
  The Website for GFS executables, documentation, source code, and Web‐based peptide searching.
   http://gnustep.org
  The Website from which to obtain the GNUstep runtime libraries, which are required when running the GFS application locally on Windows or Linux. They are not needed on Mac OS X or when searching through the GFS Website.