Tempest: Accelerated MS/MS Database Search Software for Heterogeneous Computing Platforms

Mark E. Adamo1, Scott A. Gerber2

1 Norris Cotton Cancer Center, Geisel School at Dartmouth, Lebanon, New Hampshire, 2 Department of Biochemistry, Geisel School at Dartmouth, Lebanon, New Hampshire
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 13.29
DOI: 10.1002/cpbi.15
Online Posting Date: September 2016
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

MS/MS database search algorithms derive a set of candidate peptide sequences from an in silico digest of a protein sequence database, and compute theoretical fragmentation patterns to match these candidates against observed MS/MS spectra. The original Tempest publication described these operations mapped to a CPU‐GPU model, in which the CPU (central processing unit) generates peptide candidates that are asynchronously sent to a discrete GPU (graphics processing unit) to be scored against experimental spectra in parallel. The current version of Tempest expands this model, incorporating OpenCL to offer seamless parallelization across multicore CPUs, GPUs, integrated graphics chips, and general‐purpose coprocessors. Three protocols describe how to configure and run a Tempest search, including discussion of how to leverage Tempest's unique feature set to produce optimal results. © 2016 by John Wiley & Sons, Inc.
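
The general workflow summarized above (digest, theoretical fragmentation, matching) can be illustrated with a minimal Python sketch. It is not Tempest's implementation: the function names, the trypsin cleavage rule (after K or R, not before P), the restriction to singly charged b and y ions, and the simple shared-peak count are all assumptions chosen for illustration, and the scoring shown is much cruder than the correlation-based scoring used by real search engines.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276   # mass of a proton


def tryptic_digest(protein, missed_cleavages=0):
    """Return peptides from an idealized tryptic digest (hypothetical helper).

    Cleaves after K or R unless the next residue is P, and joins up to
    `missed_cleavages` adjacent fragments.
    """
    sites = [0]
    for i, aa in enumerate(protein[:-1]):
        if aa in "KR" and protein[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(protein))

    peptides = []
    last = len(sites) - 1
    for i in range(last):
        for j in range(i + 1, min(i + 1 + missed_cleavages, last) + 1):
            peptides.append(protein[sites[i]:sites[j]])
    return peptides


def fragment_mz(peptide):
    """Return (b_ions, y_ions) as singly charged m/z lists."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    running = 0.0
    for m in masses[:-1]:            # N-terminal prefixes
        running += m
        b_ions.append(running + PROTON)
    running = 0.0
    for m in reversed(masses[1:]):   # C-terminal suffixes
        running += m
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


def count_matches(theoretical, observed, tol=0.5):
    """Count theoretical peaks that have an observed peak within `tol` m/z."""
    return sum(any(abs(t - o) <= tol for o in observed) for t in theoretical)


if __name__ == "__main__":
    for pep in tryptic_digest("AGKPEPRTIDEK", missed_cleavages=1):
        b, y = fragment_mz(pep)
        print(pep, [round(x, 3) for x in b + y])
```

In a CPU‐GPU arrangement of the kind the abstract describes, the digest step would run on the host while the per-candidate matching, which is independent for each candidate and spectrum, is the part that lends itself to parallel execution.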

Keywords: database search; GPGPU; mass spectrometry; neutral loss; OpenCL; parallel computing; peptide identification; proteomics

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Setting up a System Configuration
  • Basic Protocol 2: Setting Up a Search Method
  • Basic Protocol 3: Running a Tempest Search
  • Commentary
  • Literature Cited
  • Appendix A: tempest.config
  • Appendix B: tempest.method
  • Appendix C: Command‐Line Options
  • Figures
     
 

Materials

Basic Protocol 1: Setting up a System Configuration

  Necessary Resources
  • A computer running Linux, with OpenCL‐compatible hardware available and corresponding OpenCL drivers installed. Currently, Tempest runs exclusively as a command‐line executable. Windows and Mac versions are forthcoming.
  • Links to source code, downloads, and supporting resources can be found at http://proteomics.dartmouth.edu/k/software/tempest
  • The system configuration file can be created or modified with any basic text editor. It is advisable to reference the tempest.config file included with the software to ensure proper formatting.

Basic Protocol 2: Setting Up a Search Method

  Necessary Resources
  • A search method file can be created or modified with any basic text editor. It is advisable to reference one of the pre‐written methods in the Tempest software package for guidance on formatting and recommended settings.

Basic Protocol 3: Running a Tempest Search

  Necessary Resources
  • The same system prerequisites listed in protocol 1 apply to this protocol. To run a search, Tempest requires a protein database file, a spectra file, a configuration file (see protocol 1), and a method file (see protocol 2). The protein database file must be in valid FASTA format. Tempest uses MSToolkit (https://github.com/mhoopmann/mstoolkit) to parse spectra files, so any format supported by MSToolkit may be used. Supported formats include mzXML, mzML, and ms2.
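
Before starting a search, it can help to sanity-check the input files. The following Python sketch is one possible pre-flight check, not part of Tempest or MSToolkit: `read_fasta` and `has_listed_spectra_extension` are hypothetical helpers, the FASTA rules enforced (a `>` header before any sequence, and at least one sequence line per entry) are a minimal assumption of what "valid" means, and the extension list covers only the three formats named above.

```python
import os

# Extensions for the spectra formats named in the text; MSToolkit may
# support others, so treat this as a non-exhaustive assumption.
LISTED_SPECTRA_EXTENSIONS = {".mzxml", ".mzml", ".ms2"}


def read_fasta(lines):
    """Yield (header, sequence) pairs; raise ValueError on malformed input."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                      # tolerate blank lines
        if line.startswith(">"):
            if header is not None:
                if not chunks:
                    raise ValueError(f"no sequence for entry: {header}")
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            if header is None:
                raise ValueError("sequence data before first '>' header")
            chunks.append(line)
    if header is not None:
        if not chunks:
            raise ValueError(f"no sequence for entry: {header}")
        yield header, "".join(chunks)


def has_listed_spectra_extension(path):
    """Return True if the file extension matches one of the listed formats."""
    return os.path.splitext(path)[1].lower() in LISTED_SPECTRA_EXTENSIONS


if __name__ == "__main__":
    example = [">protein_1 example entry", "MKTAYIAK", "QRQISFVK", ">protein_2", "GGSR"]
    for name, seq in read_fasta(example):
        print(name, len(seq))
    print(has_listed_spectra_extension("run01.mzML"))
```

To check a file on disk, pass an open file handle to `read_fasta`, since it accepts any iterable of lines.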

Literature Cited

  Brodbelt, J.S. 2015. Ion Activation Methods for Peptides and Proteins. Anal. Chem. 88:30‐51.
  Eng, J.K., McCormack, A.L., and Yates, J.R. 1994. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectrom. 5:976‐989. doi: 10.1016/1044‐0305(94)80016‐2.
  Eng, J.K., Jahan, T.A., and Hoopmann, M.R. 2013. Comet: An open‐source MS/MS sequence database search tool. Proteomics 13:22‐24. doi: 10.1002/pmic.201200439.
  Eng, J.K., Fischer, B., Grossmann, J., and MacCoss, M.J. 2008. A fast SEQUEST cross correlation algorithm. J. Proteome Res. 7:4598‐4602. doi: 10.1021/pr800420s.
  Faherty, B.K. and Gerber, S.A. 2010. MacroSEQUEST: Efficient candidate‐centric searching and high‐resolution correlation analysis for large‐scale proteomics data sets. Anal. Chem. 82:6821‐6829. doi: 10.1021/ac100783x.
  Milloy, J.M., Faherty, B.K., and Gerber, S.A. 2012. Tempest: GPU‐CPU computing for high‐throughput database spectral matching. J. Proteome Res. 11:3581‐3591. doi: 10.1021/pr300338p.

Internet Resources

  https://github.com/mhoopmann/mstoolkit
  MSToolkit file parsing library.
  http://www.uniprot.org/
  UniProt proteome database for FASTA file retrieval.
  http://proteomics.dartmouth.edu/k/software/tempest
  Point‐of‐contact for Tempest files and code.