PatternLab: From Mass Spectra to Label‐Free Differential Shotgun Proteomics

Paulo C. Carvalho¹, Juliana S. G. Fischer¹, Tao Xu², John R. Yates², Valmir C. Barbosa³

¹ Carlos Chagas Institute–Fiocruz, Paraná, Brazil; ² Department of Cell Biology, The Scripps Research Institute, La Jolla, California; ³ Systems Engineering and Computer Science Program, COPPE, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 13.19
DOI:  10.1002/0471250953.bi1319s40
Online Posting Date:  December, 2012
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


PatternLab for proteomics is a self‐contained computational environment for analyzing shotgun proteomic data. Recent improvements add three kinds of modules: modules that facilitate the computational analysis, such as FastaDBXtractor for preparing sequence databases and ProLuCID Runner for simplifying and managing the protein identification search engine; modules that push the limits of proteomics standards, such as SEPro, which relies on a semi‐labeled decoy approach to increase confidence when filtering and organizing peptide spectrum matches (PSMs); and modules with novel features, such as SEProQ, which enables label‐free quantitation from extracted ion chromatograms using distributed normalized ion abundance factors (dNIAFs). Existing modules were also improved, such as the TFold module for pinpointing differentially expressed proteins. The new modules are integrated into the previously described arsenal of tools for further data analysis. Here we provide detailed instructions for operating and understanding them.

Curr. Protoc. Bioinform. 40:13.19.1‐13.19.18. © 2012 by John Wiley & Sons, Inc.

Keywords: semi‐labeled decoy approach; filtering PSMs; dNIAF; quantitative proteomics; protein identification


Table of Contents

  • Introduction
  • Basic Protocol 1: Preparing a Sequence Database to be Searched by ProLuCID or the Academic SEQUEST
  • Basic Protocol 2: Obtaining PSMs with ProLuCID and ProLuCID Runner
  • Basic Protocol 3: Filtering Results with the Search Engine Processor (SEPro)
  • Basic Protocol 4: Quantitating PSMs by dNIAFs with SEProQ
  • Basic Protocol 5: Using Regrouper to Port SEPro Spectral Counting or dNIAF Results to PatternLab
  • Basic Protocol 6: Using the Updated TFold Module for Pinpointing Differentially Expressed Proteins
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
