Using PepExplorer to Filter and Organize De Novo Peptide Sequencing Results

Felipe da Veiga Leprevost1, Valmir C. Barbosa2, Paulo Costa Carvalho3

1 Department of Pathology, University of Michigan, Ann Arbor; 2 Systems Engineering and Computer Science Program, Federal University of Rio de Janeiro, Rio de Janeiro; 3 Computational Mass Spectrometry Group, Carlos Chagas Institute–Fiocruz, Curitiba, Paraná
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 13.27
DOI: 10.1002/0471250953.bi1327s51
Online Posting Date: September 2015


PepExplorer aids the biological interpretation of de novo sequencing results by assembling a list of homologous proteins, obtained by aligning the output of widely adopted de novo sequencing tools against a target‐decoy sequence database. The tool relies on pattern recognition to ensure that the results satisfy a user‐specified false‐discovery rate (FDR): a radial basis function neural network considers precursor charge states, de novo sequencing scores, peptide lengths, and alignment scores. PepExplorer is recommended for studies of organisms with no genomic sequence available. It is integrated into the PatternLab for proteomics environment, which provides tools for downstream data analysis, including resources for quantitative and differential proteomics. © 2015 by John Wiley & Sons, Inc.
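
The target‐decoy idea behind a user‐specified FDR can be illustrated with a minimal sketch. This is not PepExplorer's implementation: the function name `fdr_score_threshold`, the tuple layout, and the use of a single confidence score per alignment (standing in for the output of the neural network) are assumptions made for illustration. The FDR of an accepted set is estimated here as the number of decoy hits divided by the number of target hits.

```python
from typing import List, Optional, Tuple


def fdr_score_threshold(
    hits: List[Tuple[float, bool]], max_fdr: float
) -> Optional[float]:
    """Return the lowest score cutoff whose accepted set satisfies max_fdr.

    hits: (score, is_decoy) pairs; a higher score means a more confident hit.
    The FDR of the set "score >= cutoff" is estimated as decoys / targets.
    Returns None when no cutoff satisfies the requested FDR.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    targets = 0
    decoys = 0
    best: Optional[float] = None
    for i, (score, is_decoy) in enumerate(ranked):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Hits with identical scores are accepted together, so the FDR is
        # only evaluated after the last hit of each tied group.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        if targets > 0 and decoys / targets <= max_fdr:
            best = score
    return best


# Hypothetical alignment results: (score, is_decoy)
example_hits = [
    (0.95, False),
    (0.90, False),
    (0.85, False),
    (0.80, True),
    (0.75, False),
    (0.70, True),
]
cutoff = fdr_score_threshold(example_hits, max_fdr=0.25)
accepted = [h for h in example_hits if cutoff is not None and h[0] >= cutoff]
```

Scanning the whole ranked list, rather than stopping at the first violation, lets the cutoff extend past a local excess of decoys when later target hits bring the estimate back under the limit.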

Keywords: de novo sequencing; mass spectrometry; proteomics


Table of Contents

  • Introduction
  • Basic Protocol 1: Creating and Formatting a Database for PepExplorer Analysis
  • Basic Protocol 2: Adjusting the Parameters
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures


