Using PSEA‐Quant for Protein Set Enrichment Analysis of Quantitative Mass Spectrometry‐Based Proteomics

Mathieu Lavallée‐Adam¹, John R. Yates²

¹ Department of Chemical Physiology, The Scripps Research Institute, La Jolla, California
² Department of Chemical Physiology and Molecular and Cellular Neurobiology, The Scripps Research Institute, La Jolla, California
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 13.28
DOI: 10.1002/0471250953.bi1328s53
Online Posting Date: March 2016
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


PSEA‐Quant analyzes quantitative mass spectrometry–based proteomics datasets to identify enrichments of annotations contained in repositories such as the Gene Ontology and the Molecular Signatures Database (MSigDB). It allows users to identify the annotations that are significantly enriched for reproducibly quantified, high‐abundance proteins. PSEA‐Quant is available on the Web and as a command‐line tool, and it is compatible with all label‐free and isotopic labeling‐based quantitative proteomics methods. This protocol describes how to use PSEA‐Quant and interpret its output. It also discusses the importance of each parameter and approaches to troubleshooting. © 2016 by John Wiley & Sons, Inc.

Keywords: gene set enrichment analysis; gene ontology; quantitative proteomics; functional enrichment analysis; mass spectrometry

Table of Contents

  • Introduction
  • Strategic Planning
  • Basic Protocol 1: Using PSEA‐Quant to Analyze a Quantitative Proteomics Dataset Involving a Single Experimental Condition (Absolute Quantification)
  • Support Protocol 1: Converting Protein Identifiers to Gene Names
  • Basic Protocol 2: Using PSEA‐Quant to Analyze a Quantitative Proteomics Dataset Involving Two Experimental Conditions (Relative Quantification)
  • Alternate Protocol 1: Using PSEA‐Quant on the Command Line
  • Guidelines for Understanding Results
  • Commentary
  • Figures
  • Tables


Literature Cited
  Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., Harris, M.A., Hill, D.P., Issel‐Tarver, L., Kasarskis, A., Lewis, S., Matese, J.C., Richardson, J.E., Ringwald, M., Rubin, G.M., and Sherlock, G. 2000. Gene Ontology: Tool for the unification of biology. Nat. Genet. 25:25‐29. doi: 10.1038/75556.
  Bauer, S., Grossmann, S., Vingron, M., and Robinson, P.N. 2008. Ontologizer 2.0: A multifunctional tool for GO term enrichment analysis and data exploration. Bioinformatics 24:1650‐1651. doi: 10.1093/bioinformatics/btn250.
  Baxevanis, A.D. 2012. Searching Online Mendelian Inheritance in Man (OMIM) for information on genetic loci involved in human disease. Curr. Protoc. Bioinform. 37:1.2.1‐1.2.10.
  Beissbarth, T. and Speed, T. 2004. GOstat: Find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics 20:1464‐1465. doi: 10.1093/bioinformatics/bth088.
  Blake, J.A. and Harris, M.A. 2008. The Gene Ontology (GO) Project: Structured vocabularies for molecular biology and their application to genome and expression analysis. Curr. Protoc. Bioinform. 23:7.2.1‐7.2.9.
  Boersema, P.J., Raijmakers, R., Lemeer, S., Mohammed, S., and Heck, A.J.R. 2009. Multiplex peptide stable isotope dimethyl labeling for quantitative proteomics. Nat. Protoc. 4:484‐494. doi: 10.1038/nprot.2009.21.
  Chelius, D., Zhang, T., Wang, G., and Shen, R.‐F. 2003. Global protein identification and quantification technology using two‐dimensional liquid chromatography nanospray mass spectrometry. Anal. Chem. 75:6658‐6665. doi: 10.1021/ac034607k.
  Cunningham, F., Amode, M.R., Barrell, D., Beal, K., Billis, K., Brent, S., Carvalho‐Silva, D., Clapham, P., Coates, G., Fitzgerald, S., Gil, L., Girón, C.G., Gordon, L., Hourlier, T., Hunt, S.E., Janacek, S.H., Johnson, N., Juettemann, T., Kähäri, A.K., Keenan, S., Martin, F.J., Maurel, T., McLaren, W., Murphy, D.N., Nag, R., Overduin, B., Parker, A., Patricio, M., Perry, E., Pignatelli, M., Riat, H.S., Sheppard, D., Taylor, K., Thormann, A., Vullo, A., Wilder, S.P., Zadissa, A., Aken, B.L., Birney, E., Harrow, J., Kinsella, R., Muffato, M., Ruffier, M., Searle, S.M., Spudich, G., Trevanion, S.J., Yates, A., Zerbino, D.R., and Flicek, P. 2015. Ensembl 2015. Nucleic Acids Res. 43:D662‐D669. doi: 10.1093/nar/gku1010.
  Dennis, G. Jr., Sherman, B.T., Hosack, D.A., Yang, J., Gao, W., Lane, H.C., and Lempicki, R.A. 2003. DAVID: Database for annotation, visualization, and integrated discovery. Genome Biol. 4:P3. doi: 10.1186/gb-2003-4-5-p3.
  Eden, E., Navon, R., Steinfeld, I., Lipson, D., and Yakhini, Z. 2009. GOrilla: A tool for discovery and visualization of enriched GO terms in ranked gene lists. BMC Bioinformatics 10:48. doi: 10.1186/1471-2105-10-48.
  Hamosh, A., Scott, A.F., Amberger, J.S., Bocchini, C.A., and McKusick, V.A. 2005. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 33:D514‐D517. doi: 10.1093/nar/gki033.
  Hsu, J.‐L., Huang, S.‐Y., Chow, N.‐H., and Chen, S.‐H. 2003. Stable‐isotope dimethyl labeling for quantitative proteomics. Anal. Chem. 75:6843‐6852. doi: 10.1021/ac0348625.
  Huang, D.W., Sherman, B.T., Zheng, X., Yang, J., Imamichi, T., Stephens, R., and Lempicki, R.A. 2009. Extracting biological meaning from large gene lists with DAVID. Curr. Protoc. Bioinform. 27:13.11.1‐13.11.13.
  Lavallée‐Adam, M., Park, S.K.R., Martínez‐Bartolomé, S., He, L., and Yates III, J.R. 2015. From raw data to biological discoveries: A computational analysis pipeline for mass spectrometry‐based proteomics. J. Am. Soc. Mass Spectrom. 26:1820‐1826.
  Lavallée‐Adam, M., Rauniyar, N., McClatchy, D.B., and Yates III, J.R. 2014. PSEA‐Quant: A protein set enrichment analysis on label‐free and label‐based protein quantification data. J. Proteome Res. 13:5496‐5509. doi: 10.1021/pr500473n.
  Liberzon, A., Subramanian, A., Pinchback, R., Thorvaldsdóttir, H., Tamayo, P., and Mesirov, J.P. 2011. Molecular signatures database (MSigDB) 3.0. Bioinformatics 27:1739‐1740. doi: 10.1093/bioinformatics/btr260.
  Liu, H., Sadygov, R.G., and Yates, J.R. 2004. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal. Chem. 76:4193‐4201. doi: 10.1021/ac0498563.
  Ong, S.‐E., Blagoev, B., Kratchmarova, I., Kristensen, D.B., Steen, H., Pandey, A., and Mann, M. 2002. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell. Proteomics 1:376‐386. doi: 10.1074/mcp.M200025-MCP200.
  Pankow, S., Bamberger, C., Calzolari, D., Martínez‐Bartolomé, S., Lavallée‐Adam, M., Balch, W.E., and Yates III, J.R. 2015. ΔF508 CFTR interactome remodeling promotes rescue of cystic fibrosis. Nature 528:510‐516. doi: 10.1038/nature15729.
  Radulovic, D., Jelveh, S., Ryu, S., Hamilton, T.G., Foss, E., Mao, Y., and Emili, A. 2004. Informatics platform for global proteomic profiling and biomarker discovery using liquid chromatography‐tandem mass spectrometry. Mol. Cell. Proteomics 3:984‐997. doi: 10.1074/mcp.M400061-MCP200.
  Ross, P.L., Huang, Y.N., Marchese, J.N., Williamson, B., Parker, K., Hattan, S., Khainovski, N., Pillai, S., Dey, S., Daniels, S., Purkayastha, S., Juhasz, P., Martin, S., Bartlet‐Jones, M., He, F., Jacobson, A., and Pappin, D.J. 2004. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine‐reactive isobaric tagging reagents. Mol. Cell. Proteomics 3:1154‐1169. doi: 10.1074/mcp.M400129-MCP200.
  Supek, F., Bošnjak, M., Škunca, N., and Šmuc, T. 2011. REVIGO summarizes and visualizes long lists of gene ontology terms. PLoS One 6:e21800. doi: 10.1371/journal.pone.0021800.
  Thompson, A., Schäfer, J., Kuhn, K., Kienle, S., Schwarz, J., Schmidt, G., Neumann, T., and Hamon, C. 2003. Tandem mass tags: A novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal. Chem. 75:1895‐1904. doi: 10.1021/ac0262560.
  UniProt Consortium. 2014. UniProt: A hub for protein information. Nucleic Acids Res. doi: 10.1093/nar/gku989.
  Washburn, M.P., Ulaszek, R., Deciu, C., Schieltz, D.M., and Yates, J.R. 2002. Analysis of quantitative proteomic data generated via multidimensional protein identification technology. Anal. Chem. 74:1650‐1657. doi: 10.1021/ac015704l.
  Yao, X., Freas, A., Ramirez, J., Demirev, P.A., and Fenselau, C. 2001. Proteolytic 18O labeling for comparative proteomics: Model studies with two serotypes of adenovirus. Anal. Chem. 73:2836‐2842. doi: 10.1021/ac001404c.
  Zhang, Y., Wen, Z., Washburn, M.P., and Florens, L. 2010. Refinements to label free proteome quantitation: How to deal with peptides shared by multiple proteins. Anal. Chem. 82:2272‐2281. doi: 10.1021/ac9023999.
  Zong, N.C., Li, H., Li, H., Lam, M.P.Y., Jimenez, R.C., Kim, C.S., Deng, N., Kim, A.K., Choi, J.H., Zelaya, I., Liem, D., Meyer, D., Odeberg, J., Fang, C., Lu, H.J., Xu, T., Weiss, J., Duan, H., Uhlen, M., Yates, J.R. 3rd, Apweiler, R., Ge, J., Hermjakob, H., and Ping, P. 2013. Integration of cardiac proteome biology and medicine by a specialized knowledgebase. Circ. Res. 113:1043‐1053. doi: 10.1161/CIRCRESAHA.113.301151.
  Zybailov, B., Mosley, A.L., Sardiu, M.E., Coleman, M.K., Florens, L., and Washburn, M.P. 2006. Statistical analysis of membrane proteome expression changes in Saccharomyces cerevisiae. J. Proteome Res. 5:2339‐2347. doi: 10.1021/pr060161n.
Key References
  Lavallée‐Adam et al., 2014. See above.
  Description of PSEA‐Quant's algorithm.
Internet Resources
  PSEA‐Quant Web server.
  PSEA‐Quant example input and output files.
  PSEA‐Quant source code.
  UniProt protein identifier conversion tool.
  Ensembl tool for protein ortholog mapping.
  REVIGO, GO term summarizing tool.

Supplementary Material