Exploring Short Linear Motifs Using the ELM Database and Tools

Marc Gouw1, Hugo Sámano‐Sánchez1, Kim Van Roey1, Francesca Diella1, Toby J. Gibson1, Holger Dinkel2

1 Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, 2 Leibniz‐Institute on Aging—Fritz Lipmann Institute (FLI), Jena
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 8.22
DOI:  10.1002/cpbi.26
Online Posting Date:  June, 2017
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

The Eukaryotic Linear Motif (ELM) resource is dedicated to the characterization and prediction of short linear motifs (SLiMs). SLiMs are compact, degenerate peptide segments found in many proteins and essential to almost all cellular processes. However, despite their abundance, SLiMs remain largely uncharacterized. The ELM database is a collection of manually annotated SLiM instances curated from experimental literature. In this article we illustrate how to browse and search the database for curated SLiM data, and cover the different types of data integrated in the resource. We also cover how to use this resource in order to predict SLiMs in known as well as novel proteins, and how to interpret the results generated by the ELM prediction pipeline. The ELM database is a very rich resource, and in the following protocols we give helpful examples to demonstrate how this knowledge can be used to improve your own research. © 2017 by John Wiley & Sons, Inc.

Keywords: short linear motifs; bioinformatics; protein‐protein interaction; molecular switches; cell regulation

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Exploring the Content of the ELM Database
  • Basic Protocol 2: Exploring the Content of the ELM Database Using the General Search
  • Basic Protocol 3: Detecting Short Linear Motifs in Protein Sequences
  • Basic Protocol 4: Detecting Short Linear Motifs in Novel Protein Sequences
  • Basic Protocol 5: Searching the ELM Database Using the REST API
  • Basic Protocol 6: Detecting Short Linear Motifs in Sequences Using the REST API
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
     
 

Literature Cited

  Berman, H. M., Battistuz, T., Bhat, T. N., Bluhm, W. F., Bourne, P. E., Burkhardt, K., … Zardecki, C. (2002). The protein data bank. Acta Crystallographica Section D Biological Crystallography, 58, 899–907. doi: 10.1107/S0907444902003451.
  Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., … Bourne, P. E. (2000). The protein data bank. Nucleic Acids Research, 28, 235–242. doi: 10.1093/nar/28.1.235.
  Chica, C., Labarga, A., Gould, C. M., López, R., & Gibson, T. J. (2008). A tree‐based conservation scoring method for short linear motifs in multiple alignments of protein sequences. BMC Bioinformatics, 9, 229. doi: 10.1186/1471‐2105‐9‐229.
  Davey, N. E., Cyert, M. S., & Moses, A. M. (2015). Short linear motifs: Ex nihilo evolution of protein regulation. Cell Communication and Signaling, 13, 43. doi: 10.1186/s12964‐015‐0120‐z.
  Davey, N. E., Van Roey, K., Weatheritt, R. J., Toedt, G., Uyar, B., Altenberg, B., … Gibson, T. J. (2012). Attributes of short linear motifs. Molecular BioSystems, 8, 268–281. doi: 10.1039/C1MB05231D.
  de Brito, A. C. F., Carvalho, C. B., Santos, F., Gazzinelli, R. T., Oliveira, S. C., Azevedo, V., & Teixeira, R. S. M. (2004). Chromobacterium violaceum genome: Molecular mechanisms associated with pathogenicity. Genetics and Molecular Research : GMR, 3, 148–161.
  Diella, F. (2008). Understanding eukaryotic linear motifs and their role in cell signaling and regulation. Frontiers in Bioscience, 13, 6580. doi: 10.2741/3175.
  Dinkel, H., Chica, C., Via, A., Gould, C. M., Jensen, L. J., Gibson, T. J., & Diella, F. (2011). Phospho.elm: A database of phosphorylation sites‐update 2011. Nucleic Acids Research, 39, D261–267. doi: 10.1093/nar/gkq1104.
  Dinkel, H., Michael, S., Weatheritt, R. J., Davey, N. E., Van Roey, K., Altenberg, B., … Gibson, T. J. (2012). Elm‐the database of eukaryotic linear motifs. Nucleic Acids Research, 40, D242–251. doi: 10.1093/nar/gkr1064.
  Dinkel, H., Van Roey, K., Michael, S., Davey, N. E., Weatheritt, R. J., Born, D., … Gibson, T. J. (2014). The eukaryotic linear motif resource ELM: 10 years and counting. Nucleic Acids Research, 42, D259‐D266. doi: 10.1093/nar/gkt1047.
  Dodd, D. A., Worth, R. G., Rosen, M. K., Grinstein, S., van Oers, C. N. S., & Hansen, E. J. (2014). The haemophilus ducreyi LspA1 protein inhibits phagocytosis by using a new mechanism involving activation of c‐terminal src kinase. mBio, 5, e01178‐14. doi: 10.1128/mBio.01178‐14.
  Dosztányi, Z., Csizmok, V., Tompa, P., & Simon, I. (2005). Iupred: Web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics (Oxford, England), 21, 3433–3434. doi: 10.1093/bioinformatics/bti541.
  Fielding, R. T., & Taylor, R. N. (2002). Principled design of the modern web architecture. ACM Transactions on Internet Technology, 2, 115–150. doi: 10.1145/514183.514185.
  Finn, R. D., Attwood, T. K., Babbitt, P. C., Bateman, A., Bork, P., Bridge, A. J., … Mitchell, A. L. (2017). InterPro in 2017‐beyond protein family and domain annotations. Nucleic Acids Research, 45, D190‐D199. doi: 10.1093/nar/gkw1107.
  Finn, R. D., Coggill, P., Eberhardt, R. Y., Eddy, S. R., Mistry, J., Mitchell, A. L., … Bateman, A. (2016). The pfam protein families database: Towards a more sustainable future. Nucleic Acids Research, 44, D279–285. doi: 10.1093/nar/gkv1344.
  Furuhashi, M., Kitamura, K., Adachi, M., Miyoshi, T., Wakida, N., Ura, N., … Shimamoto, K. (2005). Liddle's syndrome caused by a novel mutation in the proline‐rich PY motif of the epithelial sodium channel β‐subunit. The Journal of Clinical Endocrinology & Metabolism, 90, 340–344. doi: 10.1210/jc.2004‐1027.
  Gene Ontology Consortium. (2017). Expansion of the gene ontology knowledgebase and resources. Nucleic Acids Research, 45, D331‐D338. doi: 10.1093/nar/gkw1108.
  Gibson, T. J., Dinkel, H., Van Roey, K., & Diella, F. (2015). Experimental detection of short regulatory motifs in eukaryotic proteins: Tips for good practice as well as for bad. Cell Communication and Signaling, 13, 42. doi: 10.1186/s12964‐015‐0121‐y.
  Jehl, P., Manguy, J., Shields, D. C., Higgins, D. G., & Davey, N. E. (2016). Proviz‐a web‐based visualization tool to investigate the functional and evolutionary features of protein sequences. Nucleic Acids Research, 44, W11–15. doi: 10.1093/nar/gkw265.
  Kanehisa, M., Sato, Y., Kawashima, M., Furumichi, M., & Tanabe, M. (2016). Kegg as a reference resource for gene and protein annotation. Nucleic Acids Research, 44, D457–462. doi: 10.1093/nar/gkv1070.
  Kaniga, K., Uralil, J., Bliska, J. B., & Galán, J. E. (1996). A secreted protein tyrosine phosphatase with modular effector domains in the bacterial pathogen Salmonella typhimurium. Molecular Microbiology, 21, 633–641. doi: 10.1111/j.1365‐2958.1996.tb02571.x.
  Kerrien, S., Orchard, S., Montecchi‐Palazzi, L., Aranda, B., Quinn, A. F., Vinod, N., … Hermjakob, H. (2007). Broadening the horizon‐level 2.5 of the hupo‐psi format for molecular interactions. BMC Biology, 5, 44. doi: 10.1186/1741‐7007‐5‐44.
  Kim, J., Kim, I., Yang, J.‐S., Shin, Y.‐E., Hwang, J., Park, S., … Kim, S. (2012). Rewiring of PDZ domain‐ligand interaction network contributed to eukaryotic evolution. PLoS Genetics, 8, e1002510. doi: 10.1371/journal.pgen.1002510.
  Lee, R. V. D., Buljan, M., Lang, B., Weatheritt, R. J., Daughdrill, G. W., Dunker, A. K., … Babu, M. M. (2015). Classification of intrinsically disordered regions and proteins. Progress in Biophysics and Molecular Biology.
  Letunic, I., Doerks, T., & Bork, P. (2015). Smart: Recent updates, new developments and status in 2015. Nucleic Acids Research, 43, D257–260. doi: 10.1093/nar/gku949.
  Linding, R., Russell, R. B., Neduva, V., & Gibson, T. J. (2003). Globplot: Exploring protein sequences for globularity and disorder. Nucleic Acids Research, 31, 3701–3708. doi: 10.1093/nar/gkg519.
  McKusick, V. A. (2007). Mendelian inheritance in man and its online version, omim. American Journal of Human Genetics, 80, 588–604. doi: 10.1086/514346.
  NCBI Resource Coordinators. (2017). Database resources of the national center for biotechnology information. Nucleic Acids Research, 45, D12‐D17. doi: 10.1093/nar/gkw1071.
  Schultz, J., Milpetz, F., Bork, P., & Ponting, C. P. (1998). Smart, a simple modular architecture research tool: Identification of signaling domains. Proceedings of the National Academy of Sciences of the United States of America, 95, 5857–5864. doi: 10.1073/pnas.95.11.5857.
  Selbach, M., Paul, F. E., Brandt, S., Guye, P., Daumke, O., Backert, S., … Mann, M. (2009). Host cell interactome of tyrosine‐phosphorylated bacterial proteins. Cell Host & Microbe, 5, 397–403. doi: 10.1016/j.chom.2009.03.004.
  Sievers, F., Wilm, A., Dineen, D., Gibson, T. J., Karplus, K., Li, W., … Higgins, D. G. (2011). Fast, scalable generation of high‐quality protein multiple sequence alignments using clustal omega. Molecular Systems Biology, 7, 539. doi: 10.1038/msb.2011.75.
  Suzek, B. E., Huang, H., McGarvey, P., Mazumder, R., & Wu, C. H. (2007). Uniref: Comprehensive and non‐redundant uniprot reference clusters. Bioinformatics (Oxford, England), 23, 1282–1288. doi: 10.1093/bioinformatics/btm098.
  Tompa, P., Davey, N. E., Gibson, T. J., & Babu, M. M. (2014). A million peptide motifs for the molecular biologist. Molecular Cell, 55, 161–169. doi: 10.1016/j.molcel.2014.05.032.
  Tsutsumi, R. (2003). Attenuation of helicobacter pylori CagA·SHP‐2 signaling by interaction between CagA and c‐terminal src kinase. Journal of Biological Chemistry, 278, 3664–3670. doi: 10.1074/jbc.M208155200.
  UniProt Consortium. (2015). Uniprot: A hub for protein information. Nucleic Acids Research, 43, D204–212. doi: 10.1093/nar/gku989.
  Van Roey, K., Dinkel, H., Weatheritt, R. J., Gibson, T. J., & Davey, N. E. (2013). The switches.elm resource: A compendium of conditional regulatory interaction interfaces. Science Signaling, 6, rs7. doi: 10.1126/scisignal.2003345.
  Van Roey, K., Gibson, T. J., & Davey, N. E. (2012). Motif switches: Decision‐making in cell regulation. Current Opinion in Structural Biology, 22, 378–385. doi: 10.1016/j.sbi.2012.03.004.
  Van Roey, K., Uyar, B., Weatheritt, R. J., Dinkel, H., Seiler, M., Budd, A., … Davey, N. E. (2014). Short linear motifs: Ubiquitous and functionally diverse protein interaction modules directing cell regulation. Chemical Reviews, 114, 6733–6778. doi: 10.1021/cr400585q.
  Via, A., Gould, C. M., Gemünd, C., Gibson, T. J., & Helmer‐Citterich, M. (2009). A structure filter for the eukaryotic linear motif resource. BMC Bioinformatics, 10, 351. doi: 10.1186/1471‐2105‐10‐351.
  Via, A., Uyar, B., Brun, C., & Zanzoni, A. (2015). How pathogens use linear motifs to perturb host cell networks. Trends in Biochemical Sciences, 40, 36–48. doi: 10.1016/j.tibs.2014.11.001.
  Waterhouse, A. M., Procter, J. B., Martin, D. M. A., Clamp, M., & Barton, G. J. (2009). Jalview version 2‐a multiple sequence alignment editor and analysis workbench. Bioinformatics (Oxford, England), 25, 1189–1191. doi: 10.1093/bioinformatics/btp033.
  Wright, P. E., & Dyson, H. J. (1999). Intrinsically unstructured proteins: Re‐assessing the protein structure‐function paradigm. Journal of Molecular Biology, 293, 321–331. doi: 10.1006/jmbi.1999.3110.
  Zhang, Z., Schäffer, A. A., Miller, W., Madden, T. L., Lipman, D. J., Koonin, E. V., & Altschul, S. F. (1998). Protein sequence similarity searches using patterns as seeds. Nucleic Acids Research, 26, 3986–3990. doi: 10.1093/nar/26.17.3986.
Key References
  Dinkel, H., Van Roey, K., Michael, S., Kumar, M., Uyar, B., Altenberg, B., … Gibson, T. J. (2016). ELM 2016‐data update and new functionality of the eukaryotic linear motif resource. Nucleic Acids Research, 44, D294–300. doi: 10.1093/nar/gkv1291.
  This is the latest publication on the ELM database highlighting the newest features.
  Gibson et al., 2015. See above.
  This guide is meant for experimentalists working on detecting/validating short linear motif instances.
  Davey et al., 2012. See above.
  This review summarizes the biochemical properties of short linear motifs.
  Van Roey et al., 2014. See above.
  Comprehensive review about short linear motifs with extensive biological examples.
Internet Resources
  http://www.clustal.org/omega
  Clustal Omega (Sievers et al.) is a tool for the alignment of multiple nucleic acid and protein sequences.
  http://www.jalview.org
  Jalview (Waterhouse et al.) is a Java desktop application (and browser applet) that employs Web services for sequence alignment and visualization.
  http://proviz.ucd.ie
  ProViz (Jehl, Manguy, Shields, Higgins, & Davey) is an interactive protein exploration tool that searches several databases for information about a given query protein. Data relevant to the protein, such as an alignment of homologs, linear motifs, post‐translational modifications, domains, secondary structures, and sequence variations, are graphically represented relative to their position in the protein.