Minimotif Miner: A Computational Tool to Investigate Protein Function, Disease, and Genetic Diversity

Martin R. Schiller1

1 University of Connecticut Health Center, Farmington, Connecticut
Publication Name:  Current Protocols in Protein Science
Unit Number:  Unit 2.12
DOI:  10.1002/0471140864.ps0212s48
Online Posting Date:  May, 2007
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


The Minimotif Miner Web site contains information on several hundred short functional motifs in a single database, and allows the user to search protein queries for the presence of these motifs. Scoring based on evolutionary conservation, protein surface prediction, and motif frequency can be used in conjunction with other motif programs and the known biology of the query to reduce false‐positive predictions and select short motifs for experimental pursuit.
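
As a rough illustration of what searching a protein query for short motifs involves, the sketch below scans a sequence with regular expressions. It is not the Minimotif Miner algorithm and does not query its database: the three patterns are generic textbook examples chosen for illustration, the query sequence is made up, and the function name `find_motifs` is hypothetical. None of the scoring described above (conservation, surface prediction, motif frequency) is attempted.

```python
import re

# Illustrative patterns only; NOT taken from the Minimotif Miner database.
MOTIFS = {
    "SH3-binding core (PxxP)": r"P..P",
    "N-glycosylation (N-{P}-[ST]-{P})": r"N[^P][ST][^P]",
    "ER retrieval (C-terminal [KH]DEL)": r"[KH]DEL$",
}


def find_motifs(sequence, motifs=MOTIFS):
    """Return (name, 1-based start, matched text) for every match, overlaps included."""
    seq = sequence.upper()
    hits = []
    for name, pattern in motifs.items():
        # Wrapping the pattern in a lookahead makes finditer report
        # overlapping occurrences, which short motifs often produce.
        for m in re.finditer(f"(?=({pattern}))", seq):
            hits.append((name, m.start() + 1, m.group(1)))
    # Order hits by position along the sequence.
    return sorted(hits, key=lambda h: h[1])


if __name__ == "__main__":
    query = "MPPLPPRNGSAKDEL"  # made-up sequence, for demonstration only
    for name, start, text in find_motifs(query):
        print(f"{start:>3}  {text:<6} {name}")
```

Even this toy query yields several hits, which suggests why short patterns match often by chance and why additional scoring and knowledge of the query's biology are needed to rank candidates.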

Keywords: motif; minimotif; proteome; evolution; disease

Table of Contents

  • Mapping and Ranking Motifs with Minimotif Miner
  • Interpretation and Selecting Motifs
  • Mapping Motifs onto Protein Surfaces
  • Using Minimotif Miner to Investigate Disease and Genetic Diversity
  • Discussion
  • Acknowledgements
  • Literature Cited
  • Figures
  • Tables

Literature Cited

   Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI‐BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25:3389‐3402.
   Balla, S., Thapar, V., Luong, T., Faghri, T., Huang, C.H., Rajasekaran, S., del Campo, J.J., Shin, J.H., Mohler, W.A., Maciejewski, M.W., Gryk, M., Piccirillo, B., Schiller, S.R., and Schiller, M.R. 2006. Minimotif Miner, a tool for investigating protein function. Nat. Methods 3:175‐177.
   Bateman, A., Coin, L., Durbin, R., Finn, R.D., Hollich, V., Griffiths‐Jones, S., Khanna, A., Marshall, M., Moxon, S., Sonnhammer, E.L., Studholme, D.J., Yeats, C., and Eddy, S.R. 2004. The Pfam protein families database. Nucleic Acids Res. 32:D138‐D141.
   Claustres, M., Horaitis, O., Vanevski, M., and Cotton, R.G.H. 2002. Time for a unified system of mutation description and reporting: A review of locus‐specific mutation databases. Genome Res. 12:680‐688.
   Cooper, D.N., Ball, E.V., and Krawczak, M. 1998. The human gene mutation database. Nucleic Acids Res. 26:285‐287.
   Cowan‐Jacob, S.W., Fendrich, G., Manley, P.W., Jahnke, W., Fabbro, D., Liebetanz, J., and Meyer, T. 2005. The crystal structure of a c‐Src complex in an active conformation suggests possible steps in c‐Src activation. Structure 13:861‐871.
   Cruts, M., Hendricks, L., and Broeckhoven, C. 1996. The Presenilin genes: A new gene family involved in Alzheimer's disease pathology. Hum. Mol. Genet. 5:1449‐1455.
   Davis, R.J. 1995. Transcriptional regulation by MAP kinases. Mol. Reprod. Dev. 42:459‐467.
   Delano, W.L. 2004. Use of PYMOL as a communications tool for molecular science. Abstr. Pap. Am. Chem. Soc. 228:U313‐U314.
   Falquet, L., Pagni, M., Bucher, P., Hulo, N., Sigrist, C.J., Hofmann, K., and Bairoch, A. 2002. The PROSITE database: Its status in 2002. Nucleic Acids Res. 30:235‐238.
   Koradi, R., Billeter, M., and Wuthrich, K. 1996. MOLMOL: A program for display and analysis of macromolecular structures. J. Mol. Graph. 14:29‐32; 51‐55.
   Lerner, E.C. and Smithgall, T.E. 2002. SH3‐dependent stimulation of Src‐family kinase autophosphorylation without tail release from the SH2 domain in vivo. Nat. Struct. Biol. 9:365‐369.
   Letunic, I., Goodstadt, L., Dickens, N.J., Doerks, T., Schultz, J., Mott, R., Ciccarelli, F., Copley, R.R., Ponting, C.P., and Bork, P. 2002. Recent improvements to the SMART domain‐based sequence annotation resource. Nucleic Acids Res. 30:242‐244.
   Maignan, S., Guilloteau, J.P., Fromage, N., Arnoux, B., Becquart, J., and Ducruix, A. 1995. Crystal structure of the mammalian Grb2 adaptor. Science 268:291‐293.
   Marchler‐Bauer, A., Anderson, J.B., DeWeese‐Scott, C., Fedorova, N.D., Geer, L.Y., He, S., Hurwitz, D.I., Jackson, J.D., Jacobs, A.R., Lanczycki, C.J., Liebert, C.A., Liu, C., Madej, T., Marchler, G.H., Mazumder, R., Nikolskaya, A.N., Panchenko, A.R., Rao, B.S., Shoemaker, B.A., Simonyan, V., Song, J.S., Thiessen, P.A., Vasudevan, S., Wang, Y., Yamashita, R.A., Yin, J.J., and Bryant, S.H. 2003. CDD: A curated Entrez database of conserved domain alignments. Nucleic Acids Res. 31:383‐387.
   Obenauer, J.C., Cantley, L.C., and Yaffe, M.B. 2003. Scansite 2.0: Proteome‐wide prediction of cell signaling interactions using short sequence motifs. Nucleic Acids Res. 31:3635‐3641.
   Pagni, M., Iseli, C., Junier, T., Falquet, L., Jongeneel, V., and Bucher, P. 2001. trEST, trGEN and Hits: Access to databases of predicted protein sequences. Nucleic Acids Res. 29:148‐151.
   Pelham, H.R.B. 1991. Recycling of proteins between the endoplasmic reticulum and the Golgi complex. Curr. Opin. Cell Biol. 3:585‐591.
   Pruitt, K.D. and Maglott, D.R. 2001. RefSeq and LocusLink: NCBI gene‐centered resources. Nucleic Acids Res. 29:137‐140.
   Puntervoll, P., Linding, R., Gemund, C., Chabanis‐Davidson, S., Mattingsdal, M., Cameron, S., Martin, D.M., Ausiello, G., Brannetti, B., Costantini, A., Ferre, F., Maselli, V., Via, A., Cesareni, G., Diella, F., Superti‐Furga, G., Wyrwicz, L., Ramu, C., McGuigan, C., Gudavalli, R., Letunic, I., Bork, P., Rychlewski, L., Kuster, B., Helmer‐Citterich, M., Hunter, W.N., Aasland, R., and Gibson, T.J. 2003. ELM server: A new resource for investigating short functional sites in modular eukaryotic proteins. Nucleic Acids Res. 31:3625‐3630.
   Schneider, M., Tognolli, M., and Bairoch, A. 2004. The Swiss‐Prot protein knowledgebase and ExPASy: Providing the plant community with high quality proteomic data and tools. Plant Physiol. Biochem. 42:1013‐1021.
   Smith, R.F., Wiese, B.A., Wojzynski, M.K., Davison, D.B., and Worley, K.C. 1996. BCM Search Launcher: An integrated interface to molecular biology data base search and analysis services available on the World Wide Web. Genome Res. 6:454‐462.
   Teufel, A., Krupp, M., Weinmann, A., and Galle, P.R. 2006. Current bioinformatics tools in genomic biomedical research. Int. J. Mol. Med. 17:967‐973.
   Thorson, J.A., Yu, L.W.K., Hsu, A.L., Shih, N.Y., Graves, P.R., Tanner, J.W., Allen, P.M., Piwnica‐Worms, H., and Shaw, A.S. 1998. 14‐3‐3 proteins are required for maintenance of Raf‐1 phosphorylation and kinase activity. Mol. Cell. Biol. 18:5229‐5238.
   Townsley, F.M., Frigerio, G., and Pelham, H.R.B. 1994. Retrieval of HDEL proteins is required for growth of yeast cells. J. Cell Biol. 127:21‐28.
   Wang, Z. and Moran, M.F. 1996. Requirement for the adapter protein Grb2 in EGF receptor endocytosis. Science 272:1935‐1939.
   Wheeler, D.L., Chappey, C., Lash, A.E., Leipe, D.D., Madden, T.L., Schuler, G.D., Tatusova, T.A., and Rapp, B.A. 2000. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 28:10‐14.
   Yoon, S.Y., Koh, W.S., Lee, M.K., Park, Y.M., and Han, M.Y. 1997. Dynamin II associates with Grb2 SH3 domain in Ras transformed NIH3T3 cells. Biochem. Biophys. Res. Comm. 234:539‐543.
Key References
   Balla et al., 2006. See above.
  Describes the MnM database, the algorithms used in scoring, and the methods used to validate the scoring approaches.
Internet Resources
  Minimotif Miner (MnM): Online Web site for identifying short functional motifs.
  PyMOL: Shareware software for viewing motifs in protein structures.
  MolMol: Downloadable software for viewing protein structures.
  Protein Data Bank (PDB): Repository of macromolecular structures.
  McKusick, V.A. 2000. Online Mendelian Inheritance in Man (OMIM). McKusick‐Nathans Institute for Genetic Medicine, Johns Hopkins University (Baltimore, Md.) and National Center for Biotechnology Information, National Library of Medicine (Bethesda, Md.).
  ELM: Online Web site for identifying short functional motifs.
  myHITS (pattern search): Searches several sequence databases for motifs.
  BLAST: Searches proteins for short, nearly exact matches.
  SCANSITE: Online Web site for identifying short functional motifs.
  ExPASy: Contains a collection of tools for prediction of specific motifs.
  HGMD: The Human Gene Mutation Database, which contains a large collection of mutations known to be associated with human disease.