Introduction to Cheminformatics

David S. Wishart¹

¹ Departments of Computing Science and Biological Sciences, University of Alberta, Edmonton, Alberta
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 14.1
DOI: 10.1002/0471250953.bi1401s53
Online Posting Date: March, 2016

Abstract

Cheminformatics is a field of information technology that focuses on the collection, storage, analysis, and manipulation of chemical data. The chemical data of interest typically include information on small-molecule formulas, structures, properties, spectra, and activities (biological or industrial). Cheminformatics originally emerged as a vehicle to help the drug discovery and development process; however, it now plays an increasingly important role in many areas of biology, chemistry, and biochemistry. The intent of this unit is to introduce readers to the field of cheminformatics and to show how cheminformatics not only shares many similarities with the field of bioinformatics, but also enhances much of what is currently done in bioinformatics, molecular biology, and biochemistry. © 2016 by John Wiley & Sons, Inc.

Keywords: cheminformatics; bioinformatics; metabolomics; drug; chemical

     
 

Table of Contents

  • Introduction
  • The Intersection Between Cheminformatics and Bioinformatics
  • Cheminformatic Data Formats
  • Cheminformatics Utilities
  • Databases in Cheminformatics
  • Predictive Tools in Cheminformatics
  • Analytical Tools in Cheminformatics and Metabolomics
  • Conclusion
  • Acknowledgements
  • Literature Cited
  • Tables
     
 
