Using MSDchem to Search the PDB Ligand Dictionary

Dimitris Dimitropoulos1, John Ionides1, Kim Henrick1

1 European Bioinformatics Institute, Hinxton, Cambridgeshire
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 14.3
DOI:  10.1002/0471250953.bi1403s15
Online Posting Date:  October, 2006
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


The PDB ligand dictionary is the chemical reference database of all the small building‐block molecules (e.g., amino acids, nucleic acids, and bound ligands) in the Protein Data Bank (PDB), each referenced by a distinct three‐letter code identifier. Since PDB files contain only three‐dimensional coordinate data, the role of the dictionary is that of a reference resource for the actual chemical properties of small molecules, shared consistently across all PDB entries. The ligand dictionary is maintained at all sites of the Worldwide Protein Data Bank (wwPDB): the Research Collaboratory for Structural Bioinformatics (RCSB) in the U.S., the Macromolecular Structure Database (MSD) in Europe, and the Protein Data Bank in Japan (PDBj); it is exchanged among them on a regular basis. The MSD group at the European Bioinformatics Institute (EBI) extends the dictionary into the MSDchem ligand database, which utilizes chemo‐informatics packages and incorporates additional curation work. MSDchem is publicly available on the Web through the MSDchem search system, the functionality of which is described in detail in this unit.

Keywords: ligands; organic chemicals; chemical structure; chemical properties; protein structure databases; macromolecular complexes; amino acids; nucleic acids


Table of Contents

  • Basic Protocol 1: Searching for Ligands Using the Three‐Letter PDB Code or Molecular Name
  • Basic Protocol 2: Searching for Ligands Using a Formula or Fragment Expression
  • Basic Protocol 3: Performing a Chemical Subgraph Search
  • Basic Protocol 4: Exporting the Ligand Dictionary
  • Commentary
  • Literature Cited
  • Figures





Literature Cited

   Berman, H., Nakamura, H., and Henrick, K. 2005. The Protein Data Bank (PDB) and the WorldWide PDB. In Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics. Section 4.6. (M. Dunn, L. Jorde, P. Little, and S. Subramaniam, eds.) John Wiley & Sons, Hoboken, N.J.
   Bernstein, F.C., Koetzle, T.F., Williams, G.J.B., Meyer, E.F. Jr., Brice, M.D., Rodgers, J.R., Kennard, O., Shimanouchi, T., and Tasumi, M. 1977. Protein Data Bank: A computer‐based archival file for macromolecular structures. J. Mol. Biol. 112:535‐542.
   Boutselakis, H., Dimitropoulos, D., Henrick, K., Ionides, J., John, M., Keller, P.A., McNeil, P., Pineda, J., and Suarez‐Uruena, A. 2004. The European Bioinformatics Institute macromolecular structure relational database technology. In Database Annotation in Molecular Biology. pp. 223‐240. John Wiley & Sons, Hoboken, N.J.
   Gasteiger, J., Rudolph, C., and Sadowski, J. 1990. Automatic generation of 3D‐atomic coordinates for organic molecules. Tetrahedron Comp. Method. 3:537‐547.
   Golovin, A., Oldfield, T.J., Tate, J.G., Velankar, S., Barton, G.J., Boutselakis, H., Dimitropoulos, D., Fillon, J., Hussain, A., Ionides, J.M.C., John, M., Keller, P.A., Krissinel, E., McNeil, P., Naim, A., Newman, R., Pajon, A., Pineda, J., Rachedi, A., Copeland, J., Sitnov, A., Sobhany, S., Suarez‐Uruena, A., Swaminathan, J., Tagari, M., Tromm, S., Vranken, W., and Henrick, K. 2004. E‐MSD: An integrated data resource for bioinformatics. Nucl. Acids Res. 32:D211‐D216.
   Golovin, A., Dimitropoulos, D., Oldfield, T., Rachedi, A., and Henrick, K. 2005. MSDsite: A database search and retrieval system for the analysis and viewing of bound ligands and active sites. Proteins 58:190‐199.
   Ihlenfeldt, W.D., Takahashi, Y., Abe, H., and Sasaki, S. 1992. CACTVS: A chemistry algorithm development environment. In Daijuukagakutouronkai Dainijuukai Kouzoukasseisoukan Shinpojiumu Kouenyoushishuu (K. Machida and T. Nishioka, eds.) pp. 102‐105. Kyoto University Press, Kyoto, Japan.
   Krissinel, E.B., Winn, M.D., Ballard, C.C., Ashton, A.W., Patel, P., Potterton, E.A., McNicholas, S.J., Cowtan, K.D., and Emsley, P. 2004. The new CCP4 Coordinate Library as a toolkit for the design of coordinate‐related applications in protein crystallography. Acta Crystallogr. D Biol. Crystallogr. 60:2250‐2255.
   Weininger, D. 1988. SMILES 1. Introduction and encoding rules. J. Chem. Inf. Comput. Sci. 28:31.
   Westbrook, J.D., Henrick, K., Ulrich, E., and Berman, H.M. 2005. Classification and use of macromolecular data. Appendix 3.6.2. The Protein Data Bank exchange dictionary. In International Tables for Crystallography, Vol. G: Definition and Exchange of Crystallographic Data (S. Hall and B. McMahon, eds.) pp. 195‐197. Springer, Dordrecht, The Netherlands.
Key References
   Berman et al., 2005. See above.
  A description of the wwPDB consortium, its organization, and goals.
   Dutta, S., Burkhardt, K., Bluhm, W.F., and Berman, H.M. 2006. Using the tools and resources of the RCSB Protein Data Bank. In Current Protocols in Bioinformatics (A.D. Baxevanis, R.D.M. Page, G.A. Petsko, L.D. Stein, and G.D. Stormo, eds.) pp. 1.9.1‐1.9.40. John Wiley & Sons, Hoboken, N.J.
  Explains various concepts about the PDB, the wwPDB, and tools that are provided by the RCSB partner, as well as the corresponding Ligand Depot service databases and suite of Web tools.
   Golovin et al., 2004. See above.
  A consistent overview of the activities and policies of the MSD group at EBI and of the concepts of the MSD.
   Westbrook et al., 2005. See above.
  A description of the process of the wwPDB exchange, which is the basis of the MSDchem database.
Internet Resources
   ‐srv/msdchem
  The MSDchem search home page.
  Contains information about the MSD group and the MSD suite of tools and services.
   ‐srv/msdlite
  The MSDlite search system provides overview atlas pages for PDB entries, using the MSD database.
   ‐srv/msdsite
  The MSDsite Web service that provides details about ligand occurrences and binding sites of small molecules in PDB entries.
   ‐srv/docs/dbdoc
  Contains information about the MSDSD public search relational database and how to download and use it.
   ‐srv/docs/moldoc/help.html
  The molecule subgraph containment package used by the MSDchem search system.
   ‐component‐erf.cif
  The Chemical Component Information dictionary that is exchanged in wwPDB.
  The CACTVS chemistry algorithm development environment, the main software package used by the MSDchem database and Web service.
  The CORINA Web service for fast and efficient generation of high‐quality 3‐D molecular models, used to generate idealized coordinates for ligands.
  The home page of the JME Molecular Editor Java applet used by the MSDchem Web service.
  The home page of Jmol, the free, open‐source 3‐D molecule viewer used by the MSDchem Web service.
  Information about the definition of the popular MDL CTfile formats.
  The ACD‐labs chemical software package used at the time of curation of new ligands.
   ∼ddl/vega/index_noanim.htm
  The VEGA molecular modeling software package used in the back‐end of the MSDchem database.