Using the Tools and Resources of the RCSB Protein Data Bank

Shuchismita Dutta1, Helen M. Berman1, Wolfgang F. Bluhm2

1 Rutgers University, Piscataway, New Jersey, 2 University of California San Diego, La Jolla, California
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 1.9
DOI:  10.1002/0471250953.bi0109s20
Online Posting Date:  December, 2007
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


The Protein Data Bank (PDB) is the worldwide repository for three‐dimensional structural data determined using various experimental methods. The options and procedures for searching and downloading structural data from the Research Collaboratory for Structural Bioinformatics (RCSB) PDB are described here, along with tools for assessing the quality of structures. Several types of information are associated with each structure deposition, including atomic coordinates of the structure, experimental data used to solve it, sequences of all macromolecules in the structure, details about the structure solution method, images showing different views of the structure, derived geometric data, and a variety of links to other resources. These data and resources may be used for understanding the function and stability of the molecule and for designing biochemical, genetic, or other experiments. They can also be used for molecular modeling and drug design. Curr. Protoc. Bioinform. 20:1.9.1‐1.9.24. © 2007 by John Wiley & Sons, Inc.

Keywords: query; validation; macromolecular structures; proteins; nucleic acids

Table of Contents

  • Introduction
  • Searching the PDB for a Specific Structure or Group of Structures
  • Basic Protocol 1: Searching from the RCSB PDB Home Page
  • Basic Protocol 2: The Structure Summary Page, Query by Example, and Downloading Structural Data for a Single PDB Entry
  • Basic Protocol 3: Searching the PDB Using the “Advanced Search” Interface
  • Basic Protocol 4: Refining or Modifying a Search from the Query Results Browser
  • Basic Protocol 5: Searching the PDB for Unreleased Structures
  • Basic Protocol 6: Browsing the PDB Using Tree Browsers
  • Downloading Structural Data from the PDB
  • Basic Protocol 7: Web Downloading of Structural Data from the PDB
  • Basic Protocol 8: Downloading Structural Data from the PDB via FTP
  • Searching for a Specific or Class of Chemical Components Present in the PDB
  • Basic Protocol 9: Searching the PDB for a Chemical Component Using the “Advanced Search” Interface
  • Basic Protocol 10: Searching the PDB for a Chemical Component Using a Chemical Drawing Interface
  • Basic Protocol 11: Assessing the Quality of a Macromolecular Structure Using the Validation Server
  • Commentary
  • Literature Cited
  • Figures

Literature Cited

   Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E. 2000. The Protein Data Bank. Nucl. Acids Res. 28:235‐242.
   Berman, H.M., Henrick, K., and Nakamura, H. 2003. Announcing the worldwide Protein Data Bank. Nat. Struct. Biol. 10:980.
   Berman, H.M., Burley, S.K., Chiu, W., Sali, A., Adzhubei, A., Bourne, P.E., Bryant, S.H., Roland, J., Dunbrack, L., Fidelis, K., Frank, J., Godzik, A., Henrick, K., Joachimiak, A., Heymann, B., Jones, D., Markley, J.L., Moult, J., Montelione, G.T., Orengo, C., Rossmann, M.G., Rost, B., Saibil, H., Schwede, T., Standley, D.M., and Westbrook, J.D. 2006. Outcome of a workshop on archiving structural models of biological macromolecules. Structure 14:1211‐1217.
   Bernstein, F.C., Koetzle, T.F., Williams, G.J.B., Meyer, E.F. Jr., Brice, M.D., Rodgers, J.R., Kennard, O., Shimanouchi, T., and Tasumi, M. 1977. Protein Data Bank: A computer‐based archival file for macromolecular structures. J. Mol. Biol. 112:535‐542.
   Bourne, P.E., Berman, H.M., Watenpaugh, K., Westbrook, J.D., and Fitzgerald, P.M.D. 1997. The macromolecular Crystallographic Information File (mmCIF). Meth. Enzymol. 277:571‐590.
   Callaway, J., Cummings, M., Deroski, B., Esposito, P., Forman, A., Langdon, P., Libeson, M., McCarthy, J., Sikora, J., Xue, D., Abola, E., Bernstein, F., Manning, N., Shea, R., Stampf, D., and Sussman, J. 1996. Protein Data Bank contents guide: Atomic coordinate entry format description. Brookhaven National Laboratory, Brookhaven, N.Y.
   Chen, S., Vojtechovsky, J., Parkinson, G.N., Ebright, R.H., and Berman, H.M. 2001. Indirect readout of DNA sequence at the primary‐kink site in the CAP‐DNA complex: DNA binding specificity based on energetics of DNA kinking. J. Mol. Biol. 314:63‐74.
   Clowney, L., Jain, S.C., Srinivasan, A.R., Westbrook, J., Olson, W.K., and Berman, H.M. 1996. Geometric parameters in nucleic acids: Nitrogenous bases. J. Am. Chem. Soc. 118:509‐518.
   Conte, L.L., Brenner, S.E., Hubbard, T.J., Chothia, C., and Murzin, A.G. 2002. SCOP database in 2002: Refinements accommodate Structural Genomics. Nucl. Acids Res. 30:264‐267.
   Deshpande, N., Addess, K.J., Bluhm, W.F., Merino‐Ott, J.C., Townsend‐Merino, W., Zhang, Q., Knezevich, C., Xie, L., Chen, L., Feng, Z., Kramer Green, R., Flippen‐Anderson, J.L., Westbrook, J., Berman, H.M., and Bourne, P.E. 2005. The RCSB Protein Data Bank: A redesigned query system and relational database based on the mmCIF schema. Nucl. Acids Res. 33:D233‐D237.
   Engh, R.A. and Huber, R. 2006. Structure quality and target parameters. In International Tables for Crystallography (S.R. Hall and B. McMahon, eds.), Vol. F, ch. 18.3, pp. 382‐392. Springer, New York.
   Feng, Z., Westbrook, J., and Berman, H.M. 1998. NUCheck. Rutgers University, New Brunswick, N.J.
   Feng, Z., Chen, L., Maddula, H., Akcan, O., Oughtred, R., Berman, H.M., and Westbrook, J. 2004. Ligand depot: A data warehouse for ligands bound to macromolecules. Bioinformatics 20:2153‐2155.
   Fitzgerald, P.M.D., Westbrook, J.D., Bourne, P.E., McMahon, B., Watenpaugh, K.D., and Berman, H.M. 2005. The Macromolecular dictionary (mmCIF). In International Tables for Crystallography, Volume G. Definition and exchange of crystallographic data (S.R. Hall and B. McMahon, eds.) pp. 295‐443. Springer, Dordrecht, The Netherlands.
   Gelbin, A., Schneider, B., Clowney, L., Hsieh, S.‐H., Olson, W.K., and Berman, H.M. 1996. Geometric parameters in nucleic acids: Sugar and phosphate constituents. J. Am. Chem. Soc. 118:519‐528.
   The Gene Ontology Consortium. 2000. Gene ontology: Tool for the unification of biology. Nat. Genet. 25:25‐29.
   Hempstead, P.D., Yewdall, S.J., Fernie, A.R., Lawson, D.M., Artymiuk, P.J., Rice, D.W., Ford, G.C., and Harrison, P.M. 1997. Comparison of the three‐dimensional structures of recombinant human H and horse L ferritins at high resolution. J. Mol. Biol. 268:424‐448.
   International Union of Crystallography. 1989. Policy on publication and the deposition of data from crystallographic studies of biological macromolecules. Acta Crystallogr. A A45:658.
   Kleywegt, G.J. and Jones, T.A. 1996. PHI/PSI‐chology: Ramachandran revisited. Structure 4:1395‐1400.
   Laskowski, R.A., MacArthur, M.W., Moss, D.S., and Thornton, J.M. 1993. PROCHECK: A program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26:283‐291.
   Liébecq, C., Ed. 1992. Biochemical nomenclature and related documents: A compendium prepared for the committee of Editors of Biochemical Journals. Portland Press, Chapel Hill, N.C.
   Lovell, S.C., Davis, I.W., Arendall, W.B., 3rd, de Bakker, P.I., Word, J.M., Prisant, M.G., Richardson, J.S., and Richardson, D.C. 2003. Structure validation by Calpha geometry: Phi, psi and Cbeta deviation. Proteins 50:437‐450.
   Markley, J.L., Bax, A., Arata, Y., Hilbers, C.W., Kaptein, R., Sykes, B.D., Wright, P.E., and Wuthrich, K. 1998. Recommendations for the presentation of NMR structures of proteins and nucleic acids. IUPAC‐IUBMB‐IUPAB Inter‐Union Task Group on the standardization of data bases of protein and nucleic acid structures determined by NMR spectroscopy. J. Biomol. NMR 12:1‐23.
   Orengo, C.A., Michie, A.D., Jones, S., Jones, D.T., Swindells, M.B., and Thornton, J.M. 1997. CATH: A hierarchic classification of protein domain structures. Structure 5:1093‐1108.
   Sayle, R. and Milner‐White, E.J. 1995. RasMol: Biomolecular graphics for all. Trends Biochem. Sci. 20:374.
   Vaguine, A.A., Richelle, J., and Wodak, S.J. 1999. SFCHECK: A unified set of procedures for evaluating the quality of macromolecular structure‐factor data and their agreement with the atomic model. Acta Crystallogr. D D55:191‐205.
   Westbrook, J., Feng, Z., Burkhardt, K., and Berman, H.M. 2003. Validation of protein structures for the Protein Data Bank. Meth. Enzymol. 374:370‐385.
   Westbrook, J., Ito, N., Nakamura, H., Henrick, K., and Berman, H.M. 2005. PDBML: The representation of archival macromolecular structure data in XML. Bioinformatics 21:988‐992.
Key References
   Berman et al., 2000. See above.
  This paper is the original reference for the RCSB PDB and details the goals of the PDB and the systems in place for data deposition and access.
   Fitzgerald et al., 2005. See above.
  This is the latest reference for the mmCIF format, a standard representation for macromolecular structure data derived from various experimental methods.
   Westbrook et al., 2003. See above.
  This reference details the validation procedure used for assessing the quality of structural data using the RCSB Validation Server.
Internet Resources
  The home page for RCSB PDB.
  The Advanced Search interface allows the user to combine a number of specific attributes for querying the PDB.
  Main RCSB PDB FTP site.
  Validation Server at the RCSB PDB.
  A collection of all RCSB PDB software tools.
  A collection of data‐deposition requirements and resources for depositing structures to the PDB.
  The ADIT data‐deposition tool at RCSB PDB.
  Ligand Depot searches the PDB chemical component dictionary.
  The home page for wwPDB with links to documents describing the PDB format guides, and chemical component dictionary.