Using the Tools and Resources of the RCSB Protein Data Bank

Luigi Di Costanzo¹, Sutapa Ghosh¹, Christine Zardecki¹, Stephen K. Burley²

¹ RCSB Protein Data Bank, Department of Chemistry and Chemical Biology and Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, New Jersey; ² San Diego Supercomputer Center, University of California, San Diego, California
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 1.9
DOI: 10.1002/cpbi.13
Online Posting Date: September 2016
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


The Protein Data Bank (PDB) archive is the worldwide repository of experimentally determined three‐dimensional structures of large biological molecules found in all three kingdoms of life. Atomic‐level structures of these proteins, nucleic acids, and complex assemblies thereof are central to research and education in molecular, cellular, and organismal biology, biochemistry, biophysics, materials science, bioengineering, ecology, and medicine. Several types of information are associated with each PDB archival entry, including atomic coordinates, primary experimental data, polymer sequence(s), and summary metadata. The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) serves as the U.S. data center for the PDB, distributing archival data and supporting both simple and complex queries that return results. These data can be freely downloaded, analyzed, and visualized using RCSB PDB tools and resources to gain a deeper understanding of fundamental biological processes, molecular evolution, human health and disease, and drug discovery. © 2016 by John Wiley & Sons, Inc.

Keywords: search; macromolecule; ligand; 3D structure; drug discovery; single nucleotide variation

Table of Contents

  • Introduction
  • Basic Protocol 1: Searching for HIV‐1 Protease Structures Using the RCSB PDB Web Site Home Page
  • Basic Protocol 2: Exploring an HIV‐1 Protease Entry Using the Structure Summary Page
  • Basic Protocol 3: Using Advanced Search to Build Complex Queries
  • Basic Protocol 4: Browsing Structures Organized by Annotations
  • Basic Protocol 5: Searching by Sequences
  • Basic Protocol 6: Explore Small‐Molecule Ligands in the PDB
  • Basic Protocol 7: Searching for Drugs and Drug Targets in the PDB Archive
  • Basic Protocol 8: Visualize and Analyze the Contents of a PDB Archival Entry
  • Basic Protocol 9: Downloading Multiple Structure and Sequence Data Files from the PDB
  • Basic Protocol 10: Learn about Biology and Medicine Using PDB Structures
  • Commentary
  • Literature Cited
  • Figures
