Protein Tertiary Structure Prediction

Dong Xu¹, Ying Xu¹

¹ Oak Ridge National Laboratory, Oak Ridge
Publication Name: Current Protocols in Protein Science
Unit Number: Unit 2.7
DOI: 10.1002/0471140864.ps0207s19
Online Posting Date: May, 2001
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


This unit describes how to predict the tertiary structure of a protein from its amino acid sequence using computational methods. It introduces three types of prediction methods: homology modeling, fold recognition, and ab initio prediction.

Table of Contents

  • Homology Modeling
  • Sequence Profile Methods
  • Threading
  • Ab Initio Prediction
  • Commentary
  • Literature Cited
  • Figures
  • Tables

Literature Cited
   Alexandrov, N.N., Nussinov, R., and Zimmer, R.M. 1996. Fast protein fold recognition via sequence to structure alignment and contact capacity potentials. In Biocomputing: Proceedings of the 1996 Pacific Symposium (L. Hunter and T. Klein, eds.) pp. 53‐72. World Scientific Publishing, Singapore.
   Al‐Karadaghi, S., Hansson, M., Nikonov, S., Jonsson, B., and Hederstedt, L. 1997. Crystal structure of ferrochelatase: The terminal enzyme in heme biosynthesis. Structure 5:1501‐1510.
   Altschul, S.F. and Koonin, E.V. 1998. Iterated profile searches with PSI‐BLAST—a tool for discovery in protein databases. Trends Biochem. Sci. 23:444‐447.
   Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI‐BLAST: A new generation of protein database search programs. Nucl. Acids Res. 25:3389‐3402.
   Bernstein, F.C., Koetzle, T.F., Williams, G.J.B., Meyer, E.F., Brice, M.D., Rodgers, J.R., Kennard, O., Shimanouchi, T., and Tasumi, M. 1977. The protein data bank: A computer based archival file for macromolecular structures. J. Mol. Biol. 112:535‐542.
   Bowie, J.U., Luthy, R., and Eisenberg, D. 1991. A method to identify protein sequences that fold into a known three‐dimensional structure. Science 253:164‐170.
   Bryant, S.H. and Altschul, S.F. 1995. Statistics of sequence‐structure threading. Curr. Opinion Struct. Biol. 5:236‐244.
   Bryant, S.H. and Lawrence, C.E. 1993. An empirical energy function for threading protein sequence through the folding motif. Proteins Struct. Funct. Genet. 16:92‐112.
   CASP (Critical Assessment of Techniques for Protein Structure Prediction). 1995. Protein structure prediction issue. Proteins Struct. Funct. Genet. 23:295‐462.
   CASP. 1997. Protein structure prediction issue. Proteins Struct. Funct. Genet. 29:1‐230.
   CASP. 1999. Protein structure prediction issue. Proteins Struct. Funct. Genet. 37:1‐237.
   Crawford, O.H. 1999. A fast, stochastic algorithm for protein threading. Bioinformatics 15:66‐71.
   Fischer, D. and Eisenberg, D. 1996. Fold recognition using sequence‐derived predictions. Protein Sci. 5:947‐955.
   Fischer, D., Rice, D., Bowie, J.U., and Eisenberg, D. 1996. Assigning amino acid sequences to 3‐dimensional protein folds. FASEB J. 10:126‐136.
   Gerstein, M. 1998. Patterns of protein‐fold usage in eight microbial genomes: A comprehensive structural census. Proteins Struct. Funct. Genet. 33:518‐534.
   Godzik, A., Skolnick, J., and Kolinski, A. 1992. A topology fingerprint approach to the inverse folding problem. J. Mol. Biol. 227:227‐238.
   Henikoff, S. and Henikoff, J.G. 1994. Protein family classification based on searching a database of blocks. Genomics 19:97‐107.
   Holm, L. and Sander, C. 1996. Mapping the protein universe. Science 273:595‐602.
   Holm, L. and Sander, C. 1998. Dictionary of recurrent domains in protein structures. Proteins Struct. Funct. Genet. 33:88‐96.
   Hu, X., Xu, D., Hamer, K., Schulten, K., Koepke, J., and Michel, H. 1995. Predicting the structure of the light‐harvesting complex II of Rhodospirillum molischianum. Protein Sci. 4:1670‐1682.
   Hughey, R. 1996. Parallel hardware for sequence comparison and alignment. CABIOS 12:473‐479.
   Hughey, R. and Krogh, A. 1996. Hidden Markov models for sequence analysis: Extension and analysis of the basic method. CABIOS 12:95‐107.
   Humphrey, W.F., Dalke, A., and Schulten, K. 1996. VMD—visual molecular dynamics. J. Mol. Graphics 14:33‐38.
   Jones, D.T. 1999. GenTHREADER: An efficient and reliable protein fold recognition method for genomic sequences. J. Mol. Biol. 287:797‐815.
   Jones, D.T., Taylor, W.R., and Thornton, J.M. 1992. A new approach to protein fold recognition. Nature 358:86‐89.
   Karplus, K., Barrett, C., and Hughey, R. 1998. Hidden Markov models for detecting remote protein homologies. Bioinformatics 14:846‐856.
   Krogh, A., Brown, M., Mian, I.S., Sjolander, K., and Haussler, D. 1994. Hidden Markov models in computational biology: Applications to protein modeling. J. Mol. Biol. 235:1501‐1531.
   Laskowski, R.A., MacArthur, M.W., Moss, D.S., and Thornton, J.M. 1993. PROCHECK: A program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26:283‐291.
   Lund, O., Frimand, K., Gorodkin, J., Bohr, H., Bohr, J., Hansen, J., and Brunak, S. 1997. Protein distance constraints predicted by neural networks and probability density functions. Protein Eng. 10:1241‐1248.
   Madej, T., Gibrat, J.F., and Bryant, S.H. 1995. Threading analysis suggests that the obese gene product may be a helical cytokine. FEBS Lett. 373:13‐18.
   Milburn, D., Laskowski, R.A., and Thornton, J.M. 1998. Sequences annotated by structure: A tool to facilitate the use of structural information in sequence analysis. Protein Eng. 11:855‐859.
   Molecular Simulations. 1998. Insight II (Release 98.0). Molecular Simulations, San Diego, Calif.
   Murzin, A.G., Brenner, S.E., Hubbard, T., and Chothia, C. 1995. SCOP: A structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247:536‐540.
   Nilges, M. and Brünger, A. 1993. Successful prediction of the coiled coil geometry of the GCN4 leucine zipper domain by simulated annealing: Comparison to the X‐ray structure. Proteins Struct. Funct. Genet. 15:133‐146.
   Park, J., Karplus, K., Barrett, C., Hughey, R., Haussler, D., Hubbard, T., and Chothia, C. 1998. Sequence comparisons using multiple sequences detect twice as many remote homologues as pairwise methods. J. Mol. Biol. 284:1201‐1210.
   Pearson, W.R. and Lipman, D.J. 1988. Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. U.S.A. 85:2444‐2448.
   Peitsch, M.C. 1996. ProMod and Swiss‐Model: Internet‐based tools for automated comparative protein modelling. Biochem. Soc. Trans. 24:274‐279.
   Pontius, J., Richelle, J., and Wodak, S.J. 1996. Quality assessment of protein 3D structures using standard atomic volumes. J. Mol. Biol. 264:121‐136.
   Rost, B. 1995. TOPITS: Threading one‐dimensional predictions into three‐dimensional structures. ISMB 3:314‐321.
   Rost, B. and Sander, C. 1993. Prediction of protein secondary structure at better than 70% accuracy. J. Mol. Biol. 232:584‐599.
   Sali, A. and Blundell, T.L. 1993. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 234:779‐815.
   Sanchez, R. and Sali, A. 1998. Large‐scale protein structure modeling of the Saccharomyces cerevisiae genome. Proc. Natl. Acad. Sci. U.S.A. 95:13597‐13602.
   Simons, K.T., Kooperberg, C., Huang, E., and Baker, D. 1997. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J. Mol. Biol. 268:209‐225.
   Sippl, M.J. and Weitckus, S. 1992. Detection of native‐like models for amino acid sequences of unknown three‐dimensional structure in a database of known protein conformations. Proteins Struct. Funct. Genet. 13:258‐271.
   Skolnick, J. and Kolinski, A. 1991. Dynamic Monte Carlo simulations of a new lattice model of globular protein folding, structure and dynamics. J. Mol. Biol. 221:499‐531.
   Smith, T.F. and Waterman, M.S. 1981. Comparison of biosequences. Adv. Appl. Math. 2:482‐489.
   Smith, T., Conte, L.L., Bienkowska, J., Gaitatzes, C., Rogers, R., and Lathrop, R. 1997. Current limitations to protein threading approaches. J. Comp. Biol. 4:217‐225.
   Srinivasan, B.N. and Blundell, T.L. 1993. An evaluation of the performance of an automated procedure for comparative modelling of protein tertiary structure. Protein Eng. 6:501‐512.
   Srinivasan, R. and Rose, G. 1995. LINUS—a hierarchic procedure to predict the fold of a protein. Proteins Struct. Funct. Genet. 22:81‐99.
   Tripos. 1999. SYBYL 6.5.3. Tripos, Inc., St. Louis.
   Unger, R. and Moult, J. 1992. Potential of genetic algorithms in protein folding and protein engineering simulations. J. Mol. Biol. 5:637‐645.
   Vriend, G. 1990. WHAT IF: A molecular modelling and drug design program. J. Mol. Graphics 8:52‐56.
   Wang, Z.X. 1998. A re‐estimation for the total numbers of protein folds and superfamilies. Protein Eng. 11:621‐626.
   Xu, Y., Xu, D., and Uberbacher, E.C. 1998. An efficient computational method for globally optimal threading. J. Comp. Biol. 5:597‐614.
   Xu, Y., Xu, D., Crawford, O.H., Einstein, J.R., Larimer, F., Uberbacher, E.C., Unseren, M.A., and Zhang, G. 1999. Protein threading by PROSPECT: A prediction experiment in CASP3. Protein Eng. 12:101‐109.