Computational Prediction of Protein Secondary Structure from Sequence

Fanchi Meng (1), Lukasz Kurgan (2)

(1) Department of Electrical and Computer Engineering, University of Alberta, Edmonton
(2) Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia
Publication Name:  Current Protocols in Protein Science
Unit Number:  Unit 2.3
DOI:  10.1002/cpps.19
Online Posting Date:  November, 2016
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

Secondary structure of proteins refers to local and repetitive conformations, such as α‐helices and β‐strands, which occur in protein structures. Computational prediction of secondary structure from protein sequences has a long history with three generations of predictive methods. This unit summarizes several recent third‐generation predictors. We discuss their inputs and outputs, availability, and predictive performance and explain how to perform and interpret their predictions. We cover methods for the prediction of the 3‐class secondary structure states (helix, strand, and coil) as well as the 8‐class secondary structure states. Recent empirical assessments and our small‐scale analysis reveal that these predictions are characterized by high levels of accuracy, between 70% and 80%. We emphasize that modern predictors are available to end users in the form of convenient‐to‐use Web servers and stand‐alone software. © 2016 by John Wiley & Sons, Inc.

Keywords: coil; DSSP; helix; secondary structure of proteins; strand; prediction
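
To illustrate how the 8‐class and 3‐class states relate, and how accuracy figures such as those quoted in the abstract are typically computed, the following minimal Python sketch assumes one commonly used reduction of the eight DSSP states (H, G, I to helix; E, B to strand; all others to coil). It scores a prediction by per‐residue accuracy, that is, the fraction of residues whose predicted state matches the observed one. The mapping, the function names, and the example strings are illustrative assumptions and are not taken from the unit; individual predictors and assessments may use a different reduction.

```python
# Assumed (commonly used) reduction of the eight DSSP states to three classes.
# Other reductions exist; this mapping is an illustration, not the unit's definition.
DSSP8_TO_3 = {
    "H": "H", "G": "H", "I": "H",  # alpha-, 3-10-, and pi-helix -> helix
    "E": "E", "B": "E",            # extended strand and isolated bridge -> strand
    "T": "C", "S": "C", "-": "C",  # turn, bend, and unassigned -> coil
}


def reduce_to_3_class(dssp8: str) -> str:
    """Map a string of 8-class DSSP symbols to 3-class symbols (H, E, C).

    Any symbol not listed in DSSP8_TO_3 is treated as coil.
    """
    return "".join(DSSP8_TO_3.get(symbol, "C") for symbol in dssp8)


def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of positions where predicted and observed states agree."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 16-residue example (not real data).
observed8 = "-HHHHGGTT-EEEEBS"
observed3 = reduce_to_3_class(observed8)
predicted3 = "CHHHHHHCCCEEEECC"

print(observed3)
print(per_residue_accuracy(predicted3, observed3))
```

The same scoring function applies to 8‐class predictions if both strings use the 8‐class alphabet; only the reduction step is skipped.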

     
 

Table of Contents

  • Introduction
  • Prediction of Secondary Structure from Sequence
  • Summary
  • Acknowledgements
  • Literature Cited
  • Figures
  • Tables
     
 


Literature Cited

  Adamczak, R., Porollo, A., and Meller, J. 2005. Combining prediction of secondary structure and solvent accessibility in proteins. Proteins Struct. Funct. Bioinform. 59:467‐475. doi: 10.1002/prot.20441.
  Anfinsen, C.B. 1973. Principles that govern the folding of protein chains. Science 181:223‐230. doi: 10.1126/science.181.4096.223.
  Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E. 2000. The Protein Data Bank. Nucl. Acids Res. 28:235‐242. doi: 10.1093/nar/28.1.235.
  Buchan, D.W.A., Minneci, F., Nugent, T.C.O., Bryson, K., and Jones, D.T. 2013. Scalable web services for the PSIPRED Protein Analysis Workbench. Nucl. Acids Res. 41:W349‐W357. doi: 10.1093/nar/gkt381.
  Chen, K. and Kurgan, L. 2013. Computational prediction of secondary and supersecondary structures. Methods Mol. Biol. 932:63‐86. doi: 10.1007/978‐1‐62703‐065‐6_5.
  Drozdetskiy, A., Cole, C., Procter, J., and Barton, G.J. 2015. JPred4: A protein secondary structure prediction server. Nucl. Acids Res. 43:W389‐W394. doi: 10.1093/nar/gkv332.
  Faraggi, E., Zhang, T., Yang, Y., Kurgan, L., and Zhou, Y. 2012. SPINE X: Improving protein secondary structure prediction by multistep learning coupled with prediction of solvent accessible surface area and backbone torsion angles. J. Comput. Chem. 33:259‐267. doi: 10.1002/jcc.21968.
  Guzzo, A. 1965. The influence of amino‐acid sequence on protein structure. Biophys. J. 5:809‐822. doi: 10.1016/S0006‐3495(65)86753‐4.
  Heffernan, R., Paliwal, K., Lyons, J., Dehzangi, A., Sharma, A., Wang, J., Sattar, A., Yang, Y., and Zhou, Y. 2015. Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning. Sci. Rep. 5:11476. doi: 10.1038/srep11476.
  Jones, D.T. 1999. Protein secondary structure prediction based on position‐specific scoring matrices. J. Mol. Biol. 292:195‐202. doi: 10.1006/jmbi.1999.3091.
  Joosten, R.P., te Beek, T.A.H., Krieger, E., Hekkelman, M.L., Hooft, R.W.W., Schneider, R., Sander, C., and Vriend, G. 2011. A series of PDB related databases for everyday needs. Nucl. Acids Res. 39:D411‐D419. doi: 10.1093/nar/gkq1105.
  Kabsch, W. and Sander, C. 1983. Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical features. Biopolymers 22:2577‐2637. doi: 10.1002/bip.360221211.
  Källberg, M., Wang, H., Wang, S., Peng, J., Wang, Z., Lu, H., and Xu, J. 2012. Template‐based protein structure modeling using the RaptorX web server. Nat. Protoc. 7:1511‐1522. doi: 10.1038/nprot.2012.085.
  Kihara, D. 2005. The effect of long‐range interactions on the secondary structure formation of proteins. Protein Sci. 14:1955‐1963. doi: 10.1110/ps.051479505.
  Koh, I.Y.Y., Eyrich, V.A., Marti‐Renom, M.A., Przybylski, D., Madhusudhan, M.S., Eswar, N., Graña, O., Pazos, F., Valencia, A., Sali, A., and Rost, B. 2003. EVA: Evaluation of protein structure prediction servers. Nucl. Acids Res. 31:3311‐3315. doi: 10.1093/nar/gkg619.
  Kurgan, L. and Disfani, F.M. 2011. Structural protein descriptors in 1‐dimension and their sequence‐based predictions. Curr. Protein Pept. Sci. 12:470‐489. doi: 10.2174/138920311796957711.
  Levitt, M. and Greer, J. 1977. Automatic identification of secondary structure in globular proteins. J. Mol. Biol. 114:181‐239. doi: 10.1016/0022‐2836(77)90207‐8.
  Lin, K., Simossis, V.A., Taylor, W.R., and Heringa, J. 2005. A simple and fast secondary structure prediction method using hidden neural networks. Bioinformatics 21:152‐159. doi: 10.1093/bioinformatics/bth487.
  Magnan, C.N. and Baldi, P. 2014. SSpro/ACCpro 5: Almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity. Bioinformatics 30:2592‐2597. doi: 10.1093/bioinformatics/btu352.
  Montgomerie, S., Sundararaj, S., Gallin, W.J., and Wishart, D.S. 2006. Improving the accuracy of protein secondary structure prediction using structural alignment. BMC Bioinformatics 7:301. doi: 10.1186/1471‐2105‐7‐301.
  Montgomerie, S., Cruz, J.A., Shrivastava, S., Arndt, D., Berjanskii, M., and Wishart, D.S. 2008. PROTEUS2: A web server for comprehensive protein structure prediction and structure‐based annotation. Nucl. Acids Res. 36:W202‐W209. doi: 10.1093/nar/gkn255.
  Moult, J., Pedersen, J.T., Judson, R., and Fidelis, K. 1995. A large‐scale experiment to assess protein structure prediction methods. Proteins Struct. Funct. Bioinform. 23:ii‐iv. doi: 10.1002/prot.340230303.
  Pauling, L., Corey, R.B., and Branson, H.R. 1951. The structure of proteins: Two hydrogen‐bonded helical configurations of the polypeptide chain. Proc. Natl. Acad. Sci. U.S.A. 37:205‐211. doi: 10.1073/pnas.37.4.205.
  Pollastri, G. and McLysaght, A. 2005. Porter: A new, accurate server for protein secondary structure prediction. Bioinformatics 21:1719‐1720. doi: 10.1093/bioinformatics/bti203.
  Pollastri, G., Przybylski, D., Rost, B., and Baldi, P. 2002. Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins Struct. Funct. Bioinform. 47:228‐235. doi: 10.1002/prot.10082.
  Pruitt, K.D., Tatusova, T., and Maglott, D.R. 2007. NCBI reference sequences (RefSeq): A curated non‐redundant sequence database of genomes, transcripts and proteins. Nucl. Acids Res. 35:D61‐D65. doi: 10.1093/nar/gkl842.
  Rost, B. 2001. Review: Protein secondary structure prediction continues to rise. J. Struct. Biol. 134:204‐218. doi: 10.1006/jsbi.2001.4336.
  Rost, B. 2002. Prediction in 1D: Secondary structure, membrane helices, and accessibility. In Structural Bioinformatics (P. Bourne and H. Weissig, eds.) pp. 559‐588. Wiley, Hoboken, N.J.
  Rost, B. and Sander, C. 1993a. Improved prediction of protein secondary structure by use of sequence profiles and neural networks. Proc. Natl. Acad. Sci. U.S.A. 90:7558‐7562. doi: 10.1073/pnas.90.16.7558.
  Rost, B. and Sander, C. 1993b. Prediction of protein secondary structure at better than 70% accuracy. J. Mol. Biol. 232:584‐599. doi: 10.1006/jmbi.1993.1413.
  Rost, B. and Sander, C. 1994. Combining evolutionary information and neural networks to predict protein secondary structure. Proteins 19:55‐72. doi: 10.1002/prot.340190108.
  Rost, B. and Sander, C. 2000. Third generation prediction of secondary structures. In Protein Structure Prediction, vol. 143 (D. Webster, ed.) pp. 71‐95. Humana Press, Totowa, N.J.
  Wang, S., Peng, J., Ma, J.Z., and Xu, J.B. 2016. Protein secondary structure prediction using deep convolutional neural fields. Sci. Rep. 6:18962. doi: 10.1038/srep18962.
  Yachdav, G., Kloppmann, E., Kajan, L., Hecht, M., Goldberg, T., Hamp, T., Hönigschmid, P., Schafferhans, A., Roos, M., Bernhofer, M., Richter, L., Ashkenazy, H., Punta, M., Schlessinger, A., Bromberg, Y., Schneider, R., Vriend, G., Sander, C., Ben‐Tal, N., and Rost, B. 2014. PredictProtein—an open resource for online prediction of protein structural and functional features. Nucl. Acids Res. 42:W337‐W343. doi: 10.1093/nar/gku366.
  Yang, J., Yan, R., Roy, A., Xu, D., Poisson, J., and Zhang, Y. 2015. The I‐TASSER Suite: Protein structure and function prediction. Nat. Methods 12:7‐8. doi: 10.1038/nmeth.3213.
  Yaseen, A. and Li, Y. 2014. Context‐based features enhance protein secondary structure prediction accuracy. J. Chem. Inf. Model. 54:992‐1002. doi: 10.1021/ci400647u.
  Zhang, H., Zhang, T., Chen, K., Kedarisetti, K.D., Mizianty, M.J., Bao, Q., Stach, W., and Kurgan, L. 2011. Critical assessment of high‐throughput standalone methods for secondary structure prediction. Brief. Bioinformatics 12:672‐688. doi: 10.1093/bib/bbq088.

Key References

  Chen and Kurgan, 2013. See above.
  Describes and discusses in detail the key architectural features of a large number of modern secondary structure predictors.
  Jones, 1999. See above.
  A classic paper describing PSIPRED, the most commonly used method for the prediction of the 3‐class secondary structure.
  Kabsch and Sander, 1983. See above.
  Describes the most commonly used method for the assignment of secondary structure from the tertiary protein structure.
  Magnan and Baldi, 2014. See above.
  Describes SSpro, one of the most popular and accurate methods for the prediction of the 8‐class secondary structure.
  Zhang et al., 2011. See above.
  Provides a comprehensive empirical assessment of the predictive performance of modern methods for the prediction of secondary structure.

Internet Resources

  http://bioinf.cs.ucl.ac.uk/psipred/
  PSIPRED Web server.
  http://scratch.proteomics.ics.uci.edu/
  SSpro Web server.
  http://zhanglab.ccmb.med.umich.edu/PSSpred/
  PSSpred Web server.