Eukaryotic Gene Prediction Using GeneMark.hmm‐E and GeneMark‐ES

Mark Borodovsky, Alex Lomsadze

Georgia Institute of Technology, Atlanta, Georgia
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 4.6
DOI: 10.1002/0471250953.bi0406s35
Online Posting Date: September 2011

Abstract

This unit describes how to use the gene‐finding programs GeneMark.hmm‐E and GeneMark‐ES to find protein‐coding genes in the genomic DNA of eukaryotic organisms. Both tools have shown state‐of‐the‐art accuracy on many fungal, plant, and animal genomes and are frequently used to annotate genes in novel genomic sequences. GeneMark‐ES has the additional advantage that it needs no manual parameterization: it estimates its model parameters automatically by iterative self‐training (unsupervised training). Curr. Protoc. Bioinform. 35:4.6.1‐4.6.10. © 2011 by John Wiley & Sons, Inc.

Keywords: gene finding; hidden Markov model; unsupervised parameter estimation

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Using GeneMark.hmm for Eukaryotic Gene Prediction via Web Interface
  • Alternate Protocol 1: Using the Unix Version of GeneMark.hmm‐E
  • Basic Protocol 2: Using GeneMark‐ES for Eukaryotic Gene Prediction
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
     
 


Literature Cited

   Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403‐410.
   Borodovsky, M. and McIninch, J. 1993. GeneMark: Parallel gene recognition for both DNA strands. Comput. Chem. 17:123‐133.
   Lukashin, A.V. and Borodovsky, M. 1998. GeneMark.hmm: New solutions for gene finding. Nucleic Acids Res. 26:1107‐1115.
   Lomsadze, A., Ter‐Hovhannisyan, V., Chernoff, Y., and Borodovsky, M. 2005. Gene identification in novel eukaryotic genomes by self‐training algorithm. Nucleic Acids Res. 33:6494‐6506.
   Rabiner, L.R. 1989. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the I.E.E.E. 77:257‐286.
   Ter‐Hovhannisyan, V., Lomsadze, A., Chernoff, Y., and Borodovsky, M. 2008. Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. Genome Res. 18:1979‐1990.
Internet Resources
  http://topaz.gatech.edu/GeneMark
  GeneMark Web site.