Using MZEF to Find Internal Coding Exons

Michael Q. Zhang1

1 Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 4.2
DOI:  10.1002/0471250953.bi0402s00
Online Posting Date:  August, 2002
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

MZEF (Michael Zhang's Exon Finder) was designed to help identify one of the most important classes of exons, the internal coding exons, in human genomic DNA sequences. It is intended neither for predicting intronless genes nor for assembling predicted exons into complete gene models. There is also a mouse version (mMZEF) and an Arabidopsis version (aMZEF). This unit presents the Unix and Web versions of MZEF and reviews how to interpret the MZEF results.

     
 

Table of Contents

  • Basic Protocol 1: Using MZEF to Analyze Genomic DNA Sequences Via the Web Interface
  • Basic Protocol 2: Using the Command‐Line Unix Version of MZEF to Analyze Genomic DNA Sequences
  • Alternate Protocol 1: Using the Interactive Unix Version of MZEF to Analyze Genomic DNA Sequences
  • Guidelines for Understanding Results
  • Commentary
  • Appendix
  • Literature Cited
  • Figures
     
 

Literature Cited

   Bishop, C.M. 1996. Neural Networks for Pattern Recognition. Clarendon Press, Oxford.
   Box, G.E.P. and Cox, D.R. 1964. An analysis of transformations. J. R. Statist. Soc. B 26:211‐252.
   Chen, T. and Zhang, M.Q. 1998. POMBE: A fission yeast gene‐finding and exon‐intron structure prediction system. Yeast 14:701‐710.
   Davuluri, R., Grosse, I., and Zhang, M.Q. 2001. Computational identification of promoters and first exons in the human genome. Nature Genet. 29:412‐417.
   Fisher, R.A. 1936. The use of multiple measurements in taxonomic problems. Ann. Eugen. 7:179‐188.
   Fukunaga, K. 1990. Introduction to Statistical Pattern Recognition 2nd Edition. Academic Press, San Diego.
   International Human Genome Sequencing Consortium. 2001. Initial sequencing and analysis of the human genome. Nature 409:860‐921.
   Ioshikhes, I. and Zhang, M.Q. 2000. Large‐scale human promoter mapping using CpG islands discrimination. Nature Genet. 26:61‐63.
   Minghetti, P.P., Ruffner, D.E., Kuang, W.J., Dennison, O.E., Hawkins, J.W., Beattie, W.G., and Dugaiczyk, A. 1986. Molecular structure of the human albumin gene is revealed by nucleotide sequence within q11‐22 of chromosome 4. J. Biol. Chem. 261:6747‐6757.
   Modrek, B. and Lee, C.A. 2002. A genomic view of alternative splicing. Nature Genet. 30:13‐19.
   Solovyev, V.V., Salamov, A.A., and Lawrence, C.B. 1994. Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames. Nucl. Acids Res. 22:5156‐5163.
   Tabaska, J.E. and Zhang, M.Q. 1999. Detection of polyadenylation signals in human DNA sequences. Gene 231:77‐86.
   Tabaska, J.E., Davuluri, R., and Zhang, M.Q. 2001. A novel 3′‐terminal exon recognition algorithm. Bioinformatics 17:602‐607.
   Thanaraj, T.A. and Robinson, A.J. 2000. Prediction of exact boundaries of exons. Briefings in Bioinformatics 1:343‐356.
   Zhang, M.Q. 1997. Identification of protein coding regions in the human genome by quadratic discriminant analysis. Proc. Natl. Acad. Sci. U.S.A. 94:565‐568.
   Zhang, M.Q. 1998a. Identification of protein‐coding regions in Arabidopsis thaliana genome based on quadratic discriminant analysis. Plant Mol. Biol. 37:803‐806.
   Zhang, M.Q. 1998b. Identification of human gene core‐promoters in silico. Genome Res. 8:319‐326.
   Zhang, M.Q. 1998c. Statistical features of human exons and their flanking regions. Hum. Mol. Genet. 7:919‐932.
   Zhang, M.Q. 2000. Discriminant analysis and its application in DNA sequence motif recognition. Briefings in Bioinformatics 1:331‐342.
Key References
   Zhang, 1997. See above.
  This is the original MZEF paper.
   Zhang, 1998c. See above.
  This has human exon classification and feature statistics.
   Zhang, 2000. See above.
  This is a tutorial on discriminant analysis and includes examples of how to combine MZEF with other programs.
Internet Resources
   http://www.cshl.org/genefinder
  MZEF Web server
   http://www.cshl.org/mzhanglab
  Papers and other related information for MZEF
   ftp://cshl.org/pub/science/mzhanglab
  FTP site for MZEF