
Using N-SCAN or TWINSCAN to Predict Gene Structures in Genomic DNA Sequences
Abstract
N-SCAN is a gene-prediction system that combines the methods of ab initio predictors like GENSCAN with information derived from genome comparison. It is the latest in the TWINSCAN series of programs. This unit describes the use of N-SCAN to identify gene structures in eukaryotic genomic sequences. Protocols are provided for using N-SCAN through its Web interface and from the command line in a Linux environment. Detailed discussions of appropriate parameter settings, input-sequence processing, and the choice of genome for comparison are included. Curr. Protoc. Bioinform. 20:4.8.1-4.8.16. © 2007 by John Wiley & Sons, Inc.
Keywords: N-SCAN; TWINSCAN; gene prediction; sequence alignment; comparative genome analysis; cross-species sequence comparison; genome annotation
Table of Contents
- Introduction
- Basic Protocol: Using the N-SCAN Web Server
- Run N-SCAN from the Command Line on a Local Computer
- Alternate Protocol 1: Preparing Data Files and Running N-SCAN Manually
- Alternate Protocol 2: Using Nscan_driver.pl on a Local Computer
- Support Protocol: Obtaining and Installing N-SCAN on a Local Computer
- Guidelines for Understanding Results
- Commentary
- Literature Cited
- Figures
Figures
- Figure 4.8.1 The N-SCAN Web server at http://mblab.wustl.edu/nscan. If you are not registered, a Register link will be visible in the top right corner.
- Figure 4.8.2 The Submission page that appears when a sequence has been submitted to the Web server. The current status of the job is explained at the bottom.
- Figure 4.8.3 An example of the top portion of the Submission Results Web page that automatically replaces the waiting page when N-SCAN has completed processing of a sequence submitted to the server.
- Figure 4.8.4 An example of the bottom portion of the Submission Results Web page.
- Figure 4.8.5 An example of a My Submissions page. From here, all previous and running jobs can be accessed using the links on the left.
Literature Cited
| Alexandersson, M., Cawley, S., and Pachter, L. 2003. SLAM-Cross-species gene finding and alignment with a generalized pair hidden Markov model. Genome Res. 13:496-502. | |
| Allen, J.E. and Salzberg, S.L. 2005. JIGSAW: Integration of multiple sources of evidence for gene prediction. Bioinformatics 21:3596-3603. | |
| Allen, J.E., Pertea, M., and Salzberg, S.L. 2004. Computational gene prediction using multiple sources of evidence. Genome Res. 14:142-148. | |
| Brown, R.H., Gross, S.S., and Brent, M.R. 2005. Begin at the beginning: Predicting genes with 5′ UTRs. Genome Res. 15:742-747. | |
| Burge, C. 1997. Identification of Genes in Human Genomic DNA. In Stanford University. Stanford University, Stanford, Calif. | |
| Burge, C. and Karlin, S. 1997. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268:78-94. | |
| Elsik, C.G., Mackey, A.J., Reese, J.T., Milshina, N.V., Roos, D.S., and Weinstock, G.M. 2007. Creating a honey bee consensus gene set. Genome Biol. 8:R13. | |
| Flicek, P., Keibler, E., Hu, P., Korf, I., and Brent, M.R. 2003. Leveraging the mouse genome for gene prediction in human: From whole-genome shotgun reads to a global synteny map. Genome Res. 13:46-54. | |
| Gross, S.S. and Brent, M.R. 2006. Using multiple alignments to improve gene prediction. J. Comput. Biol. 13:379-393. | |
| Guigo, R., Agarwal, P., Abril, J.F., Burset, M., and Fickett, J.W. 2000. An assessment of gene prediction accuracy in large DNA sequences. Genome Res. 10:1631-1642. | |
| Guigo, R., Dermitzakis, E.T., Agarwal, P., Ponting, C.P., Parra, G., Reymond, A., Abril, J.F., Keibler, E., Lyle, R., Ucla, C., Antonarakis, S.E., and Brent, M.R. 2003. Comparison of mouse and human genomes followed by experimental verification yields an estimated 1,019 additional genes. Proc. Natl. Acad. Sci. U.S.A. 100:1140-1145. | |
| Guigo, R., Flicek, P., Abril, J.F., Reymond, A., Lagarde, J., Denoeud, F., Antonarakis, S., Ashburner, M., Bajic, V.B., Birney, E., Castelo, R., Eyras, E., Ucla, C., Gingeras, T.R., Harrow, J., Hubbard, T., Lewis, S.E., and Reese, M.G. 2006. EGASP: The human ENCODE Genome Annotation Assessment Project. Genome Biol. 7:S21-S31. | |
| Hubbard, T.J., Aken, B.L., Beal, K., Ballester, B., Caccamo, M., Chen, Y., Clarke, L., Coates, G., Cunningham, F., Cutts, T., Down, T., Dyer, S.C., Fitzgerald, S., Fernandez-Banet, J., Graf, S., Haider, S., Hammond, M., Herrero, J., Holland, R., Howe, K., Howe, K., Johnson, N., Kahari, A., Keefe, D., Kokocinski, F., Kulesha, E., Lawson, D., Longden, I., Melsopp, C., Megy, K., Meidl, P., Ouverdin, B., Parker, A., Prlic, A., Rice, S., Rios, D., Schuster, M., Sealy, I., Severin, J., Slater, G., Smedley, D., Spudich, G., Trevanion, S., Vilella, A., Vogel, J., White, S., Wood, M., Cox, T., Curwen, V., Durbin, R., Fernandez-Suarez, X.M., Flicek, P., Kasprzyk, A., Proctor, G., Searle, S., Smith, J., Ureta-Vidal, A., and Birney, E. 2007. Ensembl 2007. Nucleic Acids Res. 35:D610-D617. | |
| Keibler, E. and Brent, M.R. 2003. Eval: A software package for analysis of genome annotations. BMC Bioinformatics 4:50. | |
| Korf, I., Flicek, P., Duan, D., and Brent, M.R. 2001. Integrating genomic homology into gene structure prediction. Bioinformatics 17:S140-S148. | |
| Mouse Genome Sequencing Consortium et al. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420:520-562. | |
| Parra, G., Agarwal, P., Abril, J.F., Wiehe, T., Fickett, J.W., and Guigo, R. 2003. Comparative gene prediction in human and mouse. Genome Res. 13:108-117. | |
| Salamov, A.A. and Solovyev, V.V. 2000. Ab initio gene finding in Drosophila genomic DNA. Genome Res. 10:516-522. | |
| Stanke, M. and Waack, S. 2003. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics 19:II215-II225. | |
| Stanke, M., Tzvetkova, A., and Morgenstern, B. 2006. AUGUSTUS at EGASP: Using EST, protein and genomic alignments for improved gene prediction in the human genome. Genome Biol. 7:S11.1-S11.8. | |
| van Baren, M.J. and Brent, M.R. 2006. Iterative gene prediction and pseudogene removal improves genome annotation. Genome Res. 16:678-685. | |
| Wei, C. and Brent, M.R. 2006. Using ESTs to improve the accuracy of de novo gene prediction. BMC Bioinformatics 7:327. | |



