RNA‐Seq Read Alignments with PALMapper

Géraldine Jean1, André Kahles1, Vipin T. Sreedharan1, Fabio De Bona1, Gunnar Rätsch1

1 Friedrich Miescher Laboratory, Max Planck Society, Tübingen, Germany
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 11.6
DOI:  10.1002/0471250953.bi1106s32
Online Posting Date:  December, 2010
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

Next‐generation sequencing technologies have revolutionized genome and transcriptome sequencing. RNA‐Seq experiments are able to generate huge amounts of transcriptome sequence reads at a fraction of the cost of Sanger sequencing. Reads produced by these technologies are relatively short and error prone. To utilize such reads for transcriptome reconstruction and gene‐structure identification, one needs to be able to accurately align the sequence reads over intron boundaries. In this unit, we describe PALMapper, a fast and easy‐to‐use tool that is designed to accurately compute both unspliced and spliced alignments for millions of RNA‐Seq reads. It combines the efficient read mapper GenomeMapper with the spliced aligner QPALMA, which exploits read‐quality information and predictions of splice sites to improve the alignment accuracy. The PALMapper package is available as a command‐line tool running on Unix or Mac OS X systems or through a Web interface based on Galaxy tools.

Curr. Protoc. Bioinform. 32:11.6.1‐11.6.37. © 2010 by John Wiley & Sons, Inc.

Keywords: RNA‐Seq; sequence alignment; splice‐site prediction; PALMapper; QPALMA; GenomeMapper; Galaxy
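
To make the distinction between unspliced and spliced alignments concrete, the following is a minimal Python sketch, not part of PALMapper itself. It assumes the alignments are available in SAM format, where a CIGAR operation `N` marks a region of the reference that the read skips, which for RNA‐Seq reads is an intron. The function names `introns_from_cigar` and `is_spliced` are hypothetical and chosen only for illustration.

```python
import re

# CIGAR operations that advance the position on the reference sequence
# (per the SAM format: M, D, N, = and X consume reference bases).
REF_CONSUMING = set("MDN=X")
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def introns_from_cigar(pos, cigar):
    """Return 1-based, inclusive (start, end) reference intervals skipped by 'N'.

    pos   -- 1-based leftmost reference position of the alignment (SAM POS field)
    cigar -- CIGAR string of the alignment (SAM CIGAR field)
    """
    introns = []
    ref = pos  # reference position of the next base to be consumed
    for length, op in CIGAR_TOKEN.findall(cigar):
        length = int(length)
        if op == "N":
            # The skipped region spans [ref, ref + length - 1] on the reference.
            introns.append((ref, ref + length - 1))
        if op in REF_CONSUMING:
            ref += length
    return introns


def is_spliced(cigar):
    """An alignment is treated as spliced if its CIGAR contains an 'N' operation."""
    return "N" in cigar


if __name__ == "__main__":
    # A 36-nt read aligned at position 100: 20 bases, a 500-bp intron, 16 bases.
    print(introns_from_cigar(100, "20M500N16M"))  # [(120, 619)]
    # An unspliced alignment has no skipped regions.
    print(introns_from_cigar(100, "36M"))         # []
```

The intervals returned this way correspond to the intron boundaries that a spliced aligner has to place correctly; an unspliced read simply yields an empty list.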

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Training QPALMA on the Command Line
  • Support Protocol 1: Installing Software
  • Alternate Protocol 1: Training QPALMA Using Galaxy Tools
  • Basic Protocol 2: Generating Alignments with PALMapper on the Command Line
  • Support Protocol 2: Predicting Splice Sites with mGene Using Galaxy Tools
  • Alternate Protocol 2: Generating Alignments with PALMapper Using Galaxy Tools
  • Basic Protocol 3: Evaluating Alignments on the Command Line
  • Alternate Protocol 3: Evaluating Alignments Using Galaxy Tools
  • Basic Protocol 4: Visualizing Results on GBrowse
  • Alternate Protocol 4: Visualizing Results on Galaxy Trackster
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
  • Tables
     
 


Internet Resources
  http://www.fml.mpg.de/raetsch/suppl/palmapper/tutorial
  Supporting Web page for the tutorial on the material in this unit.
  http://www.fml.mpg.de/raetsch/suppl/palmapper
  PALMapper project Web page.
  http://www.fml.mpg.de/raetsch/suppl/qpalma
  QPALMA project Web page.
  http://www.fml.mpg.de/raetsch/suppl/mgene
  mGene project Web page.
  http://www.fml.mpg.de/raetsch/suppl/splice
  ASP project Web page.
  http://ftp.tuebingen.mpg.de/pub/fml/raetsch-lab/software/
  HTTP server for downloading PALMapper, QPALMA, mGene, and ASP.
  http://galaxy.fml.mpg.de/
  Galaxy server.
  http://www.sanger.ac.uk/Software/formats/GFF/GFF_Spec.shtml
  General Feature Format (GFF) specification: Get detailed information about the GFF and download scripts for converting various computational analyses to GFF format.