Using the Velvet de novo Assembler for Short‐Read Sequencing Technologies

Daniel R. Zerbino1

1 EMBL‐EBI, Wellcome Trust Genome Campus, Cambridge, United Kingdom
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 11.5
DOI: 10.1002/0471250953.bi1105s31
Online Posting Date: September 2010
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

The Velvet de novo assembler was designed to build contigs and eventually scaffolds from short‐read sequencing data. This protocol describes how to use Velvet, interpret its output, and tune its parameters for optimal results. It also covers practical issues such as configuration, using the VelvetOptimiser routine, and processing colorspace data. Curr. Protoc. Bioinform. 31:11.5.1‐11.5.12. © 2010 by John Wiley & Sons, Inc.

Keywords: Genome assembly; Next‐Generation Sequencing; de Bruijn Graphs

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Assembling a Set of Reads with Velvet
  • Support Protocol 1: Installing VELVET
  • Support Protocol 2: Using VelvetOptimiser
  • Support Protocol 3: Processing Colorspace Data
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
     
 

Literature Cited

   Butler, J., MacCallum, I., Kleber, M., Shlyakhter, I.A., Belmonte, M.K., Lander, E.S., Nusbaum, C., and Jaffe, D.B. 2008. ALLPATHS: De novo assembly of whole‐genome shotgun microreads. Genome Res. 18:810‐820.
   Cock, P.J., Fields, C.J., Goto, N., Heuer, M.L., and Rice, P.M. 2010. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res. 38:1767‐1771.
   Chaisson, M.J., Brinza, D., and Pevzner, P.A. 2009. De novo fragment assembly with short mate‐paired reads: Does the read length matter? Genome Res. 19:336‐346.
   Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., and the 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078‐2079.
   Li, R., Zhu, H., and Wang, J. 2010. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 20:265‐272.
   Kurtz, S., Narechania, A., Stein, J., and Ware, D. 2008. A new method to compute K‐mer frequencies and its application to annotate large repetitive plant genomes. BMC Genomics 9:517.
   Pevzner, P.A., Tang, H., and Waterman, M.S. 2001. An Eulerian path approach to DNA fragment assembly. Proc. Natl. Acad. Sci. U.S.A. 98:9748‐9753.
   Simpson, J.T., Wong, K., Jackman, S.D., Schein, J.E., Jones, S.J., and Birol, I. 2009. ABySS: A parallel assembler for short read sequence data. Genome Res. 19:1117‐1123.
   Zerbino, D.R. and Birney, E. 2008. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821‐829.
   Zerbino, D.R., McEwen, G.K., Margulies, E.H., and Birney, E. 2010. Pebble and rock band: Heuristic resolution of repeats and scaffolding in the Velvet short‐read de novo assembler. PLoS ONE 4:e8407.
Key References
  Zerbino and Birney, 2008. See above.
  This first publication mainly described the implementation of de Bruijn graphs within Velvet and the error‐correction algorithm, TourBus.
  Zerbino et al., 2010. See above.
  This follow‐up paper describes how Velvet resolves complex repeats using long reads or paired‐end read information.
Internet Resources
  http://www.ebi.ac.uk/~zerbino/velvet
  Velvet Web site, where code and information on Velvet can be downloaded.
  http://bioinformatics.net.au/software.shtml
  VelvetOptimiser by Simon Gladman and Torsten Seemann. This wrapper software scans different Velvet parameter settings to produce an optimal assembly.
  http://solidsoftwaretools.com/gf/project/denovotools/
  Colorspace de novo pipeline by Craig Cummings, Vrunda Sheth, and Dima Brinza. These scripts perform all the appropriate colorspace conversions (registration is required, but the software is free).
  http://solidsoftwaretools.com/gf/project/corona/
  The Corona Lite package can be found on this server.
  http://sourceforge.net/apps/mediawiki/amos/
  AMOS suite by the AMOS Consortium. This suite of tools allows the user to manipulate, convert or analyze AFG assembly files.
  http://tools.invitrogen.com/content/sfs/manuals/SOLiD_SAGE_SoftwareGuide.pdf
  Colorspace documentation by Applied Biosystems. This document describes colorspace, and the csfasta format in particular.