Using SomaticSniper to Detect Somatic Single Nucleotide Variants

David E. Larson1, Travis E. Abbott1, Richard K. Wilson1

1 The Genome Institute at Washington University, St. Louis, Missouri
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 15.5
DOI:  10.1002/0471250953.bi1505s45
Online Posting Date:  March, 2014
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

Detecting somatic single nucleotide variants (SNVs) is an essential component of cancer research with next‐generation sequencing data. This unit describes how to run the SomaticSniper somatic SNV detector and then filter the output to eliminate most false positives. It also includes support protocols detailing the compilation of the software. Curr. Protoc. Bioinform. 45:15.5.1‐15.5.8. © 2014 by John Wiley & Sons, Inc.

Keywords: next‐generation sequencing; somatic variant calling; SomaticSniper

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Calling Variants Using SomaticSniper
  • Support Protocol 1: Compiling SomaticSniper from Source Code
  • Support Protocol 2: Compiling bam‐readcount from Source Code
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
  • Tables
     
 

Materials

Basic Protocol 1: Calling Variants Using SomaticSniper

  Necessary Resources
Hardware
  • A computer running Linux or Mac OS X with at least 1 GB of RAM
Software
  • The SomaticSniper program (see Support Protocol 1 for information on compiling from source) and the bam-readcount program (see Support Protocol 2 for information on compiling from source). Samtools is also used for filtering. An older version of Samtools is provided (see Support Protocol 1), but the filtering scripts support Samtools versions up to 0.1.16.
Files
  • A pair of BAM files (tumor and normal), as well as the FASTA file containing the reference sequence that the BAM files are aligned to. Note that both BAM files must be aligned to the same reference sequence and should be indexed by Picard or Samtools. SomaticSniper does not require any special flags produced by specific aligners, but the mapping quality produced by the aligner is utilized to make the calls.
  • SomaticSniper is typically run on BAM files that have neither been base‐quality recalibrated nor locally realigned around indels (such as with the GATK). Either of these preprocessing steps can be optionally performed before running SomaticSniper, but we have observed some decrease in sensitivity after base‐quality recalibration.
  • It is important to have good coverage for detection of somatic variants. SomaticSniper was designed to work on whole genomes sequenced to 30× depth of coverage. Higher depths of coverage will improve variant calling in most cases, but will also increase some types of SomaticSniper false positives. Example files for the sections below can be obtained from https://github.com/genome/somatic-snv-test-data.
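
The requirement that both BAM files be aligned to the same reference sequence can be checked before calling variants. The following is a minimal sketch and not part of SomaticSniper. It assumes the BAM headers have first been written to text files with samtools view -H; the function names sq_summary and same_reference and the file names in the comments are hypothetical. Only the sequence names (SN) and lengths (LN) on the @SQ header lines are compared.

```shell
# sq_summary HEADER_FILE
# Print "name<TAB>length" for every @SQ line of a SAM header dump.
sq_summary() {
    awk -F '\t' '$1 == "@SQ" {
        sn = ""; ln = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^SN:/) sn = substr($i, 4)
            if ($i ~ /^LN:/) ln = substr($i, 4)
        }
        print sn "\t" ln
    }' "$1"
}

# same_reference TUMOR_HEADER NORMAL_HEADER
# Succeed only when both headers list at least one sequence and the
# names and lengths agree in the same order.
same_reference() {
    sr_tumor=$(sq_summary "$1")
    sr_normal=$(sq_summary "$2")
    [ -n "$sr_tumor" ] && [ "$sr_tumor" = "$sr_normal" ]
}

# Example use (hypothetical file names; requires samtools):
#   samtools view -H tumor.bam  > tumor.header.txt
#   samtools view -H normal.bam > normal.header.txt
#   same_reference tumor.header.txt normal.header.txt ||
#       echo "tumor and normal BAM files list different reference sequences" >&2
```

A match on names and lengths does not prove that the underlying sequences are identical, so this is best treated as a quick sanity check rather than a guarantee.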

Support Protocol 1: Compiling SomaticSniper from Source Code

  Necessary Resources
Hardware
  • A computer running Linux or Mac OS X with at least 1 GB of RAM
Software
  • The following programs are required to compile SomaticSniper: git, cmake (v2.8 or greater), the make utility, and a C compiler such as gcc. For Debian‐based distributions, these can be installed using:
    • sudo apt-get install cmake build-essential zlib1g-dev libncurses-dev git-core
  • For RedHat‐based distributions, these can be installed using:
    • sudo yum groupinstall "Development tools"
    • sudo yum install zlib-devel ncurses-devel cmake
  • Note that RHEL and CentOS 6.4 both ship with cmake versions earlier than 2.8, so on those systems cmake must be installed from a source tarball obtained from http://www.cmake.org/ before compiling.
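
Whether the installed cmake meets the v2.8 requirement can be checked before starting the build. The following is a minimal sketch under stated assumptions: the function names parse_cmake_version and version_ge are hypothetical, the first line of the cmake --version output is assumed to have the form "cmake version X.Y.Z", and only purely numeric dotted versions are compared.

```shell
# parse_cmake_version: read `cmake --version` output on stdin and print
# the dotted numeric version found on its first line.
parse_cmake_version() {
    sed -n '1s/^cmake version \([0-9][0-9.]*\).*/\1/p'
}

# version_ge A B: succeed when dotted numeric version A is >= B.
# Missing components count as 0, so 2.8 and 2.8.0 compare equal.
version_ge() {
    vg_a=$1
    vg_b=$2
    while [ -n "$vg_a" ] || [ -n "$vg_b" ]; do
        # Leading component of each remaining version string.
        vg_x=${vg_a%%.*}
        vg_y=${vg_b%%.*}
        if [ "${vg_x:-0}" -gt "${vg_y:-0}" ]; then return 0; fi
        if [ "${vg_x:-0}" -lt "${vg_y:-0}" ]; then return 1; fi
        # Drop the component just compared.
        case $vg_a in *.*) vg_a=${vg_a#*.} ;; *) vg_a= ;; esac
        case $vg_b in *.*) vg_b=${vg_b#*.} ;; *) vg_b= ;; esac
    done
    return 0
}

# Example use (requires cmake on the PATH):
#   installed=$(cmake --version | parse_cmake_version)
#   version_ge "$installed" 2.8 ||
#       echo "cmake $installed is older than 2.8; install from source" >&2
```

The comparison is written in plain POSIX shell rather than with sort -V, since that option may not be available in the sort shipped with older Mac OS X releases.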

Support Protocol 2: Compiling bam‐readcount from Source Code

  Necessary Resources
Hardware
  • A computer running Linux or Mac OS X with at least 1 GB of RAM
Software
  • The following programs are required to compile bam-readcount: git, cmake (v2.8 or greater), the make utility, and a C compiler such as gcc. The packages described in Support Protocol 1 can be used to meet these dependencies.

Internet Resources
  http://gmt.genome.wustl.edu/somatic-sniper
  This is the homepage of the SomaticSniper algorithm and contains links to the source code, documentation, and information on how to receive support.