BreakDancer: Identification of Genomic Structural Variation from Paired‐End Read Mapping

Xian Fan¹, Travis E. Abbott², David Larson², Ken Chen¹

¹ Department of Bioinformatics and Computational Biology, The University of Texas M.D. Anderson Cancer Center, Houston, Texas; ² The Genome Institute at Washington University, St. Louis, Missouri
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 15.6
DOI:  10.1002/0471250953.bi1506s45
Online Posting Date:  March, 2014
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

The advent of next‐generation sequencing data has made it possible to cost‐effectively detect and characterize genomic variation in human genomes. Structural variation, including deletion, duplication, insertion, inversion, and translocation, is of great importance to human genetics due to its association with many genetic diseases. BreakDancer is a bioinformatics tool that relates paired‐end read alignments from a test genome to the reference genome for the purpose of comprehensively and accurately detecting various types of structural variation. Curr. Protoc. Bioinform. 45:15.6.1‐15.6.11. © 2014 by John Wiley & Sons, Inc.

Keywords: genomics; next‐generation sequencing; BreakDancer; structural variation; discordant read pair

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Using BreakDancer to Identify SVs in a Single Genome
  • Basic Protocol 2: Using BreakDancer to Identify Somatic SVs in Matched Tumor and Normal Genomes
  • Basic Protocol 3: Using BreakDancer to Identify Segregating SVs in a Population of Samples
  • Support Protocol 1: Downloading and Installing BreakDancer
  • Support Protocol 2: Download and Install Perl Modules from CPAN
  • Support Protocol 3: Quality‐Control Checks
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
     
 

Materials

Basic Protocol 1: Using BreakDancer to Identify SVs in a Single Genome

  Necessary Resources
Hardware
  • A 64‐bit Linux cluster is preferred. The memory requirement depends on the size of the BAM file and the extent of structural variation in the genome. For a cancer genome sequenced at 30× coverage, a whole‐genome analysis usually needs less than 4 GB of memory. BreakDancer can also be run on a desktop or laptop computer with a Unix‐like operating system such as Mac OS X.
Software
  • The BreakDancer package. Download and install it as described in Support Protocol 1.
  • CPAN's Statistics::Descriptive and GD::Graph modules, which are discussed in Support Protocol 2.
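
Before starting an analysis, it can help to confirm that the two Perl modules listed above are actually installed. The following is a minimal sketch, not part of the BreakDancer package: the function names are hypothetical, and it assumes a `perl` executable is on the PATH. It relies only on Perl's standard `-M` switch, which makes Perl exit with a nonzero status when a module cannot be loaded.

```python
import shutil
import subprocess

# Modules named in the materials list above.
REQUIRED_MODULES = ["Statistics::Descriptive", "GD::Graph"]


def perl_check_command(module):
    """Build a command that exits 0 only if Perl can load the module."""
    return ["perl", "-M" + module, "-e", "1"]


def missing_perl_modules(modules, run=subprocess.run):
    """Return the modules that Perl fails to load.

    `run` is injectable (hypothetical design choice) so the logic can be
    exercised without a Perl installation.
    """
    missing = []
    for module in modules:
        result = run(perl_check_command(module), capture_output=True)
        if result.returncode != 0:
            missing.append(module)
    return missing


if __name__ == "__main__":
    if shutil.which("perl") is None:
        print("perl was not found on the PATH")
    else:
        for module in missing_perl_modules(REQUIRED_MODULES):
            print("missing Perl module:", module)
```

Any module reported as missing would then be installed from CPAN before running BreakDancer.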

Basic Protocol 2: Using BreakDancer to Identify Somatic SVs in Matched Tumor and Normal Genomes

  Necessary Resources
Hardware
  • A 64‐bit Linux cluster is preferred. The memory requirement depends on the size of the BAM file and the extent of structural variation in the genome. For a cancer genome sequenced at 30× coverage, a whole‐genome analysis usually needs less than 4 GB of memory. BreakDancer can also be run on a desktop or laptop computer with a Unix‐like operating system such as Mac OS X.
Software
  • The BreakDancer package. Download and install it as described in Support Protocol 1.
  • CPAN's Statistics::Descriptive and GD::Graph modules, which are discussed in Support Protocol 2.

Basic Protocol 3: Using BreakDancer to Identify Segregating SVs in a Population of Samples

  Necessary Resources
Hardware
  • A 64‐bit Linux cluster is preferred. The memory requirement depends on the size of the BAM file and the extent of structural variation in the genome. For a cancer genome sequenced at 30× coverage, a whole‐genome analysis usually needs less than 4 GB of memory. BreakDancer can also be run on a desktop or laptop computer with a Unix‐like operating system such as Mac OS X.
Software
  • The BreakDancer package. Download and install it as described in Support Protocol 1.
  • CPAN's Statistics::Descriptive and GD::Graph modules, which are discussed in Support Protocol 2.

Support Protocol 1: Downloading and Installing BreakDancer

  Necessary Resources
Hardware
  • A computer running Linux or OS X with at least 4 GB of RAM.
Software
  • Git
  • A C/C++ compiler such as GCC or Clang
  • CMake v2.8 or above (http://www.cmake.org)
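
The software prerequisites above can be checked before attempting a build. The sketch below is illustrative only and is not part of the BreakDancer distribution: the function names are hypothetical, and the executable names searched for (`git`, `gcc`, `g++`, `clang`, `clang++`, `cmake`) are assumptions about a typical Linux or OS X setup. It looks for each tool on the PATH and compares the version printed by `cmake --version` against the 2.8 minimum.

```python
import re
import shutil
import subprocess

# Minimum CMake version taken from the materials list above.
MIN_CMAKE = (2, 8)


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints, or None."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def cmake_is_recent_enough(version_output, minimum=MIN_CMAKE):
    """True if the version found in `cmake --version` output meets the minimum."""
    version = parse_version(version_output)
    return version is not None and version >= minimum


def missing_tools(candidates):
    """candidates maps a label to executable names; return labels with none on the PATH."""
    return [label for label, names in candidates.items()
            if not any(shutil.which(name) for name in names)]


if __name__ == "__main__":
    # Executable names are assumptions; adjust them for the local system.
    required = {
        "Git": ["git"],
        "C/C++ compiler": ["gcc", "g++", "clang", "clang++"],
        "CMake": ["cmake"],
    }
    for label in missing_tools(required):
        print("missing:", label)
    if shutil.which("cmake"):
        output = subprocess.run(["cmake", "--version"],
                                capture_output=True, text=True).stdout
        if not cmake_is_recent_enough(output):
            print("CMake is older than 2.8")
```

Versions are compared as integer tuples rather than strings, so that, for example, 2.10 is correctly treated as newer than 2.8.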

Literature Cited

  Barretina, J., Caponigro, G., Stransky, N., Venkatesan, K., Margolin, A.A., Kim, S., and Sonkin, D. 2012. The Cancer Cell Line Encyclopedia enables predictive modeling of anticancer drug sensitivity. Nature 483:603‐607.
  Chen, K., Wallis, J.W., McLellan, M.D., Larson, D.E., Kalicki, J.M., Pohl, C.S., and Mardis, E.R. 2009. BreakDancer: An algorithm for high‐resolution mapping of genomic structural variation. Nat. Methods 6:677‐681.
  Ding, L., Ellis, M.J., Li, S., Larson, D.E., Chen, K., Wallis, J.W., and Fulton, L.L. 2010. Genome remodelling in a basal‐like breast cancer metastasis and xenograft. Nature 464:999‐1005.
  Fan, X., Nakhleh, L., and Chen, K. 2013. Integrated Genotyping of Structural Variation. 1st IEEE Global Conference on Signal and Information Processing, pp. 47‐48. IEEE, New York.
  Feuk, L., Carson, A.R., and Scherer, S.W. 2006. Structural variation in the human genome. Nat. Rev. Genet. 7:85‐97.
  Langmead, B., Trapnell, C., Pop, M., and Salzberg, S.L. 2009. Ultrafast and memory‐efficient alignment of short DNA sequences to the human genome. Genome Biol. 10:R25.
  Li, H. and Durbin, R. 2009. Fast and accurate short read alignment with Burrows‐Wheeler transform. Bioinformatics (Oxford) 25:1754‐1760.
  Lupski, J.R. 2007. Clinical implications of basic research: Structural variation in the human genome. N. Engl. J. Med. 356:1169‐1171.
  Mills, R.E., Walter, K., Stewart, C., Handsaker, R.E., Chen, K., Alkan, C., and Cheetham, R.K. 2011. Mapping copy number variation by population‐scale genome sequencing. Nature 470:59‐65.
  Pleasance, E.D., Cheetham, R.K., Stephens, P.J., McBride, D.J., Humphray, S.J., Greenman, C.D., Varela, I., Lin, M.L., Ordóñez, G.R., Bignell, G.R., Ye, K., Alipaz, J., Bauer, M.J., Beare, D., Butler, A., Carter, R.J., Chen, L., Cox, A.J., Edkins, S., Kokko‐Gonzales, P.I., Gormley, N.A., Grocock, R.J., Haudenschild, C.D., Hims, M.M., James, T., Jia, M., Kingsbury, Z., Leroy, C., Marshall, J., Menzies, A., Mudie, L.J., Ning, Z., Royce, T., Schulz‐Trieglaff, O.B., Spiridou, A., Stebbings, L.A., Szajkowski, L., Teague, J., Williamson, D., Chin, L., Ross, M.T., Campbell, P.J., Bentley, D.R., Futreal, P.A., and Stratton, M.R. 2010. A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 463:191‐196.
  Sharp, A.J., Cheng, Z., and Eichler, E.E. 2006. Structural variation of the human genome. Annu. Rev. Genomics Hum. Genet. 7:407‐442.
  Wang, K., Singh, D., Zeng, Z., Coleman, S.J., Huang, Y., Savich, G.L., and Liu, J. 2010. MapSplice: Accurate mapping of RNA‐seq reads for splice junction discovery. Nucleic Acids Res. 38:e178.
  Welch, J.S., Ley, T.J., Link, D.C., Miller, C.A., Larson, D.E., Koboldt, D.C., and Xia, J. 2012. The origin and evolution of mutations in acute myeloid leukemia. Cell 150:264‐278.
Internet Resources
  http://sourceforge.net/projects/samtools/files/samtools/0.1.6/
  A link to download SAMtools 0.1.6, the version supported by BreakDancer 1.1.2.
  https://github.com/genome/breakdancer.git
  http://sourceforge.net/projects/breakdancer/files/
  Links to download the BreakDancer package.