cgpPindel: Identifying Somatically Acquired Insertion and Deletion Events from Paired End Sequencing

Keiran M. Raine¹, Jonathan Hinton¹, Adam P. Butler¹, Jon W. Teague¹, Helen Davies¹, Patrick Tarpey¹, Serena Nik‐Zainal¹, Peter J. Campbell¹

¹ Cancer Genome Project, Wellcome Trust Sanger Institute, Cambridge
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 15.7
DOI: 10.1002/0471250953.bi1507s52
Online Posting Date: December 2015
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


cgpPindel is a modified version of Pindel optimized for detecting somatic insertions and deletions (indels) in cancer genomes, or in any other sample that is compared against a reference control. Post‐hoc filters remove false‐positive calls, yielding a high‐quality dataset for downstream analysis. This unit provides concise instructions for both a simple ‘one‐shot’ execution of cgpPindel and a more detailed approach suitable for large‐scale compute farms. © 2015 by John Wiley & Sons, Inc.

Keywords: somatic; sequencing; Pindel; cancer


Table of Contents

  • Introduction
  • Strategic Planning
  • Basic Protocol 1: Calling Indels with a Single Command for a Tumor/Normal Sample Pair
  • Alternate Protocol 1: Processing Other Sequencing Types
  • Alternate Protocol 2: Using cgpPindel with Compute Farm Infrastructure
  • Support Protocol 1: Installation of cgpPindel and Dependencies
  • Support Protocol 2: Static Reference Files
  • Guidelines for Understanding Results
  • Commentary
  • Figures
  • Tables


