ascatNgs: Identifying Somatically Acquired Copy‐Number Alterations from Whole‐Genome Sequencing Data

Keiran M. Raine1, Peter Van Loo2, David C. Wedge3, David Jones1, Andrew Menzies1, Adam P. Butler1, Jon W. Teague1, Patrick Tarpey1, Serena Nik‐Zainal1, Peter J. Campbell1

1 Cancer Genome Project, Wellcome Trust Sanger Institute, Cambridge, 2 The Francis Crick Institute, Lincoln's Inn Fields Laboratory, London, 3 Oxford Big Data Institute, Wellcome Trust Centre for Human Genetics, Oxford
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 15.9
DOI:  10.1002/cpbi.17
Online Posting Date:  December, 2016
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


We have developed ascatNgs to aid researchers in carrying out Allele‐Specific Copy number Analysis of Tumours (ASCAT). ASCAT detects DNA copy number changes affecting a tumor genome by comparison with a matched normal sample. Additionally, the algorithm estimates the fraction of tumor DNA in the sample, known as the Aberrant Cell Fraction (ACF). ASCAT itself is an R package that requires many types of input file to be generated. Here, we present a suite of tools that handles this for the user. Our code is available on our GitHub site. This unit describes both ‘one‐shot’ execution and approaches more suitable for large‐scale compute farms. © 2016 by John Wiley & Sons, Inc.
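To illustrate how the Aberrant Cell Fraction relates to the signals ASCAT works from, the following is a minimal sketch of the forward model described by Van Loo et al. (2010): the expected LogR and B‐allele frequency (BAF) of a segment given allele‐specific copy numbers, the ACF, and the tumor ploidy. The function and parameter names are illustrative and are not part of ascatNgs or the ASCAT R package, and the platform constant gamma is assumed to be 1 here.

```python
import math


def expected_signals(n_a, n_b, acf, tumour_ploidy, gamma=1.0):
    """Expected (LogR, BAF) for one segment under the ASCAT model.

    n_a, n_b      -- allele-specific copy numbers in the tumor cells
    acf           -- Aberrant Cell Fraction (rho), 0 < acf <= 1
    tumour_ploidy -- average tumor ploidy (psi_t)
    gamma         -- platform-dependent scaling constant (assumed 1 here)
    """
    if not 0.0 < acf <= 1.0:
        raise ValueError("acf must be in (0, 1]")
    # Average copies per cell at this locus: normal cells contribute 2 copies.
    total = 2.0 * (1.0 - acf) + acf * (n_a + n_b)
    # Average ploidy of the mixed tumor/normal sample.
    sample_ploidy = 2.0 * (1.0 - acf) + acf * tumour_ploidy
    log_r = gamma * math.log2(total / sample_ploidy)
    # Normal cells contribute one copy of the B allele at a heterozygous SNP.
    baf = (1.0 - acf + acf * n_b) / total
    return log_r, baf


if __name__ == "__main__":
    # Copy-neutral loss of heterozygosity (2 + 0) in a sample with 50% tumor cells.
    print(expected_signals(n_a=2, n_b=0, acf=0.5, tumour_ploidy=2.0))
```

ASCAT works in the opposite direction, searching for the ACF and ploidy that best explain the observed LogR and BAF values; this sketch only shows the relationship being fitted.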

Keywords: somatic; sequencing; cancer; copy‐number


Table of Contents

  • Introduction
  • Basic Protocol 1: Calling Copy Number Segments with a Single Command for a Tumor/Normal Sample Pair
  • Alternate Protocol 1: Automatic Gender Determination
  • Alternate Protocol 2: Using ascatNgs with Compute Farm Infrastructure
  • Support Protocol 1: Installation of ascatNgs and Dependencies
  • Support Protocol 2: Static Reference Files
  • Commentary
  • Literature Cited
  • Figures
  • Tables


Basic Protocol 1: Calling Copy Number Segments with a Single Command for a Tumor/Normal Sample Pair

  Necessary Resources
  • A small set of Y‐specific loci needs to be provided. These must be determined on a per‐species/assembly basis. For human GRCh37, they are included in the ascatNgs distribution under: ∼/perl/share/gender/GRCh37d5_Y.loci.
  • The selected loci should reliably have no reads mapped when the data come from a female sample
  • Once determined, a simple tab‐delimited file is created:
  • The file does not need to be sorted
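The points above can be sketched in code. The example below is illustrative only: it assumes a two‐column layout (chromosome, position) for the tab‐delimited loci file, uses placeholder positions rather than real Y‐specific loci, and applies a hypothetical threshold rule to call sex from per‐locus read depth. None of the function names or thresholds are taken from ascatNgs.

```python
import csv
import io

# Placeholder loci as (chromosome, position); real loci must be chosen per
# species/assembly. They are deliberately unsorted, as sorting is not required.
EXAMPLE_LOCI = [("Y", 200), ("Y", 100), ("Y", 300)]


def write_loci(handle, loci):
    """Write loci as tab-delimited lines (assumed layout: chromosome, position)."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for chrom, pos in loci:
        writer.writerow([chrom, pos])


def read_loci(handle):
    """Read the tab-delimited loci back as (chromosome, position) tuples."""
    return [(row[0], int(row[1])) for row in csv.reader(handle, delimiter="\t") if row]


def infer_sex(depth_by_locus, min_depth=1, min_fraction=0.5):
    """Return "XY" if enough Y-specific loci have reads, otherwise "XX".

    min_depth and min_fraction are hypothetical thresholds for illustration.
    """
    if not depth_by_locus:
        raise ValueError("no loci supplied")
    covered = sum(1 for depth in depth_by_locus.values() if depth >= min_depth)
    return "XY" if covered / len(depth_by_locus) >= min_fraction else "XX"


if __name__ == "__main__":
    buffer = io.StringIO()
    write_loci(buffer, EXAMPLE_LOCI)
    loci = read_loci(io.StringIO(buffer.getvalue()))
    # No reads at any Y-specific locus is what a female sample should show.
    print(infer_sex({locus: 0 for locus in loci}))
```

In practice the per‐locus depths would come from the aligned normal sample; how ascatNgs obtains and thresholds them is not described here, so that part is left as an input.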

Alternate Protocol 1: Automatic Gender Determination

  Necessary Resources
  • See Basic Protocol 1 and protocol 2; however, individual steps have different requirements that must be adapted for each species/build

Alternate Protocol 2: Using ascatNgs with Compute Farm Infrastructure

  Necessary Resources
  • Linux‐based system with Web access



Literature Cited

  Danecek, P., Auton, A., Abecasis, G., Albers, C.A., Banks, E., DePristo, M.A., Handsaker, R.E., Lunter, G., Marth, G.T., Sherry, S.T., McVean, G., Durbin, R., and 1000 Genomes Project Analysis Group. 2011. The variant call format and VCFtools. Bioinformatics 27:2156‐2158. doi: 10.1093/bioinformatics/btr330.
  Fritz, M.H.‐Y., Leinonen, R., Cochrane, G., and Birney, E. 2011. Efficient storage of high throughput DNA sequencing data using reference‐based compression. Genome Res. 21:734‐740. doi: 10.1101/gr.114819.110.
  Li, H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA‐MEM. arXiv:1303.3997 [q‐bio].
  Li, H. and Durbin, R. 2009. Fast and accurate short read alignment with Burrows‐Wheeler transform. Bioinformatics (Oxford, England) 25:1754‐1760. doi: 10.1093/bioinformatics/btp324.
  Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., and Durbin, R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078‐2079. doi: 10.1093/bioinformatics/btp352.
  Nik‐Zainal, S., Davies, H., Staaf, J., Ramakrishna, M., Glodzik, D., Zou, X., Martincorena, I., Alexandrov, L.B., Martin, S., Wedge, D.C., Van Loo, P., Ju, Y.S., Smid, M., Brinkman, A.B., Morganella, S., Aure, M.R., Lingjærde, O.C., Langerød, A., Ringnér, M., Ahn, S.‐M., Boyault, S., Brock, J.E., Broeks, A., Butler, A., Desmedt, C., Dirix, L., Dronov, S., Fatima, A., Foekens, J.A., Gerstung, M., Hooijer, G.K.J., Jang, S.J., Jones, D.R., Kim, H.‐Y., King, T.A., Krishnamurthy, S., Lee, H.J., Lee, J.‐Y., Li, Y., McLaren, S., Menzies, A., Mustonen, V., O'Meara, S., Pauporté, I., Pivot, X., Purdie, C.A., Raine, K., Ramakrishnan, K., Rodríguez‐González, F.G., Romieu, G., Sieuwerts, A.M., Simpson, P.T., Shepherd, R., Stebbings, L., Stefansson, O.A., Teague, J., Tommasi, S., Treilleux, I., Van den Eynden, G.G., Vermeulen, P., Vincent‐Salomon, A., Yates, L., Caldas, C., van't Veer, L., Tutt, A., Knappskog, S., Tan, B.K.T., Jonkers, J., Borg, Å., Ueno, N.T., Sotiriou, C., Viari, A., Futreal, P.A., Campbell, P.J., Span, P.N., Van Laere, S., Lakhani, S.R., Eyfjord, J.E., Thompson, A.M., Birney, E., Stunnenberg, H.G., van de Vijver, M.J., Martens, J.W.M., Børresen‐Dale, A.‐L., Richardson, A.L., Kong, G., Thomas, G., and Stratton, M.R. 2016. Landscape of somatic mutations in 560 breast cancer whole‐genome sequences. Nature 534:47‐54. doi: 10.1038/nature17676.
  Pleasance, E.D., Cheetham, R.K., Stephens, P.J., McBride, D.J., Humphray, S.J., Greenman, C.D., Varela, I., Lin, M.‐L., Ordóñez, G.R., Bignell, G.R., Ye, K., Alipaz, J., Bauer, M.J., Beare, D., Butler, A., Carter, R.J., Chen, L., Cox, A.J., Edkins, S., Kokko‐Gonzales, P.I., Gormley, N.A., Grocock, R.J., Haudenschild, C.D., Hims, M.M., James, T., Jia, M., Kingsbury, Z., Leroy, C., Marshall, J., Menzies, A., Mudie, L.J., Ning, Z., Royce, T., Schulz‐Trieglaff, O.B., Spiridou, A., Stebbings, L.A., Szajkowski, L., Teague, J., Williamson, D., Chin, L., Ross, M.T., Campbell, P.J., Bentley, D.R., Futreal, P.A., and Stratton, M.R. 2010. A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 463:191‐196. doi: 10.1038/nature08658.
  Van Loo, P., Nilsen, G., Nordgard, S., Vollan, H., Børresen‐Dale, A.‐L., Kristensen, V., and Lingjærde, O. 2012. Analyzing cancer samples with SNP arrays. In Next Generation Microarray Bioinformatics Methods in Molecular Biology (J. Wang, A.C. Tan, and T. Tian, eds.) pp. 57‐72. Humana Press, Totowa, N.J. Available at:‐1‐61779‐400‐1_4.
  Van Loo, P., Nordgard, S.H., Lingjærde, O.C., Russnes, H.G., Rye, I.H., Sun, W., Weigman, V.J., Marynen, P., Zetterberg, A., Naume, B., Perou, C.M., Børresen‐Dale, A.‐L., and Kristensen, V.N. 2010. Allele‐specific copy number analysis of tumors. Proc. Natl. Acad. Sci. 107:16910‐16915. doi: 10.1073/pnas.1009843107.
Internet Resources
  Repository for Wellcome Trust Sanger Institute Cancer Genome Project public projects.
  ascatNgs Web site, linking to repository.‐z‐researchers/researchers‐v‐y/peter‐van‐loo/software/
  ASCAT Web site.‐CancerGenomics/ascat
  Repository for the core ASCAT algorithm.