A Survey of Copy‐Number Variation Detection Tools Based on High‐Throughput Sequencing Data

Ruibin Xi1, Semin Lee2, Peter J. Park3

1 School of Mathematical Sciences and Center for Statistical Science, Peking University, Beijing, China, 2 Center for Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, 3 Division of Genetics, Brigham and Women's Hospital, Boston, Massachusetts
Publication Name:  Current Protocols in Human Genetics
Unit Number:  Unit 7.19
DOI:  10.1002/0471142905.hg0719s75
Online Posting Date:  October, 2012
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

Copy‐number variation (CNV) is a major class of genomic variation with potentially important functional consequences in both normal and diseased populations. Remarkable advances in the development of next‐generation sequencing (NGS) platforms provide an unprecedented opportunity for accurate, high‐resolution characterization of CNVs. In this unit, we give an overview of the available computational tools for detecting CNVs and discuss the comparative advantages and disadvantages of the different approaches. Curr. Protoc. Hum. Genet. 75:7.19.1‐7.19.15. © 2012 by John Wiley & Sons, Inc.

Keywords: structural variation; insertion; deletion; indel; inversion; translocation

     
 

Table of Contents

  • Introduction
  • Overview of CNV Detection Approaches Based on NGS Data
  • Discussion
  • Literature Cited
  • Figures
  • Tables
     
 
