In Silico Functional Annotation of Genomic Variation

Mariusz Butkiewicz¹, William S. Bush¹

¹ Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio
Publication Name: Current Protocols in Human Genetics
Unit Number: Unit 6.15
DOI: 10.1002/0471142905.hg0615s88
Online Posting Date: January 2016

This unit describes the concepts and practical techniques for annotating genomic variants in the human genome to estimate their functional significance. With the rapid increase of available whole exome and whole genome sequencing information for human studies, annotation techniques have become progressively more important for highlighting and prioritizing nucleotide variants and their potential impact on genes and other genetic constructs. Here, we present an overview of different types of variant annotation approaches and elaborate on their foundations, assumptions, and the downstream consequences of their use. Computational approaches and tools to assign annotations and to identify variants are reviewed. Further, the general philosophy of assigning potential function to a genetic change within the biological context of a disease is discussed. © 2016 by John Wiley & Sons, Inc.

Keywords: genome annotation; functional prediction; sequence analysis; variant annotation

Table of Contents

  • Introduction
  • Goals of Annotation
  • Algorithmic Approaches to Genomic Annotation
  • Non‐Algorithmic Resources For Genomic Annotation
  • Summary
  • Acknowledgments
  • Literature Cited
  • Tables
