Analysis of Heritability Using Genome‐Wide Data

Jacob B. Hall¹, William S. Bush¹

¹ Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio
Publication Name:  Current Protocols in Human Genetics
Unit Number:  Unit 1.30
DOI:  10.1002/cphg.25
Online Posting Date:  October, 2016
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


Most analyses of genome‐wide association data test each variant independently, without accounting or adjusting for the genetic background present in the rest of the genome. Newer approaches use representations of genomic sharing either to better account for confounding factors such as population stratification or to directly approximate heritability from the estimated genetic sharing among individuals in a dataset. These approaches use mixed linear models, which relate genotypic sharing to phenotypic sharing, and rely on the efficient computation of genetic sharing among individuals in a dataset. This unit describes the principles and practical application of mixed models for the analysis of genome‐wide association study data. © 2016 by John Wiley & Sons, Inc.

Keywords: mixed‐model analysis; heritability; GCTA
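
The genetic sharing mentioned in the abstract is commonly summarized as a genetic relationship matrix (GRM). The following is a minimal sketch, not the unit's own protocol, of one widely used estimator in the style of GCTA (Yang et al., 2011a): each SNP is centered by twice its sample allele frequency, scaled by its expected binomial standard deviation, and the standardized genotypes are averaged over SNPs. The function name, the toy genotypes, and the decision to drop monomorphic SNPs are assumptions made for illustration only.

```python
import numpy as np


def genetic_relationship_matrix(genotypes):
    """Estimate a GRM from allele counts (hypothetical helper, illustration only).

    genotypes: array of shape (n_individuals, n_snps) holding 0/1/2 counts of
    the reference allele, with no missing values (an assumption of this sketch).
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0              # sample allele frequency per SNP
    keep = (p > 0.0) & (p < 1.0)          # monomorphic SNPs carry no information
    g, p = g[:, keep], p[keep]
    # Standardize each SNP: center by 2p, scale by sqrt(2p(1 - p)).
    z = (g - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    # Average the per-SNP cross-products over all retained SNPs.
    return z @ z.T / z.shape[1]


# Toy data: 20 simulated unrelated individuals typed at 500 SNPs.
rng = np.random.default_rng(0)
freqs = rng.uniform(0.1, 0.9, size=500)
toy_genotypes = rng.binomial(2, freqs, size=(20, 500))
grm = genetic_relationship_matrix(toy_genotypes)
```

In a mixed linear model, a matrix of this kind would serve as the covariance structure of the random genetic effect. For simulated unrelated individuals such as these, the off-diagonal entries should scatter around zero. Real analyses would normally rely on dedicated software such as GCTA or PLINK, which also handle missing genotypes and large sample sizes.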


Table of Contents

  • Key Concepts
  • Estimating Genetic Relationship Matrices
  • Mixed‐Model Analyses
  • Estimating Trait Variance Explained Using GRMs
  • Discussion of Strategies
  • Summary
  • Acknowledgments
  • Literature Cited
  • Figures
  • Tables

Literature Cited

  Abecasis, G.R., Cherny, S.S., Cookson, W.O., and Cardon, L.R. 2001. GRR: Graphical representation of relationship errors. Bioinformatics 17:742‐743. doi: 10.1093/bioinformatics/17.8.742.
  Abecasis, G.R., Auton, A., Brooks, L.D., DePristo, M.A., Durbin, R.M., Handsaker, R.E., Kang, H.M., Marth, G.T., and McVean, G.A. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491:56‐65. doi: 10.1038/nature11632.
  Barrett, J.C. and Cardon, L.R. 2006. Evaluating coverage of genome‐wide association studies. Nat. Genet. 38:659‐662. doi: 10.1038/ng1801.
  Boomsma, D., Busjahn, A., and Peltonen, L. 2002. Classical twin studies and beyond. Nat. Rev. Genet. 3:872‐882. doi: 10.1038/nrg932.
  Bush, W.S., Sawcer, S.J., de Jager, P.L., Oksenberg, J.R., McCauley, J.L., Pericak‐Vance, M.A., and Haines, J.L. 2010. Evidence for polygenic susceptibility to multiple sclerosis–the shape of things to come. Am. J. Hum. Genet. 86:621‐625. doi: 10.1016/j.ajhg.2010.02.027.
  Chang, C.C., Chow, C.C., Tellier, L.C., Vattikuti, S., Purcell, S.M., and Lee, J.J. 2015. Second‐generation PLINK: Rising to the challenge of larger and richer datasets. Gigascience 4:7. doi: 10.1186/s13742‐015‐0047‐8.
  Davis, L.K., Yu, D., Keenan, C.L., Gamazon, E.R., Konkashbaev, A.I., Derks, E.M., Neale, B.M., Yang, J., Lee, S.H., Evans, P., Barr, C.L., Bellodi, L., Benarroch, F., Berrio, G.B., Bienvenu, O.J., Bloch, M.H., Blom, R.M., Bruun, R.D., Budman, C.L., Camarena, B., Campbell, D., Cappi, C., Cardona Silgado, J.C., Cath, D.C., Cavallini, M.C., Chavira, D.A., Chouinard, S., Conti, D.V., Cook, E.H., Coric, V., Cullen, B.A., Deforce, D., Delorme, R., Dion, Y., Edlund, C.K., Egberts, K., Falkai, P., Fernandez, T.V., Gallagher, P.J., Garrido, H., Geller, D., Girard, S.L., Grabe, H.J., Grados, M.A., Greenberg, B.D., Gross‐Tsur, V., Haddad, S., Heiman, G.A., Hemmings, S.M., Hounie, A.G., Illmann, C., Jankovic, J., Jenike, M.A., Kennedy, J.L., King, R.A., Kremeyer, B., Kurlan, R., Lanzagorta, N., Leboyer, M., Leckman, J.F., Lennertz, L., Liu, C., Lochner, C., Lowe, T.L., Macciardi, F., McCracken, J.T., McGrath, L.M., Mesa Restrepo, S.C., Moessner, R., Morgan, J., Muller, H., Murphy, D.L., Naarden, A.L., Ochoa, W.C., Ophoff, R.A., Osiecki, L., Pakstis, A.J., Pato, M.T., Pato, C.N., Piacentini, J., Pittenger, C., Pollak, Y., Rauch, S.L., Renner, T.J., Reus, V.I., Richter, M.A., Riddle, M.A., Robertson, M.M., Romero, R., Rosàrio, M.C., Rosenberg, D., Rouleau, G.A., Ruhrmann, S., Ruiz‐Linares, A., Sampaio, A.S., Samuels, J., Sandor, P., Sheppard, B., Singer, H.S., Smit, J.H., Stein, D.J., Strengman, E., Tischfield, J.A., Valencia Duarte, A.V., Vallada, H., Van Nieuwerburgh, F., Veenstra‐Vanderweele, J., Walitza, S., Wang, Y., Wendland, J.R., Westenberg, H.G., Shugart, Y.Y., Miguel, E.C., McMahon, W., Wagner, M., Nicolini, H., Posthuma, D., Hanna, G.L., Heutink, P., Denys, D., Arnold, P.D., Oostra, B.A., Nestadt, G., Freimer, N.B., Pauls, D.L., Wray, N.R., Stewart, S.E., Mathews, C.A., Knowles, J.A., Cox, N.J., and Scharf, J.M. 2013. Partitioning the heritability of Tourette syndrome and obsessive compulsive disorder reveals differences in genetic architecture. PLoS Genet. 9:e1003864. doi: 10.1371/journal.pgen.1003864.
  Dempster, E.R. and Lerner, I.M. 1950. Heritability of threshold characters. Genetics 35:212‐236.
  Duchateau, L., Janssen, P., and Rowlands, J. 1998. Linear mixed models. An introduction with applications in veterinary research. International Livestock Research Institute. Nairobi, Kenya.
  Dueker, N.D. and Pericak‐Vance, M.A. 2014. Analysis of genetic linkage data for mendelian traits. Curr. Protoc. Hum. Genet. 83:1.4.1‐1.4.31. doi: 10.1002/0471142905.hg0104s83.
  Edwards, T.L. and Gao, X. 2012. Methods for detecting and correcting for population stratification. Curr. Protoc. Hum. Genet. 73:1.22.1‐1.22.14. doi: 10.1002/0471142905.hg0122s73.
  Eu‐Ahsunthornwattana, J., Miller, E.N., Fakiola, M., Jeronimo, S.M.B., Blackwell, J.M., and Cordell, H.J. 2014. Comparison of methods to account for relatedness in genome‐wide association studies with family‐based data. PLoS Genet. 10:e1004445. doi: 10.1371/journal.pgen.1004445.
  Falconer, D.S. 1967. The inheritance of liability to diseases with variable age of onset, with particular reference to diabetes mellitus. Ann. Hum. Genet. 31:1‐20. doi: 10.1111/j.1469‐1809.1967.tb02015.x.
  Falconer, D.S., and Mackay, T.F. 1996. Introduction to Quantitative Genetics, 4th ed. Pearson. New York.
  Flegal, K.M., Carroll, M.D., Kit, B.K., and Ogden, C.L. 2012. Prevalence of obesity and trends in the distribution of body mass index among US adults, 1999‐2010. J. Am. Med. Assoc. 307:491‐497. doi: 10.1001/jama.2012.39.
  Fritsche, L.G., Chen, W., Schu, M., Yaspan, B.L., Yu, Y., Thorleifsson, G., Zack, D.J., Arakawa, S., Cipriani, V., Ripke, S., Igo, R.P. Jr., Buitendijk, G.H., Sim, X., Weeks, D.E., Guymer, R.H., Merriam, J.E., Francis, P.J., Hannum, G., Agarwal, A., Armbrecht, A.M., Audo, I., Aung, T., Barile, G.R., Benchaboune, M., Bird, A.C., Bishop, P.N., Branham, K.E., Brooks, M., Brucker, A.J., Cade, W.H., Cain, M.S., Campochiaro, P.A., Chan, C.C., Cheng, C.Y., Chew, E.Y., Chin, K.A., Chowers, I., Clayton, D.G., Cojocaru, R., Conley, Y.P., Cornes, B.K., Daly, M.J., Dhillon, B., Edwards, A.O., Evangelou, E., Fagerness, J., Ferreyra, H.A., Friedman, J.S., Geirsdottir, A., George, R.J., Gieger, C., Gupta, N., Hagstrom, S.A., Harding, S.P., Haritoglou, C., Heckenlively, J.R., Holz, F.G., Hughes, G., Ioannidis, J.P., Ishibashi, T., Joseph, P., Jun, G., Kamatani, Y., Katsanis, N., Keilhauer, C.N., Khan, J.C., Kim, I.K., Kiyohara, Y., Klein, B.E., Klein, R., Kovach, J.L., Kozak, I., Lee, C.J., Lee, K.E., Lichtner, P., Lotery, A.J., Meitinger, T., Mitchell, P., Mohand‐Saïd, S., Moore, A.T., Morgan, D.J., Morrison, M.A., Myers, C.E., Naj, A.C., Nakamura, Y., Okada, Y., Orlin, A., Ortube, M.C., Othman, M.I., Pappas, C., Park, K.H., Pauer, G.J., Peachey, N.S., Poch, O., Priya, R.R., Reynolds, R., Richardson, A.J., Ripp, R., Rudolph, G., Ryu, E., Sahel, J.A., Schaumberg, D.A., Scholl, H.P., Schwartz, S.G., Scott, W.K., Shahid, H., Sigurdsson, H., Silvestri, G., Sivakumaran, T.A., Smith, R.T., Sobrin, L., Souied, E.H., Stambolian, D.E., Stefansson, H., Sturgill‐Short, G.M., Takahashi, A., Tosakulwong, N., Truitt, B.J., Tsironi, E.E., Uitterlinden, A.G., van Duijn, C.M., Vijaya, L., Vingerling, J.R., Vithana, E.N., Webster, A.R., Wichmann, H.E., Winkler, T.W., Wong, T.Y., Wright, A.F., Zelenika, D., Zhang, M., Zhao, L., Zhang, K., Klein, M.L., Hageman, G.S., Lathrop, G.M., Stefansson, K., Allikmets, R., Baird, P.N., Gorin, M.B., Wang, J.J., Klaver, C.C., Seddon, J.M., Pericak‐Vance, M.A., Iyengar, S.K., Yates, J.R., Swaroop, A., Weber, B.H., Kubo, M., Deangelis, M.M., Léveillard, T., Thorsteinsdottir, U., Haines, J.L., Farrer, L.A., Heid, I.M., Abecasis, G.R., and AMD Gene Consortium. 2013. Seven new loci associated with age‐related macular degeneration. Nat. Genet. 45:433‐439. doi: 10.1038/ng.2578.
  Green, B.F. and Tukey, J.W. 1960. Complex analyses of variance: General problems. Psychometrika 25:127‐152. doi: 10.1007/BF02288577.
  Hall, J.B., Cooke Bailey, J.N., Hoffman, J.D., Pericak‐Vance, M.A., Scott, W.K., Kovach, J.L., Schwartz, S.G., Agarwal, A., Brantley, M.A., Haines, J.L., and Bush, W.S. 2015. Estimating cumulative pathway effects on risk for age‐related macular degeneration using mixed linear models. BMC Bioinformatics 16:329. doi: 10.1186/s12859‐015‐0760‐4.
  Hancock, D.B. and Scott, W.K. 2012. Population‐based case‐control association studies. Curr. Protoc. Hum. Genet. 74:1.17.1‐1.17.20. doi: 10.1002/0471142905.hg0117s74.
  Hayeck, T.J., Zaitlen, N.A., Loh, P.‐R., Vilhjalmsson, B., Pollack, S., Gusev, A., Yang, J., Chen, G.‐B., Goddard, M.E., Visscher, P.M., Patterson, N., and Price, A.L. 2015. Mixed model with correction for case‐control ascertainment increases association power. Am. J. Hum. Genet. 96:720‐730. doi: 10.1016/j.ajhg.2015.03.004.
  Hayes, B.J., Visscher, P.M., and Goddard, M.E. 2009. Increased accuracy of artificial selection by using the realized relationship matrix. Genet. Res. 91:47‐60. doi: 10.1017/S0016672308009981.
  Hill, W.G., Goddard, M.E., and Visscher, P.M. 2008. Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet. 4:e1000008. doi: 10.1371/journal.pgen.1000008.
  Kreft, I.G.G., Kreft, I., and de Leeuw, J. 1998. Introducing Multilevel Modeling. SAGE Publications. London, England.
  Lambert, G., Tsinajinnie, D., and Duggan, D. 2013. Single nucleotide polymorphism genotyping using BeadChip microarrays. Curr. Protoc. Hum. Genet. 78:2.9.1‐2.9.34. doi: 10.1002/0471142905.hg0209s78.
  LaMotte, L.R. 2014. Fixed‐, Random‐, and Mixed‐Effects Models. Wiley StatsRef: Statistics Reference Online.
  Lee, S.H., Wray, N.R., Goddard, M.E., and Visscher, P.M. 2011. Estimating missing heritability for disease from genome‐wide association studies. Am. J. Hum. Genet. 88:294‐305. doi: 10.1016/j.ajhg.2011.02.002.
  Legarra, A., Aguilar, I., and Misztal, I. 2009. A relationship matrix including full pedigree and genomic information. J. Dairy Sci. 92:4656‐4663. doi: 10.3168/jds.2009‐2061.
  Li, Y., Willer, C.J., Ding, J., Scheet, P., and Abecasis, G.R. 2010. MaCH: Using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet. Epidemiol. 34:816‐834. doi: 10.1002/gepi.20533.
  Liu, Y., Nyunoya, T., Leng, S., Belinsky, S.A., Tesfaigzi, Y., and Bruse, S. 2013. Softwares and methods for estimating genetic ancestry in human populations. Hum. Genomics 7:1. doi: 10.1186/1479‐7364‐7‐1.
  Malinowski, J., Goodloe, R., Brown‐Gentry, K., and Crawford, D.C. 2015. Cryptic relatedness in epidemiologic collections accessed for genetic association studies: Experiences from the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study and the National Health and Nutrition Examination Surveys (NHANES). Front. Genet. 6:317. doi: 10.3389/fgene.2015.00317.
  Miclaus, K., Wolfinger, R., Vega, S., Chierici, M., Furlanello, C., Lambert, C., Hong, H., Zhang, L., Yin, S., and Goodsaid, F. 2010. Batch effects in the BRLMM genotype calling algorithm influence GWAS results for the Affymetrix 500K array. Pharmacogenomics J. 10:336‐346. doi: 10.1038/tpj.2010.36.
  Porcu, E., Sanna, S., Fuchsberger, C., and Fritsche, L.G. 2013. Genotype imputation in genome‐wide association studies. Curr. Protoc. Hum. Genet. 78:1.25.1‐1.25.14. doi: 10.1002/0471142905.hg0125s78.
  Price, A.L., Patterson, N.J., Plenge, R.M., Weinblatt, M.E., Shadick, N.A., and Reich, D. 2006. Principal components analysis corrects for stratification in genome‐wide association studies. Nat. Genet. 38:904‐909. doi: 10.1038/ng1847.
  Purcell, S.M., Wray, N.R., Stone, J.L., Visscher, P.M., O'Donovan, M.C., Sullivan, P.F., and Sklar, P. 2009. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460:748‐752. doi: 10.1038/nature08185.
  Purcell, S., Neale, B., Todd‐Brown, K., Thomas, L., Ferreira, M.A.R., Bender, D., Maller, J., Sklar, P., de Bakker, P.I.W., Daly, M.J., and Sham, P.C. 2007. PLINK: A tool set for whole‐genome association and population‐based linkage analyses. Am. J. Hum. Genet. 81:559‐575. doi: 10.1086/519795.
  Rahmioğlu, N. and Ahmadi, K.R. 2010. Classical twin design in modern pharmacogenomics studies. Pharmacogenomics 11:215‐226. doi: 10.2217/pgs.09.171.
  Robinson, G.K. 1991. That BLUP is a good thing: The estimation of random effects. Stat. Sci. 6:15‐32. doi: 10.1214/ss/1177011926.
  Searle, S.R., Casella, G., and McCulloch, C.E. 1992. Variance Components. John Wiley & Sons. Hoboken, N.J.
  Snijders, T.A.B. and Bosker, R.J. 1999. Introduction to Multilevel Analysis. SAGE Publications. London, England.
  Speed, D., Hemani, G., Johnson, M.R., and Balding, D.J. 2012. Improved heritability estimation from genome‐wide SNPs. Am. J. Hum. Genet. 91:1011‐1021. doi: 10.1016/j.ajhg.2012.10.010.
  Stevens, E.L., Baugher, J.D., Shirley, M.D., Frelin, L.P., and Pevsner, J. 2012. Unexpected relationships and inbreeding in HapMap phase III populations. PLoS One 7:e49575. doi: 10.1371/journal.pone.0049575.
  Turner, S., Armstrong, L.L., Bradford, Y., Carlson, C.S., Crawford, D.C., Crenshaw, A.T., de Andrade, M., Doheny, K.F., Haines, J.L., Hayes, G., Jarvik, G., Jiang, L., Kullo, I.J., Li, R., Ling, H., Manolio, T.A., Matsumoto, M., McCarty, C.A., McDavid, A.N., Mirel, D.B., Paschall, J.E., Pugh, E.W., Rasmussen, L.V., Wilke, R.A., Zuvich, R.L., and Ritchie, M.D. 2011. Quality control procedures for genome‐wide association studies. Curr. Protoc. Hum. Genet. 68:1.19.1‐1.19.18. doi: 10.1002/0471142905.hg0119s68.
  Verma, S.S., de Andrade, M., Tromp, G., Kuivaniemi, H., Pugh, E., Namjou‐Khales, B., Mukherjee, S., Jarvik, G.P., Kottyan, L.C., Burt, A., Bradford, Y., Armstrong, G.D., Derr, K., Crawford, D.C., Haines, J.L., Li, R., Crosslin, D., and Ritchie, M.D. 2014. Imputation and quality control steps for combining multiple genome‐wide datasets. Front. Genet. 5:370. doi: 10.3389/fgene.2014.00370.
  Welter, D., MacArthur, J., Morales, J., Burdett, T., Hall, P., Junkins, H., Klemm, A., Flicek, P., Manolio, T., Hindorff, L., and Parkinson, H. 2014. The NHGRI GWAS Catalog, a curated resource of SNP‐trait associations. Nucl. Acids Res. 42:D1001‐D1006. doi: 10.1093/nar/gkt1229.
  Yang, J., Lee, S.H., Goddard, M.E., and Visscher, P.M. 2011a. GCTA: A tool for genome‐wide complex trait analysis. Am. J. Hum. Genet. 88:76‐82. doi: 10.1016/j.ajhg.2010.11.011.
  Yang, J., Manolio, T.A., Pasquale, L.R., Boerwinkle, E., Caporaso, N., Cunningham, J.M., de Andrade, M., Feenstra, B., Feingold, E., Hayes, M.G., Hill, W.G., Landi, M.T., Alonso, A., Lettre, G., Lin, P., Ling, H., Lowe, W., Mathias, R.A., Melbye, M., Pugh, E., Cornelis, M.C., Weir, B.S., Goddard, M.E., and Visscher, P.M. 2011b. Genome partitioning of genetic variation for complex traits using common SNPs. Nat. Genet. 43:519‐525. doi: 10.1038/ng.823.
  Yang, J., Zaitlen, N.A., Goddard, M.E., Visscher, P.M., and Price, A.L. 2014. Advantages and pitfalls in the application of mixed‐model association methods. Nat. Genet. 46:100‐106. doi: 10.1038/ng.2876.
  Yaspan, B.L. and Veatch, O.J. 2011. Strategies for pathway analysis from GWAS data. Curr. Protoc. Hum. Genet. 71:1.20.1‐1.20.15. doi: 10.1002/0471142905.hg0120s71.
  Yu, J., Pressoir, G., Briggs, W.H., Vroh Bi, I., Yamasaki, M., Doebley, J.F., McMullen, M.D., Gaut, B.S., Nielsen, D.M., Holland, J.B., Kresovich, S., and Buckler, E.S. 2006. A unified mixed‐model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 38:203‐208. doi: 10.1038/ng1702.