Genotype Imputation in Genome‐Wide Association Studies

Eleonora Porcu1, Serena Sanna1, Christian Fuchsberger2, Lars G. Fritsche2

1 Istituto di Ricerca Genetica e Biomedica (IRGB), CNR, Monserrato, Cagliari
2 Department of Biostatistics, Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, Michigan
Publication Name:  Current Protocols in Human Genetics
Unit Number:  Unit 1.25
DOI:  10.1002/0471142905.hg0125s78
Online Posting Date:  July, 2013
Imputation is an in silico method that can increase the power of association studies by inferring missing genotypes, harmonizing data sets for meta‐analyses, and increasing the overall number of markers available for association testing. This unit provides an introductory overview of the imputation method and describes a two‐step imputation approach that consists of phasing the study genotypes and then imputing reference panel genotypes into the study haplotypes. Detailed steps for data preparation and quality control illustrate how to run the computationally intensive two‐step imputation with the high‐density reference panels of the 1000 Genomes Project, which currently integrates more than 39 million variants. Additionally, the influence of reference panel selection, input marker density, and imputation settings on imputation quality is demonstrated with a simulated data set to give insight into the crucial points of successful genotype imputation.
Curr. Protoc. Hum. Genet. 78:1.25.1‐1.25.14. © 2013 by John Wiley & Sons, Inc.

Keywords: genome‐wide association studies; imputation; linkage disequilibrium; inference; 1000 Genomes Project; HapMap Project; rare variants; genotyping arrays

Table of Contents

  • Introduction
  • Imputation Methods: Overview
  • Data Preparation
  • Step 1: Prephasing
  • Step 2: Imputation
  • Measuring Imputation Quality
  • Association Testing
  • Conclusions
  • Literature Cited
  • Figures
  • Tables
Literature Cited

Internet Resources
  Tutorial for the MaCH 1.0 program for carrying out genotype imputation.
  Frequently asked questions about the MaCH program.
  Using the minimac program to carry out genotype imputation.
  The 1000 Genomes Imputation Cookbook contains detailed documentation and example scripts for the MaCH+minimac platform.
  The 1000 Genomes Imputation Cookbook contains detailed documentation and example scripts for the IMPUTE2 platform.
  The 1000 Genomes Project Web site.
  The HapMap Project Web site.
  ∼yunmli/software.html
  Web site for Li Group Software.
  HAPGEN software for simulating haplotypes.
