Using QIIME to Analyze 16S rRNA Gene Sequences from Microbial Communities

Justin Kuczynski1, Jesse Stombaugh2, William Anton Walters1, Antonio González3, J. Gregory Caporaso4, Rob Knight2

1 Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado, 2 Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado, 3 Department of Computer Science, University of Colorado, Boulder, Colorado, 4 Department of Computer Science, Northern Arizona University, Flagstaff, Arizona
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 10.7
DOI:  10.1002/0471250953.bi1007s36
Online Posting Date:  December, 2011


QIIME (canonically pronounced “chime”) is a software application that performs microbial community analysis. The name is an acronym for Quantitative Insights Into Microbial Ecology, and the software has been used to analyze and interpret nucleic acid sequence data from fungal, viral, bacterial, and archaeal communities. The following protocols describe how to install QIIME on a single computer and use it to analyze microbial 16S sequence data from nine distinct microbial communities. Curr. Protoc. Bioinform. 36:10.7.1‐10.7.20. © 2011 by John Wiley & Sons, Inc.

Keywords: microbial ecology; 16S; SSU; software; bioinformatics; QIIME

Table of Contents

  • Introduction
  • Basic Protocol 1: Acquiring an Example Study and Demultiplexing DNA Sequences
  • Basic Protocol 2: Picking OTUs, Assigning Taxonomy, Inferring Phylogeny, and Creating an OTU Table
  • Basic Protocol 3: Alpha Diversity within Samples and Rarefaction Curves
  • Basic Protocol 4: Beta Diversity Between Samples and Beta Diversity Plots
  • Support Protocol 1: Installing QIIME via VirtualBox
  • Commentary
  • Literature Cited
  • Figures

  • Figure 10.7.1 A screenshot of the QIIME VirtualBox, with the terminal icon indicated, and a terminal window open.
  • Figure 10.7.2 Contents of the mapping file (Fasting_Map.txt). Note that the SampleIDs contain only letters, numbers, and period characters.
  • Figure 10.7.3 The first few lines of the taxonomy assignment file, showing on each line the OTU identifier, the representative sequence identifier, the taxonomy assigned to that sequence, and the confidence in that assignment.
  • Figure 10.7.4 A visualization of the phylogenetic tree using FigTree. The tips are unlabeled here, but can be inspected interactively.
  • Figure 10.7.5 An OTU table heatmap, showing the relative abundance of each OTU within each microbial community.
  • Figure 10.7.6 Magic‐Table visualization of the OTU table heatmap.
  • Figure 10.7.7 An OTU table heatmap showing taxonomy assignment for each OTU.
  • Figure 10.7.8 An area chart showing the relative abundance of each phylum within each microbial community.
  • Figure 10.7.9 A bar chart of phylum‐level abundance within communities, similar to Figure 10.7.8.
  • Figure 10.7.10 A Web browser window displaying rarefaction plots. The vertical axis displays the diversity of the community, while the horizontal axis displays the number of sequences considered in the diversity calculation. Each line on the figure represents the average of all microbial communities belonging to a group within a category: here the green line represents all fasted mouse communities, and the blue line represents the control communities.
  • Figure 10.7.11 A visualization of bootstrap‐supported hierarchical clustering of the 9 microbial communities under investigation. Note that the fasted mouse communities (PC.6xx) cluster together, and the result is supported by jackknife tests (red implies > 75% support).
  • Figure 10.7.12 A Principal Coordinates plot of the 9 communities, showing jackknife‐supported confidence ellipsoids. The first two principal axes are shown.


Literature Cited
   Arumugam, M., Raes, J., Pelletier, E., Le Paslier, D., Yamada, T., Mende, D.R., Fernandes, G.R., Tap, J., Bruls, T., Batto, J.M., Bertalan, M., Borruel, N., Casellas, F., Fernandez, L., Gautier, L., Hansen, T., Hattori, M., Hayashi, T., Kleerebezem, M., Kurokawa, K., Leclerc, M., Levenez, F., Manichanh, C., Nielsen, H.B., Nielsen, T., Pons, N., Poulain, J., Qin, J., Sicheritz‐Ponten, T., Tims, S., Torrents, D., Ugarte, E., Zoetendal, E.G., Wang, J., Guarner, F., Pedersen, O., de Vos, W.M., Brunak, S., Doré, J.; MetaHIT, Consortium, Antolín, M., Artiguenave, F., Blottiere, H.M., Almeida, M., Brechot, C., Cara, C., Chervaux, C., Cultrone, A., Delorme, C., Denariaz, G., Dervyn, R., Foerstner, K.U., Friss, C., van de Guchte, M., Guedon, E., Haimet, F., Huber, W., van Hylckama‐Vlieg, J., Jamet, A., Juste, C., Kaci, G., Knol, J., Lakhdari, O., Layec, S., Le Roux, K., Maguin, E., Mérieux, A., Melo Minardi, R., M'rini, C., Muller, J., Oozeer, R., Parkhill, J., Renault, P., Rescigno, M., Sanchez, N., Sunagawa, S., Torrejon, A., Turner, K., Vandemeulebrouck, G., Varela, E., Winogradsky, Y., Zeller, G., Weissenbach, J., Ehrlich, S.D., and Bork, P. 2011. Enterotypes of the human gut microbiome. Nature 474:666.
   Caporaso, J.G., Bittinger, K., Bushman, F.D., DeSantis, T.Z., Andersen, G.L., and Knight, R. 2010. PyNAST: A flexible tool for aligning sequences to a template alignment. Bioinformatics 26:266‐267.
   Caporaso, J.G., Lauber, C.L., Walters, W.A., Berg‐Lyons, D., Lozupone, C.A., Turnbaugh, P.J., Fierer, N., and Knight, R. 2011. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc. Natl. Acad. Sci. U.S.A. 108:4516.
   Crawford, P.A., Crowley, J.R., Sambandam, N., Muegge, B.D., Costello, E.K., Hamady, M., Knight, R., and Gordon, J.I. 2009. Regulation of myocardial ketone body metabolism by the gut microbiota during nutrient deprivation. Proc. Natl. Acad. Sci. U.S.A. 106:11276‐11281.
   Edgar, R.C. 2004. MUSCLE: A multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5:113.
   Legendre, P. and Legendre, L. 1998. Numerical Ecology. Elsevier Science, New York.
   Price, M.N., Dehal, P.S., and Arkin, A.P. 2010. FastTree 2: Approximately maximum‐likelihood trees for large alignments. PLoS One 5:e9490.
   Quail, M.A., Kozarewa, I., Smith, F.A., Scally, P., Stephens, J., Durbin, R., Swerdlow, H., and Turner, D.J. 2008. A large genome center's improvements to the Illumina sequencing system. Nat. Methods 5:1005‐1010.
   Schwartz, D.C. and Waterman, M.S. 2010. New generations: Sequencing machines and their computational challenges. J. Comp. Sci. Tech. 25:3‐9.
   Wang, Q., Garrity, G.M., Tiedje, J.M., and Cole, J.R. 2007. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl. Environ. Microbiol. 73:5261.