Analysis of Expression Data: An Overview

Anoop Grewal (1), Peter Lambert (1), Jordan Stockton (2)

(1) NextBio, Cupertino, California; (2) Agilent Technologies, Santa Clara, California
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 7.1
DOI:  10.1002/0471250953.bi0701s17
Online Posting Date:  March, 2007
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


After providing a brief introduction to microarray chips and experimental details, this overview discusses analysis techniques. Data analysis from microarray experiments generally involves two parts: acquiring and normalizing the data, and interpreting it. This unit focuses mostly on the latter, as it is less technology-specific.

Keywords: microarray data analysis; genomic expression; transcriptomics


Table of Contents

  • Experimental Design
  • Raw Data Output
  • Data Normalization
  • Data Analysis
  • Informatics and Databases
  • Conclusion
  • Literature Cited
  • Tables



Internet Resources
  The Gene Expression Omnibus (GEO) is a public database of expression data derived from a number of different expression analysis technologies.
  ArrayExpress is a public repository for gene expression data, focused on providing a rich source of experimental background for each experiment set.
  Web site for Biocarta Pathways—interactive graphic models of molecular and cellular pathways.
  Kyoto Encyclopedia of Genes and Genomes.