Data Storage and Analysis in ArrayExpress and Expression Profiler

Gabriella Rustici¹, Misha Kapushesky¹, Nikolay Kolesnikov¹, Helen Parkinson¹, Ugis Sarkans¹, Alvis Brazma¹

¹ European Bioinformatics Institute (EMBL‐EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 7.13
DOI:  10.1002/0471250953.bi0713s23
Online Posting Date:  September, 2008
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

ArrayExpress at the European Bioinformatics Institute is a public database for MIAME‐compliant microarray and transcriptomics data. It consists of two parts: the ArrayExpress Repository, a public archive of microarray data, and the ArrayExpress Warehouse of Gene Expression Profiles, which contains additionally curated subsets of data from the Repository. Archived experiments can be queried by experimental attributes, such as keywords, species, array platform, publication details, or accession numbers. Gene expression profiles can be queried by gene names and properties, such as Gene Ontology terms, and the matching expression profiles can be visualized. The data can be exported and analyzed using the online data analysis tool Expression Profiler. Its analysis components, including data preprocessing, filtering, identification of differentially expressed genes, clustering methods, ordination‐based techniques, and other statistical tools, are available through integration with the statistical package R. Curr. Protoc. Bioinform. 23:7.13.1‐7.13.27. © 2008 by John Wiley & Sons, Inc.

Keywords: gene expression; microarrays; transcriptomics; public repository; data analysis

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Querying Gene Expression Profiles
  • Basic Protocol 2: Query the AE Repository of Microarray and Transcriptomics Data
  • Basic Protocol 3: How to Upload, Normalize, Analyze, and Visualize Data in Expression Profiler
  • Basic Protocol 4: How to Perform Clustering Analysis in Expression Profiler
  • Basic Protocol 5: How to Calculate Gene Ontology Term Enrichment in Expression Profiler
  • Basic Protocol 6: How to Calculate Chromosome Co‐Localization Probability in Expression Profiler
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
  • Tables
     
 
