Galaxy: A Web‐Based Genome Analysis Tool for Experimentalists

Daniel Blankenberg¹, Gregory Von Kuster¹, Nathaniel Coraor¹, Guruprasad Ananda¹, Ross Lazarus¹, Mary Mangan², Anton Nekrutenko¹, James Taylor¹

¹ The Galaxy Team, Pennsylvania State University, University Park, Pennsylvania; ² OpenHelix LLC, Bellevue, Washington
Publication Name: Current Protocols in Molecular Biology
Unit Number: Unit 19.10
DOI: 10.1002/0471142727.mb1910s89
Online Posting Date: January 2010
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

High‐throughput data production has revolutionized molecular biology. However, massive increases in data generation capacity require analysis approaches that are more sophisticated and often very computationally intensive. Thus, making sense of high‐throughput data requires informatics support. Galaxy (http://galaxyproject.org) is a software system that provides this support through a framework that gives experimentalists simple interfaces to powerful tools, while automatically managing the computational details. Galaxy is distributed both as a publicly available Web service, which provides tools for the analysis of genomic, comparative genomic, and functional genomic data, and as a downloadable package that can be deployed in individual laboratories. Either way, it allows experimentalists without informatics or programming expertise to perform complex large‐scale analysis with just a Web browser. Curr. Protoc. Mol. Biol. 89:19.10.1‐19.10.21. © 2010 by John Wiley & Sons, Inc.

Keywords: Galaxy; analysis; bioinformatics; workflow; algorithm; pipeline; genomics; SNPs

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: An Introduction to the Galaxy Approach: Finding Promoters Containing TAF1 Binding Sites Identified From a ChIP‐Seq Experiment
  • Basic Protocol 2: Combining and Filtering Genome Annotations: Finding Exons with the Highest Number of Nucleotide Polymorphisms
  • Support Protocol 1: Saving Results in Galaxy and Sharing Data with Others
  • Basic Protocol 3: Generating a Workflow From a History in Galaxy
  • Support Protocol 2: Modify a Parameter in the Workflow in Galaxy
  • Support Protocol 3: Running Workflows with Galaxy
  • Support Protocol 4: Sharing Workflows with Galaxy
  • Basic Protocol 4: Generating Workflows from Scratch with Galaxy
  • Basic Protocol 5: Extracting Sequences and Alignments with Galaxy: An SNPs in Exons Example
  • Commentary
  • Literature Cited
  • Figures
     
 

Materials

Basic Protocol 1: An Introduction to the Galaxy Approach: Finding Promoters Containing TAF1 Binding Sites Identified From a ChIP‐Seq Experiment

  Materials
  • A file containing genomic coordinates for TAF1‐binding sites from the ChIP‐Seq experiment (an example file can be downloaded at http://galaxy.psu.edu/CPMB/TAF1_ChIP.txt; Kim et al., 2005)
  • An internet‐accessible computer with any modern Web browser (Firefox, Safari, Opera, Internet Explorer)

Basic Protocol 2: Combining and Filtering Genome Annotations: Finding Exons with the Highest Number of Nucleotide Polymorphisms

  Materials
  • An internet‐accessible computer with any modern Web browser (Firefox, Safari, Opera, Internet Explorer)
NOTE: Before starting, it helps to clear the current history so that dataset numbering restarts at 1: open History Options and select Create New. This makes the numbered steps easier to follow.
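For readers who want to see the idea behind this protocol outside the browser, the following is a minimal Python sketch of counting polymorphisms per exon and ranking exons by that count. It is not Galaxy's implementation; the interval layout (0-based, half-open), the toy data, and the function names `count_snps_per_exon` and `top_exons` are all assumptions made for illustration.

```python
# Hypothetical toy exons: (chromosome, start, end, name), 0-based half-open.
exons = [
    ("chr1", 100, 200, "exonA"),
    ("chr1", 300, 450, "exonB"),
    ("chr2", 50, 120, "exonC"),
]

# Hypothetical SNP positions: (chromosome, position), 0-based.
snps = [
    ("chr1", 110), ("chr1", 150), ("chr1", 199), ("chr1", 200),
    ("chr1", 310), ("chr2", 60), ("chr2", 61),
]


def count_snps_per_exon(exons, snps):
    """Return a dict mapping exon name to the number of SNPs inside it."""
    counts = {name: 0 for _, _, _, name in exons}
    for chrom, pos in snps:
        for e_chrom, start, end, name in exons:
            # A SNP falls in an exon if it is on the same chromosome
            # and start <= position < end (half-open interval).
            if chrom == e_chrom and start <= pos < end:
                counts[name] += 1
    return counts


def top_exons(counts, n=1):
    """Return the n exons with the most SNPs, ties broken by name."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


counts = count_snps_per_exon(exons, snps)
ranked = top_exons(counts, n=2)
print(ranked)
```

The nested loop is quadratic and only suitable for toy inputs; genome-scale annotations would need sorted intervals or an interval index, which is the kind of computational detail Galaxy is described as managing for the user.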

Support Protocol 1: Saving Results in Galaxy and Sharing Data with Others

  Materials
  • An internet‐accessible computer with any modern Web browser (Firefox, Safari, Opera, Internet Explorer)
  • Results from Basic Protocol 2
  • A Galaxy account (created by clicking Register in the Galaxy interface); histories must be linked to a user to be stored and shared

Basic Protocol 3: Generating a Workflow From a History in Galaxy

  Materials
  • An internet‐accessible computer with any modern Web browser (Firefox, Safari, Opera, Internet Explorer)
  • History created in Basic Protocol 2
  • A Galaxy account (created by clicking Register in the Galaxy interface); all workflow manipulation in Galaxy requires the user to be logged in with an account

Support Protocol 2: Modify a Parameter in the Workflow in Galaxy

  Materials
  • An internet‐accessible computer with any modern Web browser (Firefox, Safari, Opera, Internet Explorer)
  • Workflow created in Basic Protocol 3

Support Protocol 3: Running Workflows with Galaxy

  Materials
  • An internet‐accessible computer with any modern Web browser (Firefox, Safari, Opera, Internet Explorer)
  • Workflow saved in Basic Protocol 3

Support Protocol 4: Sharing Workflows with Galaxy

  Materials
  • An internet‐accessible computer with any modern Web browser (Firefox, Safari, Opera, Internet Explorer)
  • Workflow created in Basic Protocol 3

Basic Protocol 4: Generating Workflows from Scratch with Galaxy

  Materials
  • An internet‐accessible computer with any modern Web browser (Firefox, Safari, Opera, Internet Explorer)
  • A Galaxy account (created by clicking Register in the Galaxy interface); all workflow manipulation in Galaxy requires the user to be logged in with an account

Basic Protocol 5: Extracting Sequences and Alignments with Galaxy: An SNPs in Exons Example

  Materials
  • An internet‐accessible computer with any modern Web browser (Firefox, Safari, Opera, Internet Explorer)
  • Completed and saved history created in Basic Protocol 2 and Support Protocol 1


Literature Cited

   Karolchik, D., Hinrichs, A.S., Furey, T.S., Roskin, K.M., Sugnet, C.W., Haussler, D., and Kent, W.J. 2004. The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 32:D493‐D496.
   Karolchik, D., Kuhn, R.M., Baertsch, R., Barber, G.P., Clawson, H., Diekhans, M., Giardine, B., Harte, R.A., Hinrichs, A.S., Hsu, F., Miller, W., Pedersen, J.S., Pohl, A., Raney, B.J., Rhead, B., Rosenbloom, K.R., Smith, K.E., Stanke, M., Thakkapallayil, A., Trumbower, H., Wang, T., Zweig, A.S., Haussler, D., and Kent, W.J. 2008. The UCSC Genome Browser Database: 2008 update. Nucleic Acids Res. 36:D773‐D779.
   Kim, T.H., Barrera, L.O., Zheng, M., Qu, C., Singer, M.A., Richmond, T.A., Wu, Y., Green, R.D., and Ren, B. 2005. A high‐resolution map of active promoters in the human genome. Nature 436:876‐880.
   Taylor, J., Schenck, I., Blankenberg, D., and Nekrutenko, A. 2007. Using Galaxy to perform large‐scale interactive data analyses. Curr. Protoc. Bioinformatics 19:10.5.1‐10.5.25.