The UCSC Genome Browser

Donna Karolchik1, Angie S. Hinrichs1, W. James Kent1

1 Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 1.4
DOI:  10.1002/0471250953.bi0104s40
Online Posting Date:  December, 2012
The University of California Santa Cruz (UCSC) Genome Browser is a popular Web‐based tool for quickly displaying a requested portion of a genome at any scale, accompanied by a series of aligned annotation “tracks.” The annotations generated by the UCSC Genome Bioinformatics Group and external collaborators include gene predictions, mRNA and expressed sequence tag alignments, simple nucleotide polymorphisms, expression and regulatory data, phenotype and variation data, and pairwise and multiple‐species comparative genomics data. All information relevant to a region is presented in one window, facilitating biological analysis and interpretation. The database tables underlying the Genome Browser tracks can be viewed, downloaded, and manipulated using another Web‐based application, the UCSC Table Browser. Users can upload personal datasets in a wide variety of formats as custom annotation tracks in both browsers for research or educational purposes. This unit describes how to use the Genome Browser and Table Browser for genome analysis, download the underlying database tables, and create and display custom annotation tracks. Curr. Protoc. Bioinform. 40:1.4.1‐1.4.33. © 2012 by John Wiley & Sons, Inc.

Keywords: Genome Browser; Table Browser; human genome; genome analysis; comparative genomics; human variation; next‐generation sequencing; human genetics analysis; biological databases; BAM; bioinformatics; bioinformatics fundamentals

Table of Contents

  • Introduction
  • Basic Protocol 1: Using the UCSC Genome Browser
  • Support Protocol 1: Creating a Custom Annotation Track
  • Support Protocol 2: Using the UCSC Table Browser
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
Basic Protocol 1: Using the UCSC Genome Browser

  Necessary Resources
  • Unix, Windows, or Macintosh workstation with an Internet connection and a minimum display resolution of 800 × 600 pixels
  • An up‐to‐date Internet browser that supports JavaScript, such as Firefox 16.0 or higher; Internet Explorer 7.0 or higher; Chrome 22.0 or higher; or Safari 5.1 or higher; browser must have cookies enabled
  • Figure 1.4.1 The Genome Browser Gateway page, set up to span the region of chromosome 4 (chr4:41746100‐41750987) in the February 2009 hg19 human assembly (GRCh37) that corresponds to the location of the PHOX2B gene. The display range can be set to the position of a specific gene by typing the name into the Search Term text box. The user can generate a list of tracks containing certain attributes by clicking the Track Search button. Custom annotation tracks (see ) can be uploaded by clicking the Add Custom Tracks button. The initial Genome Browser display may be configured by clicking the Configure Tracks and Display button. The lower portion of this page (not shown) displays a description of the selected assembly, relevant links, and examples of queries that may be entered in the Search Term box.
  • Figure 1.4.2 The Genome Browser annotation track page zoomed out to display the PHOX2B gene and its 5′ and 3′ flanking regions on human chromosome 4 (chr4:41744878‐41752209) in the Feb. 2009 assembly (GRCh37/hg19). The navigation and configuration buttons are visible above and below the image. The red rectangle in the ideogram directly above the annotation tracks image indicates the location of the currently displayed region of the chromosome. The Common SNPs (132) track visibility has been changed from dense to pack to show individual SNPs, some of which are colored according to gene region (e.g., UTR or coding‐nonsynonymous), and the Flagged SNPs (132) track, which shows SNPs flagged as clinically associated in dbSNP, has been added by changing its display visibility to pack. The Phenotype and Disease Associations group has been opened, and three tracks from the group have been added to the display by changing their visibilities from hide to pack: GAD View, OMIM Genes, and OMIM AV SNPs. PHOX2B is a developmental gene that has also been associated with cancer; move the mouse over the PHOX2B item in the GAD View track to see a list of diseases associated with the gene. In the Vertebrate Multiz Alignment & Conservation track, note the areas of high conservation peaking in the upstream region (to the right because PHOX2B is on the antisense strand), UTRs, and most exons, as well as part of the first intron.
  • Figure 1.4.3 The Genome Browser view displaying the bases surrounding the SNP rs2108622 on chr19 (chr19:15990375‐15990487) in the Feb. 2009 human assembly (GRCh37/hg19). To view this region, enter rs2108622 in the search box, choose one of the three results in the GWAS Catalog track, and then click the Base button in the Zoom navigation section on the annotation track page. This zooms in the display to a level where single‐nucleotide bases can be studied; note the bases (A, C, G, T) drawn in the Base Position track, Conservation track, and HapMap SNPs. The orange numbers above the multi‐species alignment in the Conservation track give the number of bases present in the other species, but not in the human reference, where an orange tick mark appears below. In the HapMap SNPs track, the major allele in each population is displayed instead of the usual colored box.
  • Figure 1.4.4 The Genome Browser annotation track page displaying chromosome bands 22q13.32 and 22q13.33 on chromosome 22 (chr22:48400001‐51304566) in the Feb. 2009 human assembly (GRCh37/hg19). Several tracks useful for the display of large regions have been made visible: from the Mapping and Sequencing Tracks group, Chromosome Bands and Gap; from the Phenotype and Disease Associations group, GAD View, OMIM Genes, and RGD Human QTLs; and from the Variation group, Flagged SNPs (132), Mult. SNPs (132), HGDP Allele Freq, HapMap SNPs, and DGV Structural Variation. Squish display mode (see , step 5) has been set for UCSC Genes and DGV Structural Variation to show the density of items in those tracks along the genome. Several tracks have been hidden because they have so many items in this large region that they would display as solid bars in dense mode, or take up large amounts of vertical space if displayed in pack or squish mode.
  • Figure 1.4.5 An extended DNA Case/Color Options request to display the DNA for the chr4:41749250‐41749802 region of the Feb. 2009 (GRCh37/hg19) human assembly. This configuration sets up a display that will show UCSC Genes in uppercase, all other regions in lowercase, and Spliced ESTs in varying intensities of green, depending on the level of coverage. Common SNPs are shown in bold, and Flagged SNPs are displayed in bold and underlined.
  • Figure 1.4.6 Output from the DNA display configurations set up in Figure . Exons are shown in uppercase. Nucleotides covered by a single EST appear darker green on the screen, while regions with more EST alignments appear progressively brighter, saturating at four ESTs. Common and Flagged SNPs are called out.
  • Figure 1.4.7 A BLAT search set up to align the FASTA sequence in the text box against the Feb. 2009 (GRCh37/hg19) human genome assembly. This sequence was obtained by copying and pasting the output from the Get DNA search illustrated in Figures and .
  • Figure 1.4.8 The results returned by the BLAT search shown in Figure . Clicking on the Browser link for a given line will display the data in the Genome Browser; the Details link will display a page showing a base‐by‐base view of the alignment to the genome.
  • Figure 1.4.9 Sample custom annotation tracks containing BED, PSL, and GFF data formats. To load correctly, the track line data in the PSL and GFF examples must be tab‐separated. Some of the line breaks shown in the BED and PSL examples are artificial (to make the text fit on the page); browser, track, and data lines may not contain internal line breaks.
  • Figure 1.4.10 The annotation track that displays when the BED track example in Figure is uploaded into the Genome Browser. Note that the lower score value in the ItemB data results in lighter shading of this feature.
  • Figure 1.4.11 An example of a custom annotation track definition for an indexed BAM file that resides on the NCBI FTP server specified by the bigDataUrl attribute. The line breaks are artificial (to make the text fit on the page). No data lines follow the track definition line because the data are retrieved (as needed) from the remote BAM file named in the bigDataUrl setting. BigWig, BigBed and tabix‐indexed VCF custom tracks have a similar structure.
  • Figure 1.4.12 The track display of the uploaded BAM format custom track file shown in Figure .
  • Figure 1.4.13 The Table Browser tool provides access to the database tables underlying the Genome Browser annotations; in this case, the chromosome 7 data in the knownGene table on the Feb. 2009 human genome assembly (GRCh37/hg19).
  • Figure 1.4.14 Output from the Table Browser query described in , steps 4 to 6, showing regions of chromosome 7 in the Feb. 2009 (GRCh37/hg19) human genome assembly associated with the identifiers NM_014390, NM_022143, D49487, and NM_018077.
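
The captions above quote Genome Browser positions (for example, chr4:41746100‐41750987) and describe BED and BAM custom tracks. The following Python sketch is not taken from the unit; it illustrates, under stated assumptions, how a browser position might be converted into a BED data line and combined with a track definition line. It assumes that browser positions are 1‐based and inclusive while BED starts are 0‐based with exclusive ends, and that BED scores range from 0 to 1000. The helper names (parse_position, bed_line, track_line), the item names and scores, and the example.org URL are hypothetical and chosen only for illustration.

```python
import re

# Browser positions such as "chr4:41746100-41750987". The character class also
# accepts the typographic hyphen (U+2010) and en dash (U+2013) seen in typeset text.
POSITION = re.compile(r"^(chr\w+):([\d,]+)[-\u2010\u2013]([\d,]+)$")


def parse_position(text):
    """Return (chrom, start, end) from a 1-based, inclusive browser position."""
    match = POSITION.match(text.strip())
    if match is None:
        raise ValueError(f"not a browser position: {text!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start < 1 or start > end:
        raise ValueError(f"invalid range in {text!r}")
    return chrom, start, end


def bed_line(position, name, score):
    """Format one tab-separated BED line (0-based start, exclusive end)."""
    if not 0 <= score <= 1000:
        raise ValueError("BED scores range from 0 to 1000")
    chrom, start, end = parse_position(position)
    return f"{chrom}\t{start - 1}\t{end}\t{name}\t{score}"


def track_line(**attrs):
    """Compose a 'track' definition line from attribute=value pairs."""
    parts = ["track"]
    for key, value in attrs.items():
        text = str(value)
        # Quote values that contain spaces so they stay a single attribute.
        if " " in text:
            text = f'"{text}"'
        parts.append(f"{key}={text}")
    return " ".join(parts)


# Hypothetical items placed on regions quoted in the figure captions.
custom_track = "\n".join([
    track_line(name="sketch", description="BED sketch", useScore=1),
    bed_line("chr4:41746100-41750987", "ItemA", 900),
    bed_line("chr4:41749250-41749802", "ItemB", 300),
])

# A BAM-style definition has no data lines; the URL here is a placeholder.
bam_track = track_line(type="bam", name="bamSketch",
                       bigDataUrl="http://example.org/sample.bam")

print(custom_track)
print(bam_track)
```

With useScore=1 set on the track line, an item with a lower score (ItemB in this sketch) would be expected to draw in a lighter shade, as the Figure 1.4.10 caption notes for its own example. Each track and data line is emitted without internal line breaks, in keeping with the Figure 1.4.9 caption.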


Internet Resources
  The UCSC Genome Bioinformatics and Genome Browser home page.
  The UCSC Genome Browser downloads server.
  The Genome Browser public MySQL server.
  The UCSC Genome Browser User's Guide.
  The UCSC Table Browser User's Guide.
  Information for constructing and uploading a custom annotation track.
  UCSC Genome Browser ENCODE portal.
  User‐editable Web site for sharing information related to the browser.
  Mailing list for questions and discussions about the browser software, database, and genome assemblies.
  Mailing list for announcements about releases of browser software and data, server maintenance, etc.
  Mailing list for questions and discussion about mirroring the UCSC Genome Browser.
