Using the Ensembl Genome Server to Browse Genomic Sequence Data

Xosé M. Fernández‐Suárez1, Michael K. Schuster1

1 EMBL‐European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 1.15
DOI:  10.1002/0471250953.bi0115s30
Online Posting Date:  June, 2010
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


The Ensembl project provides a comprehensive source of automatic annotation of the human genome sequence, as well as the genomes of other species of biomedical interest, with confirmed gene predictions that have been integrated with external data sources. This unit describes how to use the Ensembl genome browser, the public interface of the project. It explains how to find a gene or protein of interest, how to get additional information and external links, and how to use the comparative genomic data. Curr. Protoc. Bioinform. 30:1.15.1‐1.15.48. © 2010 by John Wiley & Sons, Inc.

Keywords: computer graphics; databases; genetic; genetic variation; genomics; sequence homology; genome; genome sequence

Table of Contents

  • Introduction
  • Basic Protocol 1: Search by Text/Keyword/Gene Name
  • Basic Protocol 2: Examining a Gene
  • Basic Protocol 3: Examining a Genomic Location
  • Support Protocol 1: Comparative Genomics: Gene Trees, Orthologues, and Paralogues
  • Support Protocol 2: Comparative Genomics: Pairwise Whole Genome Alignments
  • Support Protocol 3: Comparative Genomics: Multiple Whole Genome Alignments
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures



Internet Resources
  Ensembl project home page
  BioMart Project
  Vertebrate Genome Annotation (VEGA) at Sanger Institute
  HUGO Gene Nomenclature Committee (HGNC)
  Gene Ontology Consortium
  Distributed Annotation System (DAS) and BioDAS
  cpgreport program, written by Gos Micklem
  RepeatMasker program