Disease and Phenotype Data at Ensembl

Giulietta M. Spudich1, Xosé M. Fernández‐Suárez1

1 EMBL‐European Bioinformatics Institute, Cambridge, United Kingdom
Publication Name: Current Protocols in Human Genetics
Unit Number: Unit 6.11
DOI: 10.1002/0471142905.hg0611s69
Online Posting Date: February, 2011
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library


Biological databases are an important resource for the life sciences community, but accessing the hundreds of databases that support molecular biology and related fields is a daunting and time‐consuming task. Integrating this information into one access point is a necessity for the community, which includes researchers focusing on human disease. Here we discuss the Ensembl genome browser, which acts as a single entry point, with a graphical user interface, to data from multiple projects, including OMIM, dbSNP, and the NHGRI GWAS catalog. Ensembl provides a comprehensive source of annotation for the human genome, along with other species of biomedical interest. In this unit, we explore how to use the Ensembl genome browser in example queries related to human genetic diseases. Support protocols demonstrate quick sequence export using the BioMart tool. Curr. Protoc. Hum. Genet. 69:6.11.1‐6.11.34 © 2011 by John Wiley & Sons, Inc.

Keywords: computer graphics; databases; genetic variation; genomics; cytogenetics; sequence homology; sequence alignment; informatics; computational biology


Table of Contents

  • Introduction
  • Basic Protocol 1: Exploring an SNP Associated with Hemochromatosis
  • Basic Protocol 2: Exploring a Nonsynonymous Variation in the MYC Gene
  • Basic Protocol 3: Sequence Matches and Individual Genomes
  • Basic Protocol 4: A Cytogeneticist's View
  • Support Protocol 1: Sequence Export
  • Support Protocol 2: Variation Export
  • Commentary
  • Literature Cited
  • Figures


Basic Protocol 1: Exploring an SNP Associated with Hemochromatosis

  Necessary Resources
  • A computer with a connection to the Internet
  • An up‐to‐date Web browser that supports JavaScript, such as Firefox, Safari, or the most recent version of Internet Explorer (at the time of writing, IE7 and IE8)

Basic Protocol 2: Exploring a Nonsynonymous Variation in the MYC Gene

  Necessary Resources
  • A computer with a connection to the Internet
  • An up‐to‐date Web browser that supports JavaScript, such as Firefox, Safari, or the most recent version of Internet Explorer (at the time of writing, IE7 and IE8)

Basic Protocol 3: Sequence Matches and Individual Genomes

  Necessary Resources
  • A computer with a connection to the Internet
  • An up‐to‐date Web browser that supports JavaScript, such as Firefox, Safari, or the most recent version of Internet Explorer (at the time of writing, IE7 and IE8)

Basic Protocol 4: A Cytogeneticist's View

  Necessary Resources
  • A computer with a connection to the Internet
  • An up‐to‐date Web browser that supports JavaScript, such as Firefox, Safari, or the most recent version of Internet Explorer (at the time of writing, IE7 and IE8)

Support Protocol 1: Sequence Export

  Necessary Resources
  • A computer with a connection to the Internet
  • An up‐to‐date Web browser that supports JavaScript, such as Firefox, Safari, or the most recent version of Internet Explorer (at the time of writing, IE7 and IE8)

Support Protocol 2: Variation Export

  Necessary Resources
  • A computer with a connection to the Internet
  • An up‐to‐date Web browser that supports JavaScript, such as Firefox, Safari, or the most recent version of Internet Explorer (at the time of writing, IE7 and IE8)



Internet Resources
  Ensembl project home page.
  Support videos and other tutorials for Ensembl.
  BioMart Project.
  Distributed Annotation System (DAS) and BioDAS.
  dbSNP: a repository of polymorphisms.
  Gene Ontology Consortium.
  Genome Reference Consortium: houses the reference human genome.
  NCBI GWAS catalog.
  An international organization working towards a haplotype map of the human genome.
  HUGO Gene Nomenclature Committee (HGNC).
  InterPro, a collection of protein signatures.
  Online Mendelian Inheritance in Man, a set of human genes and phenotypes.
  A multi‐organism, nonredundant database of sequences.
  UniProtKB, a catalog of information on proteins.
  UniSTS, databank for chromosomal markers.
  Vertebrate Genome Annotation (VEGA) at Sanger Institute.
  International Human Genome Sequencing Consortium.