The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses

Gil Stelzer1, Naomi Rosen1, Inbar Plaschkes2, Shahar Zimmerman3, Michal Twik3, Simon Fishilevich3, Tsippi Iny Stein3, Ron Nudel3, Iris Lieder2, Yaron Mazor2, Sergey Kaplan2, Dvir Dahary4, David Warshawsky5, Yaron Guan‐Golan5, Asher Kohn5, Noa Rappaport3, Marilyn Safran3, Doron Lancet6

1 These authors contributed equally to the paper; 2 LifeMap Sciences Ltd, Tel Aviv; 3 Department of Molecular Genetics, Weizmann Institute of Science, Rehovot; 4 Toldot Genetics Ltd, Hod Hasharon; 5 LifeMap Sciences Inc, Marshfield, Massachusetts; 6 Corresponding author
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 1.30
DOI: 10.1002/cpbi.5
Online Posting Date: June 2016
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

GeneCards, the human gene compendium, enables researchers to effectively navigate and inter‐relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure that facilitates faster data updates, better‐targeted data queries, and a friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene‐disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression. VarElect, another suite member, is a phenotype prioritizer for next‐generation sequencing that leverages the GeneCards and MalaCards knowledgebase. It automatically infers direct and indirect scored associations between hundreds or even thousands of variant‐containing genes and disease phenotype terms. VarElect's capabilities, used either independently or within TGex, our comprehensive variant analysis pipeline, help prepare for the challenge of clinical projects that involve thousands of exome/genome NGS analyses. © 2016 by John Wiley & Sons, Inc.

Keywords: biological database; bioinformatics; diseases; GeneCards; gene prioritization; human genes; integrated information retrieval; next generation sequencing; VarElect

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Searching and Browsing GeneCards
  • Basic Protocol 2: Exploring a GeneCard
  • Basic Protocol 3: Using VarElect
  • Accessibility
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
  • Tables
     
 

Materials

Basic Protocol 1: Searching and Browsing GeneCards

  Necessary Resources
  • An up‐to‐date Web browser such as Google Chrome, Mozilla Firefox, Microsoft Edge, or Apple Safari

Basic Protocol 2: Exploring a GeneCard

  Necessary Resources
  • An up‐to‐date Web browser such as Google Chrome, Mozilla Firefox, Microsoft Edge, or Apple Safari

Basic Protocol 3: Using VarElect

  Necessary Resources
  • An up‐to‐date Web browser such as Google Chrome, Mozilla Firefox, Microsoft Edge, or Apple Safari


Internet Resources
  http://www.genecards.org/
  GeneCards is a compendium of human genes that provides genomic, proteomic, transcriptomic, genetic, and functional information on all known and predicted human genes. It is developed and maintained by the Lancet lab in the Department of Molecular Genetics at the Weizmann Institute of Science.
  http://varelect.genecards.org
  VarElect is a Variant Election application for disease/phenotype‐dependent gene variant prioritization. It is an effective and user‐friendly tool for analyzing variant‐containing genes following Next‐Generation Sequencing (“NGS”) experiments. VarElect can rapidly prioritize such genes according to selected disease/phenotype‐gene associations.
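
  The idea of ranking variant‐containing genes by direct and indirect associations with phenotype terms can be illustrated with a toy sketch. This is not VarElect's scoring method, which is not described on this page. It assumes a hypothetical setup in which each gene has a free‐text annotation and a set of linked genes (for example, genes sharing a pathway), and it uses an arbitrary 0.5 discount for indirect hits. The function name, data structures, and gene names are all placeholders.

```python
from typing import Dict, List, Set, Tuple


def prioritize_genes(
    variant_genes: List[str],
    annotations: Dict[str, str],
    gene_links: Dict[str, Set[str]],
    phenotype_terms: List[str],
    indirect_discount: float = 0.5,  # arbitrary illustrative value
) -> List[Tuple[str, str, float]]:
    """Toy ranking: direct keyword hits first, then discounted indirect hits.

    Hypothetical illustration only; not the VarElect algorithm.
    """
    terms = [t.lower() for t in phenotype_terms]

    def direct_score(gene: str) -> int:
        # Count occurrences of each phenotype term in the gene's annotation.
        text = annotations.get(gene, "").lower()
        return sum(text.count(term) for term in terms)

    results: List[Tuple[str, str, float]] = []
    for gene in variant_genes:
        score = direct_score(gene)
        if score > 0:
            results.append((gene, "direct", float(score)))
            continue
        # Indirect: take the best direct score among linked genes, discounted.
        best = max((direct_score(g) for g in gene_links.get(gene, set())), default=0)
        if best > 0:
            results.append((gene, "indirect", indirect_discount * best))

    # Direct associations come first, then higher scores, then gene name.
    results.sort(key=lambda r: (r[1] != "direct", -r[2], r[0]))
    return results


if __name__ == "__main__":
    # Placeholder genes and annotations, invented for the example.
    demo_annotations = {
        "GENE_A": "Associated with deafness.",
        "GENE_B": "A kinase.",
        "GENE_C": "Deafness; syndromic deafness.",
    }
    demo_links = {"GENE_B": {"GENE_C"}}
    for row in prioritize_genes(
        ["GENE_A", "GENE_B"], demo_annotations, demo_links, ["deafness"]
    ):
        print(row)
```

  Genes with no direct or indirect match are simply left out of the ranking in this sketch; a real tool would draw on curated gene‐disease and gene‐gene relations rather than plain keyword counts.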