iRegulon and i‐cisTarget: Reconstructing Regulatory Networks Using Motif and Track Enrichment

Annelien Verfaillie1, Hana Imrichova1, Rekins Janky1, Stein Aerts1

1 Laboratory of Computational Biology, Center for Human Genetics, KU Leuven
Publication Name:  Current Protocols in Bioinformatics
Unit Number:  Unit 2.16
DOI:  10.1002/0471250953.bi0216s52
Online Posting Date:  December, 2015
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

Gene expression profiling is often used to identify genes that are co‐expressed in a biological process or disease. Downstream analysis of co‐expressed gene sets with bioinformatics methods can reveal candidate transcription factors (TFs) that co‐regulate these genes, based on the presence of shared TF binding sites. Drawing gene regulatory networks that connect TFs to their predicted target genes can uncover gene modules that implement a particular function. Here, we describe several protocols to analyze any set of co‐expressed genes using iRegulon and i‐cisTarget. These tools perform regulatory sequence analysis (motif discovery) and integrate and mine large collections of existing regulatory data, such as ChIP‐seq, DHS‐seq, and FAIRE‐seq (track discovery). While iRegulon focuses on sets of co‐expressed genes, i‐cisTarget also accepts genomic regions as input. The following protocols describe how to install and use these tools and how to interpret the results, and thus help to create meaningful regulatory networks. © 2015 by John Wiley & Sons, Inc.

Keywords: gene regulatory networks; motif discovery; track enrichment

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Analyzing Gene Expression Data Using iRegulon
  • Support Protocol 1: Installing iRegulon
  • Basic Protocol 2: Analyzing Gene Expression Data Using i‐cisTarget
  • Alternate Protocol 1: Running i‐cisTarget with All Available Feature Databases
  • Alternate Protocol 2: Running i‐cisTarget on ChIP‐seq Peaks as Input
  • Guidelines for Understanding Results
  • Commentary
  • Figures
     
 


Literature Cited
  Aigner, K., Dampier, B., Descovich, L., Mikula, M., Sultan, A., Schreiber, M., Mikulits, W., Brabletz, T., Strand, D., Obrist, P., Sommergruber, W., Schweifer, N., Wernitznig, A., Beug, H., Foisner, R., and Eger, A. 2007. The transcription factor ZEB1 (deltaEF1) promotes tumour cell dedifferentiation by repressing master regulators of epithelial polarity. Oncogene 26:6979‐6988. doi: 10.1038/sj.onc.1210508.
  Culhane, A.C., Schröder, M.S., Sultana, R., Picard, S.C., Martinelli, E.N., Kelly, C., Haibe‐Kains, B., Kapushesky, M., St Pierre, A.A., Flahive, W., Picard, K.C., Gusenleitner, D., Papenhausen, G., O'Connor, N., Correll, M., and Quackenbush, J. 2012. GeneSigDB: A manually curated database and resource for analysis of gene expression signatures. Nucleic Acids Res. 40:D1060‐D1066. doi: 10.1093/nar/gkr901.
  Eden, E., Navon, R., Steinfeld, I., Lipson, D., and Yakhini, Z. 2009. GOrilla: A tool for discovery and visualization of enriched GO Terms in ranked gene lists. BMC Bioinformatics 10:48. doi: 10.1186/1471-2105-10-48.
  Gupta, S., Stamatoyannopoulos, J.A., Bailey, T.L., and Noble, W.S. 2007. Quantifying similarity between motifs. Genome Biol. 8:R24.
  Heinz, S., Benner, C., Spann, N., Bertolino, E., Lin, Y.C., Laslo, P., Cheng, J.X., Murre, C., Singh, H., and Glass, C.K. 2010. Simple combinations of lineage‐determining transcription factors prime cis‐regulatory elements required for macrophage and B cell identities. Mol. Cell 38:576‐589. doi: 10.1016/j.molcel.2010.05.004.
  Huang da, W., Sherman, B.T., and Lempicki, R.A. 2008. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4:44‐57. doi: 10.1038/nprot.2008.211.
  Imrichová, H., Hulselmans, G., Atak, Z.K., Potier, D., and Aerts, S. 2015. i‐cisTarget 2015 update: generalized cis‐regulatory enrichment analysis in human, mouse and fly. Nucleic Acids Res. 43:W57‐W64. doi: 10.1093/nar/gkv395.
  Janky, R., Verfaillie, A., Imrichová, H., Van de Sande, B., Standaert, L., Christiaens, V., Hulselmans, G., Herten, K., Naval Sanchez, M., Potier, D., Svetlichnyy, D., Kalender Atak, Z., Fiers, M., Marine, J.C., and Aerts, S. 2014. iRegulon: from a gene list to a gene regulatory network using large motif and track collections. PLoS Comput. Biol. 10:e1003731. doi: 10.1371/journal.pcbi.1003731.
  Karolchik, D., Hinrichs, A.S., and Kent, W.J. 2012. The UCSC genome browser. Curr. Protoc. Bioinform. 40:1.4:1.4.1‐1.4.33.
  Kwon, A.T., Arenillas, D.J., Hunt, R.W., and Wasserman, W.W. 2012. oPOSSUM‐3: Advanced analysis of regulatory motif over‐representation across genes or ChIP‐Seq datasets. G3: Genes Genomes Genetics 2:987‐1002.
  Mahony, S., Auron, P.E., and Benos, P.V. 2007. DNA familial binding profiles made easy: Comparison of various motif alignment and clustering strategies. PLoS Comput. Biol. 3:e61. doi: 10.1371/journal.pcbi.0030061.
  Roadmap Epigenomics Consortium, Kundaje, A., Meuleman, W., Ernst, J., Bilenky, M., Yen, A., Heravi‐Moussavi, A., Kheradpour, P., Zhang, Z., Wang, J., Ziller, M.J., Amin, V., Whitaker, J.W., Schultz, M.D., Ward, L.D., Sarkar, A., Quon, G., Sandstrom, R.S., Eaton, M.L., Wu, Y.C., Pfenning, A.R., Wang, X., Claussnitzer, M., Liu, Y., Coarfa, C., Harris, R.A., Shoresh, N., Epstein, C.B., Gjoneska, E., Leung, D., Xie, W., Hawkins, R.D., Lister, R., Hong, C., Gascard, P., Mungall, A.J., Moore, R., Chuah, E., Tam, A., Canfield, T.K., Hansen, R.S., Kaul, R., Sabo, P.J., Bansal, M.S., Carles, A., Dixon, J.R., Farh, K.H., Feizi, S., Karlic, R., Kim, A.R., Kulkarni, A., Li, D., Lowdon, R., Elliott, G., Mercer, T.R., Neph, S.J., Onuchic, V., Polak, P., Rajagopal, N., Ray, P., Sallari, R.C., Siebenthall, K.T., Sinnott‐Armstrong, N.A., Stevens, M., Thurman, R.E., Wu, J., Zhang, B., Zhou, X., Beaudet, A.E., Boyer, L.A., De Jager, P.L., Farnham, P.J., Fisher, S.J., Haussler, D., Jones, S.J., Li, W., Marra, M.A., McManus, M.T., Sunyaev, S., Thomson, J.A., Tlsty, T.D., Tsai, L.H., Wang, W., Waterland, R.A., Zhang, M.Q., Chadwick, L.H., Bernstein, B.E., Costello, J.F., Ecker, J.R., Hirst, M., Meissner, A., Milosavljevic, A., Ren, B., Stamatoyannopoulos, J.A., Wang, T., and Kellis, M. 2015. Integrative analysis of 111 reference human epigenomes. Nature 518:317‐330. doi: 10.1038/nature14248.
  Roider, H.G., Manke, T., O'Keeffe, S., Vingron, M., and Haas, S.A. 2009. PASTAA: Identifying transcription factors associated with sets of co‐regulated genes. Bioinformatics 25:435‐442. doi: 10.1093/bioinformatics/btn627.
  Stormo, G.D. 2015. DNA motif databases and their uses. Curr. Protoc. Bioinform. 51:2.15.1‐2.15.6. doi: 10.1002/0471250953.bi0215s51.
  Su, G., Morris, J.H., Demchak, B., and Bader, G.D. 2014. Biological network exploration with Cytoscape 3. Curr. Protoc. Bioinform. 47:8.13.1‐8.13.24.
  Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S., and Mesirov, J.P. 2005. Gene set enrichment analysis: A knowledge‐based approach for interpreting genome‐wide expression profiles. Proc. Natl. Acad. Sci. U.S.A. 102:15545‐15550. doi: 10.1073/pnas.0506580102.
  The ENCODE Project Consortium. 2012. An integrated encyclopedia of DNA elements in the human genome. Nature 489:57‐74. doi: 10.1038/nature11247.
  Vilella, A.J., Severin, J., Ureta‐Vidal, A., Heng, L., Durbin, R., and Birney, E. 2009. EnsemblCompara genetrees: Complete, duplication‐aware phylogenetic trees in vertebrates. Genome Res. 19:327‐335. doi: 10.1101/gr.073585.107.
  Wang, J., Duncan, D., Shi, Z., and Zhang, B. 2013. WEB‐Based GEne SeT AnaLysis Toolkit (WebGestalt): Update 2013. Nucleic Acids Res. 41:W77‐W83. doi: 10.1093/nar/gkt439.
  Xie, X., Lu, J., Kulbokas, E.J., Golub, T.R., Mootha, V., Lindblad‐Toh, K., Lander, E.S., and Kellis, M. 2005. Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals. Nature 434:338‐345. doi: 10.1038/nature03441.
  Yan, J., Enge, M., Whitington, T., Dave, K., Liu, J., Sur, I., Schmierer, B., Jolma, A., Kivioja, T., Taipale, M., and Taipale, J. 2013. Transcription factor binding in human cells occurs in dense clusters formed around cohesin anchor sites. Cell 154:801‐813. doi: 10.1016/j.cell.2013.07.034.
  Zambelli, F., Pesole, G., and Pavesi, G. 2009. Pscan: Finding over‐represented transcription factor binding site motifs in sequences from co‐regulated or co‐expressed genes. Nucleic Acids Res. 37:W247‐W252. doi: 10.1093/nar/gkp464.
  Zambelli, F., Pesole, G., and Pavesi, G. 2014. Using Weeder, Pscan, and PscanChIP for the discovery of enriched transcription factor binding site motifs in nucleotide sequences. Curr. Protoc. Bioinform. 47:2.11.1‐2.11.31.
Internet Resources
  http://www.cytoscape.org/
  Website for downloading and understanding the Cytoscape tool.
  http://iregulon.aertslab.org/
  Website for downloading the iRegulon plugin. Also contains several example analyses and documentation on how to run an analysis.
  https://gbiomed.kuleuven.be/apps/lcb/i-cisTarget/
  Website to run the i‐cisTarget tool. Also contains many example analyses and additional explanation on how to run an analysis.
  http://www.broadinstitute.org/gsea/msigdb/index.jsp
  The Molecular Signatures Database (MSigDB), containing a collection of annotated gene sets.
  https://genome.ucsc.edu/
  UCSC genome browser, used to visualize and access publicly available data.