Using Weeder, Pscan, and PscanChIP for the Discovery of Enriched Transcription Factor Binding Site Motifs in Nucleotide Sequences

Federico Zambelli¹, Graziano Pesole², Giulio Pavesi³

¹ Istituto di Biomembrane e Bioenergetica, Consiglio Nazionale delle Ricerche, Bari
² Dipartimento di Bioscienze, Biotecnologie e Biofarmaceutica, Università di Bari
³ Dipartimento di Bioscienze, Università di Milano
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 2.11
DOI: 10.1002/0471250953.bi0211s47
Online Posting Date: September 2014

One of the greatest challenges facing modern molecular biology is understanding the complex mechanisms that regulate gene expression. A fundamental step in this process is the characterization of the sequence motifs involved in regulating gene expression at the transcriptional and post‐transcriptional levels. In particular, transcription is modulated by the interaction of transcription factors (TFs) with their corresponding binding sites. Weeder, Pscan, and PscanChIP are software tools, freely available to noncommercial users as stand‐alone or Web‐based applications, for the automatic discovery of conserved motifs in a set of DNA sequences likely to be bound by the same TFs. Input for the tools can be promoter sequences from co‐expressed or co‐regulated genes (for which Weeder and Pscan are suitable), or regions identified through genome‐wide ChIP‐seq or similar experiments (for which Weeder and PscanChIP are suitable). The motifs are found either by a de novo approach (Weeder) or by using descriptors of the binding specificity of TFs (Pscan and PscanChIP). Curr. Protoc. Bioinform. 47:2.11.1‐2.11.31. © 2014 by John Wiley & Sons, Inc.

Keywords: transcription regulation; gene expression; transcription factor binding sites; chromatin immunoprecipitation; ChIP‐Seq

Table of Contents

  • Introduction
  • Basic Protocol 1: Finding Enriched TFBSs Using Weeder 2.0
  • Support Protocol 1: Obtaining and Installing Weeder
  • Basic Protocol 2: Finding Enriched TFBSs in Promoter Sequences Using Pscan
  • Basic Protocol 3: Finding Enriched TFBSs in ChIP‐seq Regions Using PscanChIP
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures

Literature Cited

  Crooks, G.E., Hon, G., Chandonia, J.M., and Brenner, S.E. 2004. WebLogo: A sequence logo generator. Genome Res. 14:1188‐1190.
  Do, C.B. and Batzoglou, S. 2008. What is the expectation maximization algorithm? Nat. Biotechnol. 26:897‐899.
  Fleming, J.D., Pavesi, G., Benatti, P., Imbriano, C., Mantovani, R., and Struhl, K. 2013. NF‐Y coassociates with FOS at promoters, enhancers, repetitive elements, and inactive chromatin regions, and is stereo‐positioned with growth‐controlling transcription factors. Genome Res. 23:1195‐1209.
  Huang, D.W., Sherman, B.T., and Lempicki, R.A. 2009. Systematic and integrative analysis of large gene lists using DAVID Bioinformatics Resources. Nat. Protoc. 4:44‐57.
  Kent, W.J., Hsu, F., Karolchik, D., Kuhn, R.M., Clawson, H., Trumbower, H., and Haussler, D. 2005. Exploring relationships and mining data with the UCSC Gene Sorter. Genome Res. 15:737‐741.
  Mahony, S. and Benos, P.V. 2007. STAMP: A web tool for exploring DNA‐binding motif similarities. Nucleic Acids Res. 35:W253‐W258.
  Mathelier, A., Zhao, X., Zhang, A.W., Parcy, F., Worsley‐Hunt, R., Arenillas, D.J., Buchman, S., Chen, C.Y., Chou, A., Ienasescu, H., Lim, J., Shyr, C., Tan, G., Zhou, M., Lenhard, B., Sandelin, A., and Wasserman, W.W. 2014. JASPAR 2014: An extensively expanded and updated open‐access database of transcription factor binding profiles. Nucleic Acids Res. 42:D142‐D147.
  Matys, V., Kel‐Margoulis, O.V., Fricke, E., Liebich, I., Land, S., Barre‐Dirrie, A., Reuter, I., Chekmenev, D., Krull, M., Hornischer, K., Voss, N., Stegmaier, P., Lewicki‐Potapov, B., Saxel, H., Kel, A.E., and Wingender, E. 2006. TRANSFAC and its module TRANSCompel: Transcriptional gene regulation in eukaryotes. Nucleic Acids Res. 34:D108‐D110.
  Pavesi, G., Mereghetti, P., Mauri, G., and Pesole, G. 2004. Weeder Web: Discovery of transcription factor binding sites in a set of sequences from co‐regulated genes. Nucleic Acids Res. 32:W199‐W203.
  Pepke, S., Wold, B., and Mortazavi, A. 2009. Computation for ChIP‐seq and RNA‐seq studies. Nat. Methods 6:S22‐S32.
  Schneider, T.D. and Stephens, R.M. 1990. Sequence logos: A new way to display consensus sequences. Nucleic Acids Res. 18:6097‐6100.
  Stormo, G.D. 2000. DNA binding sites: Representation and discovery. Bioinformatics 16:16‐23.
  Whitfield, M.L., Sherlock, G., Saldanha, A.J., Murray, J.I., Ball, C.A., Alexander, K.E., Matese, J.C., Perou, C.M., Hurt, M.M., Brown, P.O., and Botstein, D. 2002. Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol. Biol. Cell 13:1977‐2000.
  Zambelli, F., Pesole, G., and Pavesi, G. 2009. Pscan: Finding over‐represented transcription factor binding site motifs in sequences from co‐regulated or co‐expressed genes. Nucleic Acids Res. 37:W247‐W252.
  Zambelli, F., Pesole, G., and Pavesi, G. 2012a. Motif discovery and transcription factor binding sites before and after the next‐generation sequencing era. Brief. Bioinform. 14:225‐237.
  Zambelli, F., Prazzoli, G.M., Pesole, G., and Pavesi, G. 2012b. Cscan: Finding common regulators of a set of genes by using a collection of genome‐wide ChIP‐seq datasets. Nucleic Acids Res. 40:W510‐W515.
  Zambelli, F., Pesole, G., and Pavesi, G. 2013. PscanChIP: Finding over‐represented transcription factor‐binding site motifs and their correlations in sequences from ChIP‐Seq experiments. Nucleic Acids Res. 41:W535‐W543.

Internet Resources

  Web site for downloading Weeder and accessing the Pscan/PscanChIP Web interfaces.
  UCSC genome browser.
  The TRANSFAC database (commercial, subscription required).
  The JASPAR database.
  Web site for running the STAMP tool.