
Using TESS to Predict Transcription Factor Binding Sites in DNA Sequence

Jonathan Schug¹

¹University of Pennsylvania, Philadelphia, Pennsylvania

Unit Number: Unit 2.6
DOI: 10.1002/0471250953.bi0206s21
Online Posting Date: March 2008
GO TO THE FULL TEXT:
PDF or HTML at Wiley Online Library

Abstract

This unit describes how to use the Transcription Element Search System (TESS), a Web site that predicts transcription factor binding sites (TFBS) in DNA sequence using two kinds of site models: strings and positional weight matrices. The binding of transcription factors to DNA is a major part of the control of gene expression. Transcription factors exhibit sequence-specific binding; they bind more strongly to some DNA sequences than to others. Identifying a good binding site in the promoter of a gene suggests that the corresponding factor may play a role in regulating that gene. However, the sequences that transcription factors recognize are typically short and tolerate some mismatch. As a result, apparent binding sites for a factor typically occur by chance every few hundred to a thousand base pairs. TESS has features to help sort through predicted sites and evaluate their significance. Curr. Protoc. Bioinform. 21:2.6.1-2.6.15. © 2008 by John Wiley & Sons, Inc.

Keywords: transcription factor; DNA sequence; genome; promoter; gene regulation
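
The abstract's remark that sites turn up by chance every few hundred to a thousand base pairs can be checked with a back-of-the-envelope calculation. The sketch below is not part of TESS. It assumes a fixed string model, uniform and independent base frequencies (0.25 each), and a tolerance of a fixed number of mismatches. The function names are hypothetical.

```python
from math import comb


def match_probability(site_length: int, max_mismatches: int) -> float:
    """Chance that a random position matches a fixed string of the given
    length with at most `max_mismatches` mismatches.

    Assumption: bases are independent and uniformly distributed.
    """
    return sum(
        comb(site_length, k) * 0.75 ** k * 0.25 ** (site_length - k)
        for k in range(max_mismatches + 1)
    )


def expected_spacing(site_length: int, max_mismatches: int,
                     both_strands: bool = True) -> float:
    """Average number of base pairs between chance matches."""
    p = match_probability(site_length, max_mismatches)
    if both_strands:
        # Approximation: doubles the rate and ignores palindromic sites,
        # which would match both strands at the same position.
        p *= 2
    return 1.0 / p


if __name__ == "__main__":
    print(expected_spacing(6, 1, both_strands=False))
    print(expected_spacing(8, 1))
```

Under these assumptions, a 6-bp site with one allowed mismatch is expected about once every 216 bp on a single strand, and an 8-bp site with one allowed mismatch about once every 1,311 bp when both strands are searched. Both figures are consistent with the range quoted in the abstract.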

     
 

Table of Contents

  • Basic Protocol: Predicting Transcription Factor Binding Sites
  • Guidelines for Understanding Results
  • Commentary
  • Bibliography
  • Figures
  • Tables
     
 

Figures

  • Figure 2.6.1
    Sample of TESS site search results. The windows show (clockwise from upper left) a tabulation of predicted sites with scores and p-values, annotated sequences, and details of a weight matrix model for a predicted site.

  • Figure 2.6.2
    The title, navigation bar, and disclaimers from the TESS home page. The line of links (Home | Site Searches ) is the primary navigation bar. When these links are clicked, different links appear in the secondary navigation bar, which is the lower line (About | TESS ).

  • Figure 2.6.3
    The section of the job submission form for entering the minimal parameters for a TESS search. The circled “i” icons at the right edge of the form are links for help for each parameter.

  • Figure 2.6.4
    The section of the form for selecting which databases are included in the search and for filtering which factors are included in the search. The number buttons, 0 to 5, are short cuts to queries with the corresponding number of search terms.

  • Figure 2.6.5
    The Factor Filters expanded to select only mammalian factors. This search is case sensitive but does not require a complete match.

  • Figure 2.6.6
    The String Scoring section of the form showing the default parameters (see text for more detail).

  • Figure 2.6.7
    The Matrix Scoring and Output Control sections of the job submission form. The default is to use log-likelihood scoring and not to perform Poisson significance thresholding.

  • Figure 2.6.8
    The Expert Parameters section for adjusting the matrix smoothing, background models, and ambiguous query base handling.

  • Figure 2.6.9
    The page that appears when a job has been successfully submitted. Note the job number for future reference. Click on the URL to retrieve the results.

  • Figure 2.6.10
    This is the central results page. The top shows the results links table. The bottom is the top portion of a table that summarizes the search parameters.

  • Figure 2.6.11
    A sample of a Tabular Results page. Use the top section to navigate through the pages of tabular results. There are paging buttons and direct sort-column-sensitive landmark links. Use the middle section to control which columns appear in the table. Check the columns you want to see, then click Select. The bottom section is the table of predicted binding sites. Click on a column header to sort the table on that column.

  • Figure 2.6.12
    A sample of the Annotated Sequence results page. Sites with Ld scores better than the Secondary Log-Likelihood Deficit are indicated with double bars. Blue bars indicate hits in the forward sense; red bars indicate hits in the reverse sense.

  • Figure 2.6.13
    A sample of the Poisson Significance results page. This table lists the p-values for the number of hits observed for each model. The number of hits (N) and the estimated expected rate of random occurrence (Rate) are used to calculate the p-value. The actual threshold used, which takes into account both ta and td, is indicated at the right.
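
The captions above mention log-likelihood matrix scoring, hits in the forward and reverse sense, and a Poisson p-value computed from a hit count and an expected rate. The sketch below illustrates those three ideas in general terms. It is not TESS's implementation. The pseudocount smoothing, the uniform background, the score threshold, and all function names are assumptions made for illustration, and the expected count is taken as given rather than derived the way TESS derives its Rate.

```python
from math import exp, log2

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2 likelihood ratios.

    `counts` is a list of dicts (base -> count), one per site position.
    Assumption: simple additive pseudocount and uniform background.
    """
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        matrix.append({
            b: log2((column.get(b, 0) + pseudocount) / total / background)
            for b in BASES
        })
    return matrix


def scan(seq, matrix, threshold):
    """Return (forward_start, strand, score) for windows scoring >= threshold."""
    width = len(matrix)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - width + 1):
            window = s[i:i + width]
            if any(b not in BASES for b in window):
                continue  # skip windows containing ambiguous bases
            score = sum(col[b] for col, b in zip(matrix, window))
            if score >= threshold:
                # Report reverse-strand hits in forward coordinates.
                start = i if strand == "+" else len(seq) - width - i
                hits.append((start, strand, score))
    return hits


def poisson_p_value(n_hits: int, expected: float) -> float:
    """P(X >= n_hits) for X ~ Poisson(expected)."""
    if n_hits <= 0:
        return 1.0
    term = exp(-expected)
    cdf = term
    for k in range(1, n_hits):
        term *= expected / k
        cdf += term
    return max(0.0, 1.0 - cdf)


if __name__ == "__main__":
    # Hypothetical count matrix whose consensus is GATA.
    counts = [{"G": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
    matrix = log_odds_matrix(counts)
    hits = scan("CCGATACCTATCCC", matrix, threshold=6.0)
    print(hits)
    print(poisson_p_value(len(hits), expected=0.05))
```

A small p-value means the model matched more often than chance alone would predict under the assumed rate. How the expected count is estimated in practice, for example from the background model and sequence length, is a separate choice that this sketch leaves open.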

Literature Cited

    Berg, O.G. 1990. Base-pair specificity of protein-DNA recognition: A statistical-mechanical model. Biomed. Biochim. Acta 49: 963-975.
    Chen, Q.K., Hertz, G.Z., and Stormo, G.D. 1997. PromFD 1.0: A computer program that predicts eukaryotic pol II promoters using strings and IMD matrices. Comput. Appl. Biosci. 13: 29-35.
    Day, W.H. and McMorris, F.R. 1992. Critical comparison of consensus methods for molecular sequences. Nucleic Acids Res. 20: 1093-1099.
    Fitzwater, T. and Polisky, B. 1996. A SELEX primer. Methods Enzymol. 267: 275-301.
    Kent, W.J., Sugnet, C.W., Furey, T.S., Roskin, K.M., Pringle, T.H., Zahler, A.M., and Haussler, D. 2002. The Human Genome Browser at UCSC. Genome Res. 12: 996-1006.
    Loots, G.G., Ovcharenko, I., Pachter, L., Rubin, E., and Dubchak, I. 2002. rVISTA: A high throughput comparative approach to identifying eukaryotic transcriptional regulatory elements in noncoding genomic sequences. Genome Res. 12: 832-839.
    Quandt, K., Frech, K., Karas, H., Wingender, E., and Werner, T. 1995. MatInd and MatInspector: New, fast, and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Res. 23: 4878-4884.
    Schug, J. and Overton, G.C. 1997. Modeling transcription factor binding sites with Gibbs sampling and the minimum description length encoding. Proc. Int. Conf. Intell. Syst. Mol. Biol. 5: 268-271.
    Schwartz, S., Zhang, Z., Frazer, K.A., Smit, A., Riemer, C., Bouck, J., Gibbs, R., Hardison, R., and Miller, W. 2000. PipMaker: A Web server for aligning two genomic DNA sequences. Genome Res. 10: 577-586.
    Stein, L.D., Mungall, C., Shu, S., Caudy, M., Mangone, M., Day, A., Nickerson, E., Stajich, J.E., Harris, T.W., Arva, A., and Lewis, S. 2002. The generic genome browser: A building block for a model organism system database. Genome Res. 12: 1599-1610.
    Vlieghe, D., Sandelin, A., De Bleser, P.J., Vleminckx, K., Wasserman, W.W., van Roy, F., and Lenhard, B. 2006. A new generation of JASPAR, the open-access repository for transcription factor binding site profiles. Nucleic Acids Res. 34: D95-D97.
    Wingender, E., Chen, X., Fricke, E., Geffers, R., Hehl, R., Liebich, I., Krull, M., Matys, V., Michael, H., Ohnhauser, R., Pruss, M., Schacherer, F., Thiele, S., and Urbach, S. 2001. The TRANSFAC system on gene expression regulation. Nucleic Acids Res. 29: 281-283.
 Internet Resources
    http://www.pcbi.upenn.edu/tess
       The TESS Web site.
    http://www.biobase.de
       Web site for the company that now maintains TRANSFAC.

     
 