An Introduction to Sequence Similarity (“Homology”) Searching

Gary D. Stormo

Washington University, School of Medicine, St. Louis, Missouri

Unit Number: Unit 3.1
DOI: 10.1002/0471250953.bi0301s27
Online Posting Date: September, 2009

Abstract

Homologous sequences usually have the same, or very similar, functions, so new sequences can be reliably assigned functions if homologous sequences with known functions can be identified. Homology is inferred based on sequence similarity, and many methods have been developed to identify sequences that have statistically significant similarity. This unit provides an overview of some of the basic issues in identifying similarity among sequences and points out other units in this chapter that describe specific programs that are useful for this task. Curr. Protoc. Bioinform. 27:3.1.1-3.1.7. © 2009 by John Wiley & Sons, Inc.

Keywords: sequence similarity; homology; dynamic programming; similarity-scoring matrices; sequence alignment; multiple alignment; sequence evolution


Table of Contents

  • An Introduction to Identifying Homologous Sequences
  • Optimal Sequence Alignments
  • Scoring Sequence Similarity
  • Fast Searching Methods
  • The Significance of an Alignment Score
  • Making and Using Multiple Sequence Alignments
  • Summary
  • Literature Cited
  • Figures

Figures

  • Figure 3.1.1
    Dynamic programming algorithm for optimum sequence alignment. The two sequences are written across the top and along the right side of the matrix. The score of each element is determined by the simple rules shown for the enlarged section and described by the equations below it. (The top row and left column have special rules as described in the text.) The score of the best global alignment is the element S(n,m) and the alignment with that score can be obtained by backtracking through the matrix, determining the path that generated the score at each element.
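
The procedure described in the caption (fill a score matrix, read the global score from S(n,m), then backtrack) can be sketched roughly as follows. This is a minimal illustration only, not the unit's own equations: the scoring values (match +1, mismatch -1, linear gap penalty -1), the function name global_alignment, and the treatment of the first row and column as accumulated gap penalties are all assumptions made for the example.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment by dynamic programming (illustrative sketch).

    Scoring values are assumed for the example, not taken from the unit.
    Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)
    # S[i][j] holds the best score for aligning a[:i] with b[:j].
    S = [[0] * (m + 1) for _ in range(n + 1)]

    # Assumed boundary rule: a prefix aligned against nothing costs one
    # gap penalty per residue.
    for i in range(1, n + 1):
        S[i][0] = i * gap
    for j in range(1, m + 1):
        S[0][j] = j * gap

    # Each interior element is the best of three moves:
    # diagonal (align two residues), up (gap in b), left (gap in a).
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + sub,
                          S[i - 1][j] + gap,
                          S[i][j - 1] + gap)

    # Backtrack from S[n][m], re-deriving which move produced each score.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if S[i][j] == S[i - 1][j - 1] + sub:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and S[i][j] == S[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1

    return S[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    score, row_a, row_b = global_alignment("ACGT", "AGT")
    print(score)
    print(row_a)
    print(row_b)
```

When several paths give the same score, this sketch prefers the diagonal move, then a gap in the second sequence; other tie-breaking rules yield different but equally scored alignments.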

Literature Cited

    Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.
    Eddy, S.R. 1996. Hidden Markov models. Curr. Opin. Struct. Biol. 6:361-365.
    Fitch, W. and Smith, T. 1983. Optimal sequence alignments. Proc. Natl. Acad. Sci. U.S.A. 80:1382-1386.
    Gribskov, M., McLachlan, A.D., and Eisenberg, D. 1987. Profile analysis: Detection of distantly related proteins. Proc. Natl. Acad. Sci. U.S.A. 84:4355-4358.
    Karlin, S. and Altschul, S.F. 1990. Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc. Natl. Acad. Sci. U.S.A. 87:2264-2268.
    Krogh, A., Brown, M., Mian, I.S., Sjölander, K., and Haussler, D. 1994. Hidden Markov models in computational biology. Applications to protein modeling. J. Mol. Biol. 235:1501-1531.
    Lipman, D.J., Altschul, S.F., and Kececioglu, J.D. 1989. A tool for multiple sequence alignment. Proc. Natl. Acad. Sci. U.S.A. 86:4412-4415.
    Smith, T.F. and Waterman, M.S. 1981. Identification of common molecular subsequences. J. Mol. Biol. 147:195-197.