Obtaining Comparative Genomic Data with the VISTA Family of Computational Tools

Igor Ratnere (1), Inna Dubchak (1)

(1) Lawrence Berkeley National Laboratory, Berkeley, California
Publication Name: Current Protocols in Bioinformatics
Unit Number: Unit 10.6
DOI: 10.1002/0471250953.bi1006s26
Online Posting Date: June 2009
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

Comparison of DNA sequences from different species is a fundamental method for identifying functional elements, such as exons or enhancers, as they tend to exhibit significant sequence similarity due to purifying selection. Availability of whole‐genome sequences for a constantly growing number of organisms makes identification of such elements within these genomes possible. There are two distinct phases in comparisons of genomic sequences: in the first, the sequences are aligned, and in the second, the resulting alignments are analyzed to find conservation signals that may be indicative of functional regions. Due to the considerable length of alignments, good visual representation techniques are a necessity for effective isolation of regions of interest. The VISTA family of tools provides biomedical investigators with a unified framework for the alignment of long genomic sequences and whole‐genome assemblies, interactive visual analysis of alignments along with functional annotation, and many other comparative genomics capabilities. Curr. Protoc. Bioinform. 26:10.6.1‐10.6.17. © 2009 by John Wiley & Sons, Inc.

Keywords: comparative genomics; DNA alignment; VISTA; genome browser
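
The second phase described in the abstract, scanning an existing alignment for conservation signals, can be illustrated with a minimal sketch. It assumes a pairwise alignment is already available as two equal-length strings with "-" for gaps. The function names, the window size, and the identity threshold are illustrative assumptions and are not taken from the VISTA tools or their defaults.

```python
from typing import List, Tuple


def window_identity(seq_a: str, seq_b: str, window: int) -> List[float]:
    """Percent identity of every full window along a gapped pairwise alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or window > len(seq_a):
        raise ValueError("window must be between 1 and the alignment length")
    # 1 where both rows hold the same base; a gap never counts as a match.
    matches = [int(a == b and a != "-")
               for a, b in zip(seq_a.upper(), seq_b.upper())]
    running = sum(matches[:window])
    scores = [100.0 * running / window]
    # Slide the window one column at a time, updating the match count.
    for i in range(window, len(matches)):
        running += matches[i] - matches[i - window]
        scores.append(100.0 * running / window)
    return scores


def conserved_regions(scores: List[float], window: int,
                      threshold: float) -> List[Tuple[int, int]]:
    """Merge overlapping or abutting windows scoring >= threshold.

    Returns half-open (start, end) intervals in alignment columns.
    """
    regions: List[Tuple[int, int]] = []
    for start, score in enumerate(scores):
        if score < threshold:
            continue
        end = start + window
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Hypothetical toy alignment, not real genomic data.
    a = "ACGTACGTAAAA"
    b = "ACGTACGTCC-G"
    scores = window_identity(a, b, window=4)
    print(conserved_regions(scores, window=4, threshold=100.0))
```

The intervals are reported in alignment columns. Mapping them back to genomic coordinates would require subtracting the gap columns of the chosen base sequence, which this sketch leaves out.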

     
 

Table of Contents

  • Introduction
  • Basic Protocol 1: Analyzing Comparative Genomic Data with the VISTA Browser
  • Basic Protocol 2: Browsing the Alignment and Retrieving SNP Information Using Base‐Pair Level Alignment Panel
  • Basic Protocol 3: Obtaining Detailed Comparative Data Including Genomic Coordinates and Parameters of Conserved Regions with the Text Browser
  • Basic Protocol 4: Finding Candidate Orthologous Regions on a Base Genome Using the GenomeVISTA Server
  • Basic Protocol 5: Finding Putative Transcription Factor Binding Sites (TFBS) Using rVISTA Server
  • Basic Protocol 6: Finding Experimentally Verified Enhancers Using the VISTA Enhancer Browser
  • Understanding Results
  • Commentary
  • Literature Cited
  • Figures
  • Tables
     
 
