Searching for Non‐B DNA‐Forming Motifs Using nBMST (Non‐B DNA Motif Search Tool)

R.Z. Cer1, K.H. Bruce1, D.E. Donohue1, N.A. Temiz1, U.S. Mudunuri1, M. Yi1, N. Volfovsky1, A. Bacolla2, B.T. Luke1, J.R. Collins1, R.M. Stephens1

1 Advanced Biomedical Computing Center, Information Systems Program, SAIC‐Frederick, Inc., National Cancer Institute‐Frederick, Frederick, Maryland, 2 The Dell Pediatric Research Institute, Division of Pharmacology and Toxicology, The University of Texas at Austin, Austin, Texas
Publication Name:  Current Protocols in Human Genetics
Unit Number:  Unit 18.7
DOI:  10.1002/0471142905.hg1807s73
Online Posting Date:  April, 2012
This unit describes basic protocols for using the non‐B DNA Motif Search Tool (nBMST) to search for sequence motifs predicted to form alternative DNA conformations that differ from the canonical right‐handed Watson‐Crick double helix, collectively known as non‐B DNA, and for using the associated PolyBrowse, a GBrowse‐based genomic browser. The nBMST is a Web‐based resource that allows users to submit one or more DNA sequences to search for inverted repeats (cruciform DNA), mirror repeats (triplex DNA), direct/tandem repeats (slipped/hairpin structures), G4 motifs (tetraplex, G‐quadruplex DNA), alternating purine‐pyrimidine tracts (left‐handed Z‐DNA), and A‐phased repeats (static bending). The nBMST is versatile, simple to use, does not require bioinformatics skills, and can be applied to any type of DNA sequence, including viral and bacterial genomes, up to an aggregate of 20 megabasepairs (Mbp). Curr. Protoc. Hum. Genet. 73:18.7.1‐18.7.22. © 2012 by John Wiley & Sons, Inc.

Keywords: nBMST; non‐B DNA; nucleotide sequence analysis; G‐quadruplex; triplex; cruciform; Z‐DNA; hairpin; slipped DNA; alternative DNA structure; tandem repeats; PolyBrowse

Table of Contents

  • Introduction
  • Basic Protocol 1: Using the nBMST Server
  • Basic Protocol 2: Using the PolyBrowse Viewer
  • Commentary
  • Literature Cited
  • Figures
  • Tables
Basic Protocol 1: Using the nBMST Server

  • Computer with Internet access
  • Up‐to‐date Web browser, such as Firefox (Windows, Mac OS X, and Linux); Safari (Windows, Mac OS X); or Internet Explorer (Windows)
  • A text file up to 20 Mb with one or more DNA sequences in FASTA format. Each FASTA record begins with a description line: a greater than sign (>) followed immediately, without any spaces, by a description. The DNA sequence follows on one or more new lines. Sequences may contain only the letters A, C, G, T, or N; uppercase letters, lowercase letters, and spaces are all allowed. If the file holds more than one DNA sequence, each sequence must be separated from the previous one by its own description line. Below are two examples of FASTA sequences. Only short sequences are shown for simplicity:
    >seq1
    >seq2
    gggtgggttgggtgggg
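
The FASTA requirements listed above can be checked locally before a file is uploaded. The following is a minimal sketch in Python, not part of nBMST itself: the function names `parse_fasta` and `check_submission` are hypothetical, the 20,000,000‐base ceiling is an assumption that reads the 20 Mbp aggregate limit as a count of bases summed over all sequences, and converting sequences to uppercase is a convenience of this sketch rather than documented server behavior.

```python
# Hypothetical pre-submission check; nBMST itself is a Web tool and
# does not expose these functions.
MAX_TOTAL_BASES = 20_000_000   # assumption: 20 Mbp aggregate, counted in bases
ALLOWED = set("ACGTN")         # letters permitted by the protocol


def parse_fasta(text):
    """Return a list of (description, sequence) pairs.

    Raises ValueError if sequence data appears before a description
    line or if a sequence contains letters other than A, C, G, T, N.
    """
    records = []
    header = None
    chunks = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new description line closes the previous record.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        else:
            if header is None:
                raise ValueError(
                    f"line {lineno}: sequence data before any '>' line")
            # Spaces are allowed, so drop them; case is normalized.
            seq = "".join(line.split()).upper()
            bad = set(seq) - ALLOWED
            if bad:
                raise ValueError(
                    f"line {lineno}: unexpected characters {sorted(bad)}")
            chunks.append(seq)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_submission(text):
    """Validate a FASTA string and return the total number of bases."""
    records = parse_fasta(text)
    if not records:
        raise ValueError("no FASTA records found")
    empty = [desc for desc, seq in records if not seq]
    if empty:
        raise ValueError(f"records without sequence: {empty}")
    total = sum(len(seq) for _, seq in records)
    if total > MAX_TOTAL_BASES:
        raise ValueError(f"{total} bases exceeds the assumed 20 Mbp limit")
    return total
```

For example, `check_submission(open("my_sequences.fa").read())` (with a hypothetical file name) either returns the total base count or raises a `ValueError` naming the offending line, which makes it easier to correct a file before it is submitted to the server.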

Basic Protocol 2: Using the PolyBrowse Viewer

  • Computer with Internet access
  • An up‐to‐date Web browser, such as Firefox (Windows, Mac OS X, and Linux); Safari (Windows, Mac OS X); or Internet Explorer (Windows)
Internet Resources
  Non‐B DB, a database resource for integrated annotations and analysis of non‐B DNA‐forming motifs.‐bin/gb2/gbrowse/Human_37/
  PolyBrowse, ABCC genome browser for variations and annotations.
  Tandem Repeats Finder.
  QuadFinder to find quadruplex DNA.
  QuadBase, a database of quadruplex motifs.
  Greglist, a database of G‐quadruplex regulated genes.
  GRSDB, a database of G‐Rich sequences.
  Quadruplex forming G‐Rich Sequences (QGRS) Mapper.
  Inverted Repeat Finder, a command line version of the IRF algorithm used to investigate inverted repeat structure of the human genome.
  Triplex Target DNA Site (TTS) Mapping.
  Triplex‐Forming Oligonucleotide Target Sequence Search program.
  The Tracts program to detect and analyze binary tracts in a DNA sequence.
  Z‐Hunt tool to find Z‐DNA.
