Sequence Databases: Integrated Information Retrieval and Data Submission

Jane M. Weisemann (1), Mark S. Boguski (1), B.F. Francis Ouellette (2)

(1) National Center for Biotechnology Information, Bethesda, Maryland; (2) Centre for Molecular Medicine and Therapeutics, University of British Columbia, Vancouver, Canada
Publication Name: Current Protocols in Molecular Biology
Unit Number: Unit 19.2
DOI: 10.1002/0471142727.mb1902s51
Online Posting Date: May 2001
GO TO THE FULL TEXT: PDF or HTML at Wiley Online Library

Abstract

This unit provides an overview of biomedical information resources, focusing on sequence data, structure information, and the associated literature. It also discusses how nucleotide sequence data are submitted to the databases. Specific resources covered include MEDLINE, GenBank, and Entrez.

     
 

Table of Contents

  • Introduction to Entrez
  • Data Submission: General Considerations
  • Submitting a Sequence to the Nucleotide Database
  • Submitting an Update or Correction to an Existing GenBank Entry
  • Submitting EST, STS, or GSS Data
  • Submitting High‐Throughput Genome Sequences (HTGS)
  • Conclusion
  • Figures
     
 


Internet Resources
  DDBJ
    e‐mail submissions: ddbjsub@ddbj.nig.ac.jp
    updates: ddbjupdt@ddbj.nig.ac.jp
    information: ddbj@ddbj.nig.ac.jp
    home page: http://www.ddbj.nig.ac.jp/
    WWW submissions: http://sakura.ddbj.nig.ac.jp/

  EMBL
    e‐mail submissions: datasubs@ebi.ac.uk
    updates: update@ebi.ac.uk
    information: datalib@ebi.ac.uk
    home page: http://www.ebi.ac.uk
    WWW submissions: http://www.ebi.ac.uk/Submissions/index.html
    WebIn: http://www.ebi.ac.uk/embl/Submission/webin.html

  GenBank
    e‐mail submissions: gb-sub@ncbi.nlm.nih.gov
    EST/GSS/STS: batch-sub@ncbi.nlm.nih.gov
    updates: update@ncbi.nlm.nih.gov
    information: info@ncbi.nlm.nih.gov
    home page: http://www.ncbi.nlm.nih.gov/
    WWW submissions: http://www.ncbi.nlm.nih.gov/Genbank/index.html
    BankIt: http://www.ncbi.nlm.nih.gov/BankIt/