Submitting a Sequence to GenBank

Wei‐Jen Chang¹, Kassandra E. Zaila¹, Thomas W. Coppola¹

¹ Hamilton College, Clinton, New York
Publication Name:  Current Protocols Essential Laboratory Techniques
Unit Number:  Unit 11.2
DOI:  10.1002/9780470089941.et1102s12
Online Posting Date:  May, 2016

In the post‐genomic era, a growing number of research projects generate molecular sequence data. How should these newly obtained DNA and protein sequences be analyzed, and how should they be prepared for submission to sequence databases? In this unit, we provide guidelines and a flowchart to help first‐time users process new sequence data using third‐party freeware programs, and we give a step‐by‐step demonstration of how to prepare a sequence file for submission to GenBank using the Sequin program. © 2016 by John Wiley & Sons, Inc.

Keywords: annotation; BankIt; bioedit; bioinformatics; BLAST; jalview; NGS; Sequin

Table of Contents

  • Overview and Principles
  • Strategic Planning
  • Basic Protocol 1: Submitting a Novel Sequence to GenBank
  • Basic Protocol 2: Updating an Existing GenBank Record
  • Commentary
  • Figures

Basic Protocol 1: Submitting a Novel Sequence to GenBank

  • Computer running Sequin (see Strategic Planning)

Literature Cited

  Baxevanis, A.D. 2004. An overview of gene identification: Approaches, strategies, and considerations. Curr. Protoc. Bioinform. 6:4.1.1‐4.1.9. doi: 10.1002/0471250953.bi0401s6.
  Benson, D.A., Clark, K., Karsch‐Mizrachi, I., Lipman, D.J., Ostell, J., and Sayers, E.W. 2015. GenBank. Nucleic Acids Res. 43:D30‐D35. doi: 10.1093/nar/gku1216.
  Hall, T.A. 1999. BioEdit: A user‐friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41:95‐98. doi: 10.1021/bk‐1999‐0734.ch008.
  Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J., and Higgins, D.G. 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23:2947‐2948. doi: 10.1093/bioinformatics/btm404.
  Maddison, W.P. and Maddison, D.R. 2015. Mesquite: A modular system for evolutionary analysis. Version 3.04.
  Marchler‐Bauer, A., Anderson, J.B., Derbyshire, M.K., DeWeese‐Scott, C., Gonzales, N.R., Gwadz, M., Hao, L., He, S., Hurwitz, D.I., Jackson, J.D., Ke, Z., Krylov, D., Lanczycki, C.J., Liebert, C.A., Liu, C., Lu, F., Lu, S., Marchler, G.H., Mullokandov, M., Song, J.S., Thanki, N., Yamashita, R.A., Yin, J.J., Zhang, D., and Bryant, S.H. 2007. CDD: A conserved domain database for interactive domain family analysis. Nucleic Acids Res. 35:D237‐D240. doi: 10.1093/nar/gkl951.
  Stover, N.A. and Cavalcanti, A.R. 2014. Using NCBI BLAST. Curr. Protoc. Essen. Lab. Tech. 11:11.1.1‐11.1.35.
  Touchman, J.W. 2009. DNA sequencing: An outsourcing guide. Curr. Protoc. Essen. Lab. Tech. 2:12.1.1‐12.1.19.
  Waterhouse, A.M., Procter, J.B., Martin, D.M., Clamp, M., and Barton, G.J. 2009. Jalview Version 2–a multiple sequence alignment editor and analysis workbench. Bioinformatics 25:1189‐1191. doi: 10.1093/bioinformatics/btp033.
  Williams, M., Bozyczko‐Coyne, D., Dorsey, B., and Larsen, S. 2008. Laboratory notebooks and data storage. Curr. Protoc. Essen. Lab. Tech. 00:A.2A.1‐A.2A.28.

Internet Resources

  Conserved domain database (CDD; Marchler‐Bauer et al., 2007). Search for conserved motifs in your protein sequences.
  National Center for Biotechnology Information (NCBI) homepage. Gateway to numerous useful resources, including GenBank, BLAST, and Entrez.
  Sequin quick guide. Get detailed information on how to use Sequin.
