Publicly Archiving Data

These public repositories issue accession numbers for data that may be difficult to provide as supplementary information for a research report. An advantage of submitting to them is that your data will be archived in standard formats that others can use more easily.

Submitting Sequences to GenBank

For a few sequences, the easiest way to submit is the BankIt web submission tool.

The Geneious submission tool does not properly format GenBank submissions as of v6.06 (@JEB).

Submitting Sequencing Reads to the SRA

  • SRA main page
  • SRA login page
  • NCBI online SRA manual

Examples of completed metadata spreadsheets: SRA_metadata.xlsx and SRA_submission_Microbe.1.0.xlsx (see the topic attachments).

Notes:

  • You cannot change aliases in the normal upload area, so be careful to enter them correctly the first time!
  • It is easiest to upload uncompressed FASTQ files.
  • Illumina uploads must be in Illumina 1.5+ FASTQ format, not converted to Sanger FASTQ format.
  • The flow cell number and lane are encoded in the name of every read in the FASTQ.
  • Paired-end or mate-paired FASTQ files must be interleaved: a single file that alternates corresponding first and second reads, rather than one file with all of the first reads and another with all of the second reads. The script interleave_paired_fastq.pl can construct the interleaved file.
  • The script estimate_insert_length.sh can be used to estimate the fragment size of a paired library, which you need to fill in the insert-size fields of the metadata.
  • Use the md5sum command to calculate the MD5 checksum for FASTQ files.
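
The interleaving and checksum steps in the notes above can be sketched in Python as follows. This is only an illustration, not a port of interleave_paired_fastq.pl: the function names are hypothetical, and it assumes uncompressed FASTQ files with exactly four lines per read and the two input files listing reads in the same order.

```python
import hashlib
from itertools import zip_longest


def read_fastq_records(handle):
    """Yield FASTQ records as lists of four lines (assumes 4-line records)."""
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[0]:
            return  # clean end of file
        if not record[3]:
            raise ValueError("truncated FASTQ record")
        # Make sure every line ends with a newline, even the last one in the file.
        yield [line if line.endswith("\n") else line + "\n" for line in record]


def interleave_fastq(first_path, second_path, out_path):
    """Write read 1, read 2, read 1, read 2, ... and return the number of pairs."""
    pairs = 0
    with open(first_path) as r1, open(second_path) as r2, open(out_path, "w") as out:
        for rec1, rec2 in zip_longest(read_fastq_records(r1), read_fastq_records(r2)):
            if rec1 is None or rec2 is None:
                raise ValueError("input files contain different numbers of reads")
            out.writelines(rec1)
            out.writelines(rec2)
            pairs += 1
    return pairs


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, interleave_fastq("reads_1.fastq", "reads_2.fastq", "reads_interleaved.fastq") followed by md5_of_file("reads_interleaved.fastq") yields the same hexadecimal digest that md5sum prints for that file (the file names here are placeholders).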

Submitting Transcriptomics Data (Differential Gene Expression)

NCBI online GEO manual and submission link

  • Do not create an SRA entry for the FASTQ files. This is handled within GEO!

Dryad

Dryad is especially good for submitting large data tables and analysis scripts (e.g., in R).

Topic attachments
  • SRA_metadata.xlsx: Microsoft Excel spreadsheet, r1, 50.7 K, 2020-07-03 01:51, JeffreyBarrick
  • SRA_submission_Microbe.1.0.xlsx: Microsoft Excel spreadsheet, r1, 18.5 K, 2020-07-03 01:51, JeffreyBarrick
  • estimate_insert_length.sh: Unix shell script, r1, 0.4 K, 2013-02-17 17:25, JeffreyBarrick
  • interleave_paired_fastq.pl.txt: text file, r1, 0.6 K, 2014-12-08 23:06, JeffreyBarrick

Contributors to this topic: JeffreyBarrick
Topic revision: r6 - 2020-07-03 - 01:56:28 - Main.JeffreyBarrick
Lab.PubliclyArchivingData moved from Lab.ProtocolsUploadingDataToSRA on 2013-02-14 - 14:28 by Main.JeffreyBarrick -
 