---+ Analyzing RNA-Seq data for differential gene expression

---++ Materials and Software

   * Data files
      * RNA-Seq data in FASTQ format (Ex: dataset1.fastq, dataset2.fastq)
   * Genome sequence files
      * Genome sequence in FASTA format (Ex: REL606.fna)
      * Genome sequence gene annotations in GFF3 format (Ex: REL606.gff3)
   * Read mapper software
      * BWA
         * [[http://bio-bwa.sourceforge.net/][Download BWA]]
         * To build, just type "make" in the source code directory.
         * To install, move the executable "bwa" to somewhere in your $PATH, like $HOME/local/bin.
         * For usage see the [[http://bio-bwa.sourceforge.net/bwa.shtml][BWA manual]].
      * Bowtie2
         * [[http://bowtie-bio.sourceforge.net][Download Bowtie]]
         * To build, just type "make" in the source code directory.
         * Add this directory to your $PATH, or move the bowtie2 and bowtie2-* executables to a directory in your $PATH.
   * [[http://barricklab.org/breseq][breseq]]
   * R statistics package
      * [[http://cran.r-project.org/][Download and install R]]
      * Bioconductor R modules
         * library(edgeR)
         * library(DESeq)

---++ Commands

---+++ Align reads to reference genome

---++++ Using BWA

First, index your genome so BWA can map reads to it: %BR%
<code>$ bwa index REL606.fna</code> %BR%
Then, align each data set: %BR%
<code>$ bwa aln REL606.fna datasetX.fastq > datasetX.sai</code> %BR%
And convert to SAM format (assumes single-end data): %BR%
<code>$ bwa samse REL606.fna datasetX.sai datasetX.fastq > datasetX.sam</code> %BR%

---++++ Using bowtie2

First, index your genome so bowtie2 can map reads to it: %BR%
<code>$ bowtie2-build REL606.fna REL606</code> %BR%
Then, align each data set: %BR%
<code>$ bowtie2 -x REL606 -U datasetX.fastq --phred33 -S datasetX.sam</code> %BR%
Optionally, add the <code>--local</code> flag if your reads do not map end-to-end.
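
With several data sets, the per-data-set bowtie2 command above can be wrapped in a shell loop. The following is a minimal sketch and not part of the original protocol: the function names =run= and =align_all= and the =DRY_RUN= variable are hypothetical helpers. It assumes single-end FASTQ files whose names end in =.fastq=, a bowtie2 index already built with the prefix passed as the first argument, and that each data set should get its own SAM file named after the FASTQ file.

```shell
#!/bin/sh
# Sketch: run the bowtie2 alignment step once per FASTQ file.
# run, align_all, and DRY_RUN are illustrative names, not part of bowtie2.

# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: align_all INDEX_PREFIX FILE.fastq [FILE.fastq ...]
align_all() {
    index_prefix=$1        # prefix given to bowtie2-build, e.g. REL606
    shift
    for fastq in "$@"; do
        sample=${fastq%.fastq}   # dataset1.fastq -> dataset1
        run bowtie2 -x "$index_prefix" -U "$fastq" --phred33 -S "$sample.sam"
    done
}
```

For example, <code>DRY_RUN=1 align_all REL606 dataset1.fastq dataset2.fastq</code> prints the two bowtie2 commands without running them, which is a cheap way to check the file names before starting a long alignment. Dropping <code>DRY_RUN=1</code> runs the commands and writes dataset1.sam and dataset2.sam.
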

---+++ Count reads mapping to genes

<code>$ breseq RNASEQ -f REL606.fna -r REL606.gbk -o datasetX.count.tab datasetX.sam</code> %BR%

---+++ Convert alignments to BAM

Convert to BAM format (assumes single-end data): %BR%
<code>$ samtools faidx REL606.fna</code> %BR%
<code>$ samtools import REL606.fna datasetX.sam datasetX.unsorted.bam</code> %BR%
<code>$ samtools sort datasetX.unsorted.bam datasetX</code> %BR%
<code>$ samtools index datasetX.bam</code> %BR%
You can now view the sorted, indexed alignments in IGV.

---+++ Analyze differential gene expression

<code>library(DESeq)</code>
Topic revision: r3 - 2012-01-30 - JeffreyBarrick