
Read trimming with trimmomatic

Download and Install

Download the "binary" version of trimmomatic from the Usadel lab.

It's helpful to create a shortcut so that you can call the command by typing trimmomatic rather than remembering the more complicated java command line every time. You can do this by adding a line like this to your bash profile:

alias trimmomatic='java -jar /Users/name/local/bin/trimmomatic-0.XX.jar'

Replace the path with the path to the *.jar file and the XX with the version number.
The command line does not provide very detailed help, so keep the Manual (PDF) handy!

Notes on Choosing or Creating an Adaptor File

One of the main advantages of using trimmomatic is that it deals well with collapsing and trimming paired-end sequencing of fragments that are shorter than the read length, where the read extends into the adaptors at the end of each read. In this PE mode, trimmomatic will collapse these sequences to a single (and now unpaired) read, so that it will not be counted as sequencing of two independent DNA molecules. This is important when analyzing sequences isolated from a mixed population.

The manual doesn't make it clear which adaptor sequences need to be included to get this trimming to work properly. So let's assume you have constructed a library with this format:

Adaptor1-Forward‑Sequenced-DNA‑Adaptor2-Forward
Adaptor1-Complement‑Read_R1‑Adaptor2-Complement

If sequencing reads through the insert into the adaptor, then your paired reads (R1 and R2) will look like this:

R1: Sequenced-DNA‑Adaptor2-Forward
R2: Adaptor2(Reverse Complement) - Sequenced-DNA (Reverse Complement)

In this case, your adaptor file needs to include the lines:

>Prefix_Name/1
Adaptor1-Forward
>Prefix_Name/2
Adaptor2-Forward
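As a rough sketch of how such an adaptor file might be put to use, the snippet below writes a two-entry FASTA file and assembles a paired-end trimmomatic command line. The adaptor sequences are placeholders (substitute your own Adaptor1-Forward and Adaptor2-Forward), the file names are hypothetical, and the ILLUMINACLIP thresholds 2:30:10 are only commonly used example values, not a recommendation from this page. The command is printed rather than run so that you can review it first.

```shell
# Write a two-entry adaptor FASTA file.
# The sequences below are placeholders, NOT real adaptor sequences.
cat > adaptors.fa <<'EOF'
>Prefix_Name/1
ACGTACGTACGTACGTACGT
>Prefix_Name/2
TGCATGCATGCATGCATGCA
EOF

# Assemble a paired-end command. Input and output names are hypothetical.
# Output order in PE mode: paired R1, unpaired R1, paired R2, unpaired R2.
cmd="java -jar trimmomatic-0.XX.jar PE sample_R1.fastq sample_R2.fastq \
sample_P1.fastq sample_U1.fastq sample_P2.fastq sample_U2.fastq \
ILLUMINACLIP:adaptors.fa:2:30:10"

# Print the command for review instead of running it.
echo "$cmd"
```

The entry names follow the pattern above: both start with the same prefix and end in /1 and /2, which is how trimmomatic recognizes a pair of adaptors meant for its paired-end ("palindrome") clipping.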
Note: It is OK to put ambiguous bases (N) in the adaptor sequences if you have a barcode in your adaptors.

Using with gdtools

If you are using GenomeDiff metadata files for your sequencing samples (highly recommended), then you can use the gdtools RUNFILE command to generate your trimming commands.

For unpaired sequencing reads, you should use gdtools RUNFILE -m trimmomatic.

For paired-end reads, the commands generated in that mode would trim each read independently, creating trimmed R1 and R2 sequence files that no longer have the same number of sequences in them, and would not collapse R1 and R2 when they overlap one another due to a short fragment size. This would make the output unsuitable for paired-end mapping. Instead, you should generally use the gdtools RUNFILE -m trimmomatic-PE-unique command to generate your trimming commands for paired-end DNAseq or RNAseq data. Using that command will deal properly with overlapping reads. You must additionally supply the -p option to preserve read pairing in the output files. This will create 4 output files (P1, P2, U1, U2) rather than 2 output files (R1, R2) of trimmed reads.

gdtools RUNFILE also recognizes the TRIM-START-BASES and TRIM-END-BASES metadata lines in your GenomeDiff to add the trimmomatic options to remove a fixed number of bases from each end of each read.
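One way to confirm that read pairing survived trimming is to compare the number of reads in the two paired output files. The sketch below assumes uncompressed FASTQ files with exactly four lines per read; the function names count_reads and same_read_count are made up for illustration and are not part of gdtools or trimmomatic.

```shell
# Count the reads in an uncompressed FASTQ file (assumes 4 lines per read).
count_reads() {
  echo $(( $(wc -l < "$1") / 4 ))
}

# Succeed only if both files contain the same number of reads.
same_read_count() {
  [ "$(count_reads "$1")" -eq "$(count_reads "$2")" ]
}
```

For example, with hypothetical output names, `same_read_count sample_P1.fastq sample_P2.fastq && echo "pairing preserved"` prints the message only when the two files match. Equal counts are a necessary check, not proof that every read still sits next to its mate.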