Using Flexbar program to remove adapter sequences from NGS reads
Installing Flexbar (notes specific to TACC, can be updated for other systems)
- Go to the Flexbar home page and select the newest version (2.31 as of 1-16-13).
- Right click *_linux64.tgz and select 'copy link location'.
- Log onto TACC
- cd $WORK/src
- wget "paste link location"
- tar xvzf flexbar*.tgz
- cd "new folder"
- cp flexbar $HOME/local/bin
- vi $HOME/.profile_user
- Add the following if not already present:
- export PATH=$HOME/local/bin:$PATH
- export LD_LIBRARY_PATH=$WORK/src/flexbar_v2.31_linux64:$LD_LIBRARY_PATH
- Optionally, move flexbar to any location in your $PATH, and move libtbb.so.2 to any location in your $LD_LIBRARY_PATH.
- logout
- Log back onto TACC
- flexbar -h
- If the help manual appears, flexbar should be ready to use. If you get an error message, see the notes below and check that $PATH and $LD_LIBRARY_PATH include the locations of the relevant files.
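The final check in the steps above can be sketched as a small shell function. This is only a sketch and not part of Flexbar: the function name `check_flexbar_env` is hypothetical, and it assumes the shared library is named libtbb.so.2 as in the notes above. It only looks in the directories listed in $LD_LIBRARY_PATH, so a library installed in a system-wide location would not be detected.

```shell
# Hypothetical helper: report whether flexbar and libtbb.so.2 can be found.
# Prints one status line per item and returns 0 only if both are found.
check_flexbar_env() {
    local status=0 found_lib=no dir

    # Is a flexbar executable somewhere on $PATH?
    if command -v flexbar >/dev/null 2>&1; then
        echo "flexbar: found"
    else
        echo "flexbar: NOT found in PATH"
        status=1
    fi

    # Is libtbb.so.2 in any directory listed in $LD_LIBRARY_PATH?
    local IFS=':'
    for dir in $LD_LIBRARY_PATH; do
        if [ -n "$dir" ] && [ -e "$dir/libtbb.so.2" ]; then
            found_lib=yes
            break
        fi
    done

    if [ "$found_lib" = yes ]; then
        echo "libtbb.so.2: found"
    else
        echo "libtbb.so.2: NOT found in LD_LIBRARY_PATH"
        status=1
    fi
    return "$status"
}
```

After logging back in, running `check_flexbar_env` before `flexbar -h` shows which of the two locations still needs to be fixed.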
Notes for TACC
- Lonestar does not allow use of the intel/11.1 compiler with flexbar. Using it results in 2 error messages:
- flexbar: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.11' not found (required by flexbar)
- flexbar: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.9' not found (required by flexbar)
- Fix by adding the following command to the .profile file located in $HOME:
Command line usage for removal of adapter sequences
Generic conservative command. Replace each placeholder between "" with the appropriate name, and delete the "" marks:
flexbar -t "New_file_name" -r "read_1_file_name" -p "read_2_file_name" -f fastq -a "fasta_file_of_adapter_sequences" -ao 1
Example command:
flexbar -t DED81 -r 02_Downloads/Sample_DED81_L004_R1.cat.fastq -p 02_Downloads/Sample_DED81_L004_R2.cat.fastq -f fastq -a 02_trimmed_Downloads/adapter_seq.fasta -ao 1
For a less conservative command, remove -ao 1
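The generic command above can also be assembled by a small helper, which is handy for checking a command before running it. This is a minimal sketch: the function name `flexbar_cmd` and the `MIN_OVERLAP` variable are hypothetical, it uses only the flags shown on this page, and it assumes file names contain no spaces. It prints the command and does not run flexbar.

```shell
# Hypothetical helper: print (do not run) a flexbar command.
# Arguments: target name, adapter fasta file, Read1 file, optional Read2 file.
# MIN_OVERLAP (assumed variable name) sets -ao; it defaults to 1, and
# setting it to an empty string drops -ao for the less conservative form.
flexbar_cmd() {
    local target="$1" adapters="$2" read1="$3" read2="${4:-}"
    local cmd="flexbar -t $target -r $read1"

    # -p is added only when a Read2 file is supplied
    if [ -n "$read2" ]; then
        cmd="$cmd -p $read2"
    fi

    cmd="$cmd -f fastq -a $adapters"

    # Unset MIN_OVERLAP means 1; an explicitly empty value omits the flag
    if [ -n "${MIN_OVERLAP-1}" ]; then
        cmd="$cmd -ao ${MIN_OVERLAP-1}"
    fi
    echo "$cmd"
}
```

Called with the names from the example command, `flexbar_cmd DED81 02_trimmed_Downloads/adapter_seq.fasta 02_Downloads/Sample_DED81_L004_R1.cat.fastq 02_Downloads/Sample_DED81_L004_R2.cat.fastq` prints that same command line, which can then be pasted or run once it looks right.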
Flag explanations
| Flag | Text to follow | What flag means | Reason |
| -t | New_file_name | Name of output file. | Dictate what your output file is to be named. Suggest something different than input to avoid overwriting untrimmed. |
| -r | R1_source_file_name | Name of Read1 sequencing file. | File to remove adapters from. |
| -p | R2_source_file_name | Name of Read2 sequencing file. (Optional: can do each file separately.) | File to remove adapters from. |
| -f | Format | Format of reads. | Most commonly will be fasta or fastq. |
| -a | Adapter_sequence_file.fasta | Fasta file with full adapter sequences, degenerate bases allowed. | What sequence is to be removed. |
| -ao | Number | Number of bases of overlap between read and adapter. | This number equals the minimum number of bp to be removed. |
| -at | Number | Number of mismatches and indels per 10bp of adapter sequence allowed. | This accounts for sequencing/PCR errors changing adapter sequence. Default = 3, increasing this number increases false positive rate, and decreases false negative rate. |
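When many samples need the same flags, the -t, -r and -p values can be derived from the Read1 file name. The helpers below are a sketch under a stated assumption: files are named like Sample_<name>_L004_R1.cat.fastq, the pattern used in the example command. The function names `read2_for` and `target_for` are hypothetical, and other naming schemes would need different patterns.

```shell
# Hypothetical helpers, assuming Read1 files are named like
# Sample_<name>_L<lane>_R1.cat.fastq (the pattern in the example command).

# Derive the Read2 file name (for -p) from a Read1 file name.
read2_for() {
    echo "$1" | sed 's/_R1\./_R2./'
}

# Derive a short target name (for -t), e.g. DED81, from a Read1 file name.
target_for() {
    basename "$1" | sed -e 's/^Sample_//' -e 's/_L[0-9]*_R1\..*$//'
}

# Possible use, with the directory and adapter file from the example command:
#   for r1 in 02_Downloads/*_R1.cat.fastq; do
#       flexbar -t "$(target_for "$r1")" -r "$r1" -p "$(read2_for "$r1")" \
#           -f fastq -a 02_trimmed_Downloads/adapter_seq.fasta -ao 1
#   done
```

Checking the derived names with `echo` before starting a long run helps avoid pairing the wrong files or overwriting earlier output.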
Additional Information
For additional information, type flexbar -h
-- Main.DanielDeatherage - 16 Jan 2013
Contributors to this topic
DanielDeatherage, JeffreyBarrick
Topic revision: r5 - 2013-03-08 - 17:11:33 - Main.JeffreyBarrick