Bioinformatics Master List

This page is a master list of bioinformatics software frequently used in the lab, along with the purpose of each tool. It also includes notes on good and bad use cases, caveats, and known issues. Most of these tools are likely available from Bioconda or Bioconductor, and we advise installing them from there rather than downloading source code and compiling it yourself.

Figuring out which software is best suited to a task can be hard. It can take some trial and error with a small dataset, but the notes here are good general guidelines.
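One way to get a small dataset for trial runs is to subsample reads from a full FASTQ file. The following is a minimal sketch, not a lab-standard tool: the function names `read_fastq` and `subsample_fastq` are hypothetical, and the code assumes plain four-line FASTQ records (no wrapped sequence lines). It uses reservoir sampling, so the whole file never has to be held in memory.

```python
import random
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record: (header, sequence, separator, quality)
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Group consecutive lines into 4-line FASTQ records.

    Assumes unwrapped FASTQ; an incomplete trailing record is dropped.
    """
    stripped = (line.rstrip("\n") for line in lines)
    # Zipping the same iterator four times yields lines in groups of four.
    for record in zip(stripped, stripped, stripped, stripped):
        yield record


def subsample_fastq(records: Iterable[FastqRecord], n: int, seed: int = 0) -> List[FastqRecord]:
    """Return up to n records chosen uniformly at random (reservoir sampling)."""
    rng = random.Random(seed)
    reservoir: List[FastqRecord] = []
    for i, record in enumerate(records):
        if i < n:
            # Fill the reservoir with the first n records.
            reservoir.append(record)
        else:
            # Replace a random slot with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = record
    return reservoir


if __name__ == "__main__":
    # In-memory demo; for a real file, pass an open handle instead,
    # e.g. gzip.open(path, "rt") for a .fastq.gz file.
    demo = []
    for k in range(100):
        demo += [f"@read{k}\n", "ACGT\n", "+\n", "IIII\n"]
    for header, seq, sep, qual in subsample_fastq(read_fastq(demo), 5, seed=42):
        print(header, seq)
```

For paired-end data, running the function with the same `seed` and `n` on both files keeps mates together, provided both files contain the same number of records in the same order.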

Next-Generation Sequencing and Genomics

  • Online sequence manipulation: a wide variety of small tasks for protein and DNA sequences.
  • fastp: trim adapters and perform quality control on short reads.
  • bowtie2: align short reads to a reference sequence.
  • breseq: call mutations from single-end or paired-end Illumina reads against a given reference sequence of a haploid microbial genome. breseq can be used with long-read data, but is not optimized for this purpose.
  • seabreeze: call structural variants between bacterial genomes. Also annotates genomes and visualizes the structural variation.
  • minimap2: align long reads against a reference sequence.
  • mummer: generate whole-genome alignments.
  • prokka: annotate features on prokaryotic genomes.
  • ISEScan: find insertion sequences in bacterial genomes.
  • Trycycler: generate de novo consensus genome assemblies from long-read sequencing. Trycycler combines the output of several individual assemblers. It also has a successor, Autocycler.
  • velvet: de novo assembly of short reads.
  • samtools: handle common NGS alignment file formats (SAM, BAM).
  • sratools: download data from the NCBI SRA.
  • featureCounts: count mapped reads per genomic feature. Used in RNA-seq analysis, often upstream of DESeq2. Requires an alignment file as input.
  • DESeq2: perform differential gene expression analysis. Often used downstream of something like featureCounts.
  • treeview: view phylogenetic trees. Accepts trees in Newick format.
  • MUSCLE: multiple sequence alignment. Usually used as a first step in constructing phylogenetic trees.
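Several of the tools above are commonly chained together. As an illustration only, the sketch below builds the command lines for one possible short-read path: fastp for trimming, bowtie2 for alignment, and samtools for sorting and indexing. The function names, file names, and output naming scheme are assumptions made for this example, not a lab-standard pipeline, and only basic input/output flags are used. By default nothing is executed; the commands are just printed.

```python
import shlex
import subprocess
from typing import List


def short_read_commands(ref_fasta: str, reads_1: str, reads_2: str, prefix: str) -> List[List[str]]:
    """Build commands for trimming, aligning, and sorting paired-end reads.

    All output file names are derived from `prefix` (a hypothetical convention).
    """
    trimmed_1 = f"{prefix}_trimmed_1.fastq.gz"
    trimmed_2 = f"{prefix}_trimmed_2.fastq.gz"
    index = f"{prefix}_index"
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    return [
        # Trim adapters and run quality control on both read files.
        ["fastp", "-i", reads_1, "-I", reads_2, "-o", trimmed_1, "-O", trimmed_2],
        # Build a bowtie2 index from the reference sequence.
        ["bowtie2-build", ref_fasta, index],
        # Align the trimmed read pairs and write SAM output.
        ["bowtie2", "-x", index, "-1", trimmed_1, "-2", trimmed_2, "-S", sam],
        # Sort the alignments into a BAM file, then index it.
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder file names; replace with real paths before running.
    run_all(short_read_commands("reference.fasta", "sample_1.fastq.gz", "sample_2.fastq.gz", "sample"))
```

Passing the commands as argument lists rather than shell strings avoids quoting problems with unusual file names. For anything beyond a quick test, a workflow manager such as Snakemake is a better fit than a hand-written runner like this one.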

Visualizations

Misc

  • Snakemake: Python-based workflow manager. Integrates nicely with conda to manage software dependencies.

Contributors to this topic: IraZibbu
Topic revision: r1 - 2025-03-20 - 21:40:32 - Main.IraZibbu
 
Copyright ©2025 Barrick Lab contributing authors.