(r2) Barrick Lab :: BioPythonOnboard

<noautolink>
<!--
   * Set PAGETITLE = Barrick Lab :: python for biology
-->
---+ <b>Getting started with Python for biology</b>

Python has become a dominant programming language in biology, where it is used to ask and answer many different kinds of questions. This is partly because it is relatively simple yet flexible, and partly because it has extensive library support for biological tools.

---++ Getting started

There are many different ways to run Python, but one of the easiest ways to get started is [[https://colab.research.google.com/][Google Colaboratory]]. This is essentially an online [[https://jupyter.org/][Jupyter Notebook]] that has many popular packages pre-installed and removes much of the hassle that can come with a local installation of Python.

[[https://jupyter.org/][Jupyter Notebooks]] are a great tool for keeping track of what the user has tried, and for seamlessly integrating spaces where the user can describe their thought process or record other notes. A notebook is conceptually similar to the lab notebook that one might use in a wet lab.

If a local installation of Python is preferred, [[https://www.anaconda.com/][Anaconda]] is a great choice. Anaconda is a data science platform that installs many useful tools, including Jupyter Notebook, and allows for easier package management. Packages contain libraries of useful code that other people have written, making Python easier to use. [[https://biopython.org/][Biopython]] is one such library, and contains indispensable tools such as FASTA/GBK parsers and writers, as well as tools for manipulating DNA and protein sequences.
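As a rough illustration of what Biopython offers, the sketch below manipulates a short DNA sequence and parses FASTA-formatted text. It assumes Biopython is already installed (see Installing Packages below). The sequences and record names are made up for illustration and do not come from any real genome.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq

# A made-up 24-base coding sequence (illustrative only).
dna = Seq("ATGGCCATTGTAATGGGCCGCTGA")

print(dna.reverse_complement())     # TCAGCGGCCCATTACAATGGCCAT
print(dna.translate(to_stop=True))  # MAIVMGR

# Parse FASTA-formatted text. A path to a real FASTA file could be
# passed to SeqIO.parse in place of the StringIO handle.
fasta_text = (
    ">seq1 hypothetical example\n"
    "ATGGCCATTGTAATGGGCCGCTGA\n"
    ">seq2 hypothetical example\n"
    "GGGTTTAAACCC\n"
)
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))
for record in records:
    print(record.id, len(record.seq))
```

Reading a GenBank file should work the same way, with ="genbank"= given as the format name instead of ="fasta"=.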

---++ Running Jupyter

---++ Installing Packages
Once you feel comfortable with Jupyter Notebook, you will need to install some missing packages. This can be done with `pip`, the Python package installer. In an empty cell, run this code: `!pip3 install biopython`. On a local installation this only has to be run once, and the package will persist across sessions. In Google Colaboratory, installed packages are lost when the runtime is reset, so the cell may need to be rerun in a new session.
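One way to confirm that the install worked is to check whether Python can find the package before trying to use it. The sketch below uses only the standard library. The helper name `is_installed` is hypothetical. Note that the import name (`Bio`) differs from the name given to `pip` (`biopython`).

```python
import importlib.util


def is_installed(module_name):
    """Return True if the import system can find a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# "Bio" is the name Biopython is imported under.
if is_installed("Bio"):
    print("Biopython is available")
else:
    print("Biopython is missing; run the pip command above")
```

If the check reports that the package is missing right after a successful install, restarting the notebook kernel and running the cell again is a reasonable first step.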


*Additional test data and an example of __breseq__ output:*
| [[%ATTACHURL%/REL8593A.fastq.gz][%ICON{download}% FASTQ reads for _E. coli_ strain REL8593A (200M)]] |
| [[%ATTACHURL%/REL606.gbk.gz][%ICON{download}% Reference genome (REL606)]] |
| [[%ATTACHURL%/REL8593A_output][%ICON{external}% Example of __breseq__ output]] |
Contributors to this topic
MattMcGuffie, JeffreyBarrick
Topic revision: r2 - 2020-04-09 - 19:38:07 - Main.MattMcGuffie