PYTHON FOR BIOINFORMATICS

Second Edition

A solid introduction to programming with Python, accessible to readers without previous programming experience. Written with biologists, bioinformatics specialists and bench scientists in mind.

Book Contents

The book will have four main sections:


  • Python from scratch: Basic programming concepts, Installing Python, Interactive mode, Editors, Data types (Strings, Unicode, Lists, Tuples, Dictionaries, Sets), Flow control (If-Else, For, While), Functions, Generators, Modules, Using files (including CSV and JSON), File operations, Error handling and Object-Oriented Programming.
  • Biopython: Most important Biopython modules explained with sample usage.
  • A section with advanced topics such as: Web development (CGI and Bottle), XML, Databases (MySQL, SQLite and MongoDB), REGEX and Graphics (Bokeh).
  • Python recipes with commented source code.

Note: The book is under development, so the final content may differ.

Why a second edition?

A lot has changed since the first edition was written in 2009. Enterprise attitudes toward Open Source Software in general, and Python in particular, have changed dramatically, and so has enterprise support for them. Microsoft now supports Python as a first-class citizen in Visual Studio and in Azure. The current Python version is 3.6. Collaborative software development with Git and GitHub is the norm. Web development is another area that has changed significantly over the last seven years: frameworks have replaced CGI/WSGI and middleware-based applications. Apart from this evolution in software, the author has gained development experience in a genome sequencing project at an international consortium and as a Senior Software Developer at an NYSE-listed company.

The Author

Sebastián Bassi is a Biotechnologist with experience in both software development and bioinformatics research. He worked at a leading biotech company doing molecular marker database curation, and at a national research institute providing bioinformatics support for the international effort to sequence the Tomato Genome. Both positions involved Python development and intensive data manipulation. He built a web application to query a microRNA database, which was published in BMC Plant Biology. He also worked on the first Linux distribution for bioinformatics (DNALinux). He currently works as a consultant for Globant, assigned to PLOS. He is an AWS Certified Solutions Architect and is frequently invited to Python conferences.

Sebastian Bassi with DNA sequencer

Get the Code

All code examples from the book are available from GitHub or as a Jupyter Notebook that can be run online.
The book is under development, so only partial source code is available at this time.
(Source code for the First Edition can still be found here.)

Source at GitHub

Go to the book's GitHub page and click on the green "Clone or download" button. The project includes all the .py files, ready to be run locally, along with the complementary files used in the book.

Jupyter Notebook

The code can be run online at Microsoft Azure Notebook (a free account is required). The Jupyter notebooks (in .ipynb format) can also be downloaded from the Notebooks directory and run locally if you have Jupyter installed.
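
Before launching Jupyter locally, it can help to confirm that the notebooks were downloaded and that Jupyter is importable. The following is only a minimal sketch, not part of the book's code: the helper names find_notebooks and jupyter_available are hypothetical, and it assumes the downloaded directory is called Notebooks and sits in the current working directory.

```python
import importlib.util
from pathlib import Path


def find_notebooks(directory):
    """Return the .ipynb files directly inside `directory`, sorted by name."""
    root = Path(directory)
    if not root.is_dir():
        return []
    return sorted(root.glob("*.ipynb"))


def jupyter_available():
    """Report whether a Jupyter front end is importable in this environment."""
    # "notebook" and "jupyterlab" are the package names of the two usual
    # Jupyter front ends; either one is enough to open .ipynb files.
    return any(
        importlib.util.find_spec(name) is not None
        for name in ("notebook", "jupyterlab")
    )


if __name__ == "__main__":
    # Assumption: the notebooks were downloaded into ./Notebooks
    for path in find_notebooks("Notebooks"):
        print(path.name)
    if not jupyter_available():
        print("Jupyter does not appear to be installed in this environment.")
```

If the check passes, the notebooks can then be opened from the Jupyter interface in the usual way.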