The WhatsHap source code is on GitHub. WhatsHap is developed in Python 3, Cython and C++.
For development, make sure that you install Cython and tox. We also recommend
using a virtualenv. This sequence of commands should work (use
https://github.com/whatshap/whatshap.git as the URL if you cannot clone over SSH):

    git clone git@github.com:whatshap/whatshap.git
    cd whatshap
    python3 -m venv venv
    source venv/bin/activate
    pip install -e .[dev]

The last command also installs all the development dependencies, such as Cython.
Omit [dev] (write only pip install -e .) to leave them out.
Next, you can run WhatsHap by invoking the whatshap command (for example, whatshap --help).
Development installation when using Conda
If you are using Bioconda, it is convenient to develop WhatsHap in a separate environment:

    conda create -n whatshap-dev python=3.6 pysam PyVCF pyfaidx xopen Cython pytest sphinx-issues
    source activate whatshap-dev
    git clone git@github.com:whatshap/whatshap.git
    cd whatshap
    pip install -e .

The last command installs WhatsHap into your Conda environment named
whatshap-dev, so running whatshap in that environment runs the version
you just cloned.
While in the virtual environment, you can run the tests for the current Python version with pytest.
Whenever you change any Cython code (.pyx files), you need to re-run the
pip install -e . step in order to compile it.
Optionally, to run the tests for all supported Python versions, you can use
tox. It creates a separate virtual environment for each Python
version, installs WhatsHap into it and then runs the tests. It also tests documentation generation
with sphinx. Run it like this:

    tox

If tox is installed on the system, you do not need to be inside a virtual environment for this.
However, all tested Python versions must be installed on the system. See the instructions
below for how to do this on Ubuntu.
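Before running tox, it can help to check which interpreters are actually available. The following is a minimal sketch, assuming that each version is installed as an executable named pythonX.Y on the PATH (as the Ubuntu packages do). The helper name missing_interpreters and the version list are illustrative only and are not part of WhatsHap:

```python
import shutil


def missing_interpreters(versions):
    """Return the versions for which no pythonX.Y executable is on PATH."""
    return [v for v in versions if shutil.which(f"python{v}") is None]


# The version list is an example only; use the versions tox is configured to test.
print(missing_interpreters(["3.4", "3.5", "3.6"]))
```

An empty list means every listed interpreter was found; any version that is printed still needs to be installed.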
Installing other Python versions in Ubuntu
Ubuntu comes with one default Python 3 version, and in order to test WhatsHap with older or newer Python versions, follow the instructions for enabling the “deadsnakes” repository. After you have done so, ensure you have the following packages:
sudo apt install build-essential python-software-properties
Then get and install the desired Python versions. Make sure you install the
-dev variant of each package. For example, for Python 3.4:

    sudo apt update
    sudo apt install python3.4-dev

Here is one way to get a backtrace from gdb (assuming the bug occurs while running the tests):

    $ gdb python3
    (gdb) run -m pytest

After you get a SIGSEGV, let gdb print a backtrace:

    (gdb) bt

Another way is to set the PYTHONFAULTHANDLER environment variable, which makes Python print a traceback when it crashes:

    PYTHONFAULTHANDLER=1 pytest -vxs tests/test_run_whatshap.py

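The same handler can also be switched on from within Python through the standard library's faulthandler module. This is a minimal sketch; the function name enable_crash_tracebacks is made up for illustration, and where you would call it (for example at the top of a test module) is an assumption, not something WhatsHap prescribes:

```python
import faulthandler
import sys


def enable_crash_tracebacks():
    """Print Python tracebacks of all threads when the process crashes.

    faulthandler needs a stream with a real file descriptor, so this uses
    sys.__stderr__, the original stderr, in case sys.stderr was replaced.
    """
    faulthandler.enable(file=sys.__stderr__, all_threads=True)
    return faulthandler.is_enabled()


print(enable_crash_tracebacks())
```

Once enabled, a crash such as a SIGSEGV in the C++ code produces a Python-level traceback on stderr. It shows where in the Python code the crash was triggered, but not the C++ frames, so gdb remains useful for those.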
Wrapping C++ classes
The WhatsHap phasing algorithm is written in C++, as are many of the core data structures such as the “Read” class. To make the C++ classes usable from Python, we use Cython to wrap the classes. All these definitions are spread across multiple files. To add new attributes or methods to an existing class or to add a new class, changes need to be made in different places.
Let us look at the “Read” class. The following places in the code may need to be changed if the Read class is changed or extended:
src/read.cpp: Implementation of the class (C++).
src/read.h: Header with the class declaration (also normal C++).
whatshap/cpp.pxd: Cython declarations of the class. This repeats, using Cython syntax this time, a subset of the information from the src/read.h file. This duplication is required because Cython cannot read .h files (it would need a full C++ parser for that). Note that the cpp.pxd file contains definitions for all the .h headers. (It would be cleaner to have them in separate .pxd files, but this leads to problems when linking the compiled files.)
whatshap/core.pxd: This contains declarations of all Cython classes wrapping C++ classes. Note that the class Read in this file has the same name as the C++ class, but that it is not the same as the C++ one! The distinction is made by prefixing the C++ class with cpp., which is the name of the module in which it is declared (that is, the C++ class Read is declared in cpp.pxd). The wrapping (Cython) class Read stores the C++ class in an attribute named thisptr. If you add a new class, it needs to be added to this file. If you only modify an existing one, you probably do not need to change this file.
whatshap/core.pyx: The Cython implementation of the wrapper classes. Again, the name Read by itself is the Python wrapper class and cpp.Read is the name for the C++ class.
Before adding yet more C++ code, which then requires extra code for wrapping it,
consider writing an implementation in Cython instead. See
for example, which started out as a Python module and was then transferred to
Cython to make it faster. Here, the Cython code is not merely a wrapper, but
contains the implementation itself.
Documentation is hosted on Read the Docs.
It is built automatically whenever a commit is made. The documentation in the
master branch should be visible at
and documentation for the most recent released version should be visible at
To generate documentation locally, ensure that you have installed sphinx and the
add-ons necessary to build documentation (running pip install -e .[dev] will
take care of this). Then go into the doc/ directory and run make. You can
then open doc/_build/html/index.html in your browser. The theme that is
used is a bit different from the one used on Read the Docs.
Making a release
CHANGES.rst: Set the correct version number and ensure that all nontrivial, user-visible changes are listed.
Ensure you have no uncommitted changes in the working copy.
Tag the current commit with the version number (there must be a v prefix):

    git tag -a -m "Version 0.1" v0.1

To release a development version, use a dev version number such as v0.17.dev1. Users will only get these when they use pip install --pre.
Push the tag:
git push --tags
Wait for the GitHub Action to finish. It will deploy the sdist and wheels to PyPI if everything worked correctly.
Update the Bioconda recipe. It is easiest to wait for the Bioconda bot to open a PR. Ensure that the list of dependencies (the requirements: section in the recipe) is in sync with the setup.py file. If changes are necessary to the bot-generated PR, just add your own commits to that PR.
If something went wrong, fix the problem and follow the above instructions again, but with an incremented revision in the version number. That is, go from version x.y to x.y.1. PyPI will not allow you to change a version that has already been uploaded.
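The tag naming rules above (a v prefix, and a .devN suffix for development versions) can be checked before pushing. The sketch below is only an illustration: the pattern is a simplification that covers the forms mentioned in this section rather than everything PEP 440 allows, and the names TAG_PATTERN and classify_tag are hypothetical, not part of WhatsHap's tooling:

```python
import re

# Simplified pattern: "v", then X.Y or X.Y.Z, then an optional ".devN" suffix.
# This is an illustration and does not cover all version forms PEP 440 allows.
TAG_PATTERN = re.compile(r"^v\d+\.\d+(?:\.\d+)?(?P<dev>\.dev\d+)?$")


def classify_tag(tag):
    """Return "release", "development" or "invalid" for a Git tag name."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return "invalid"
    return "development" if match.group("dev") else "release"


for tag in ["v0.1", "v0.17.dev1", "0.17"]:
    print(tag, classify_tag(tag))
```

A tag classified as "development" corresponds to a version that pip only installs when --pre is given.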
Adding a new subcommand
Use one of the modules in
whatshap/cli/ as a template. All modules in
that directory are automatically used as subcommands.
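How WhatsHap performs this discovery is not described here. As a rough sketch of one common way to turn the modules of a directory into a list of subcommand names, the standard library's pkgutil.iter_modules can be used. The helper name discover_subcommands and the rule of skipping names that start with an underscore are assumptions made for this example:

```python
import pkgutil


def discover_subcommands(cli_dir):
    """Return the names of the Python modules found directly in cli_dir.

    Hypothetical helper: each module name is treated as a subcommand name.
    Subpackages and modules whose name starts with an underscore are skipped.
    """
    names = [
        info.name
        for info in pkgutil.iter_modules([str(cli_dir)])
        if not info.ispkg and not info.name.startswith("_")
    ]
    return sorted(names)
```

Called as discover_subcommands("whatshap/cli") from the repository root, this would list the module names found in that directory. A new file placed there would show up without any registration step, which is the behavior the text above describes.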
Download count statistics
Some statistics for the PyPI package are available at pypistats.org.
Here is a query for Google BigQuery that shows download counts (from PyPI) since a given date, broken down by version:

    SELECT file.project, file.version, COUNT(*) AS total_downloads
    FROM TABLE_DATE_RANGE(
      [the-psf:pypi.downloads],
      TIMESTAMP("20170101"),
      CURRENT_TIMESTAMP()
    )
    WHERE file.project = 'whatshap'
    GROUP BY file.project, file.version

Statistics for the Conda package are available on the WhatsHap package detail page.