weeSAM version 1.5.

What is weeSAM? weeSAM is a Python script that produces coverage statistics and coverage plots from an input SAM or BAM file. The figures and statistics are written to HTML so users can easily view the coverage for their reference assembly. weeSAM is simple to run, and the steps below illustrate how. What’s new […]
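To make "coverage statistics" concrete, here is a minimal sketch of how per-reference breadth and depth could be computed from SAM text in plain Python. It is not weeSAM's own code. The function name `coverage_stats` and the fields it reports are assumptions made for illustration. It reads reference lengths from the `@SQ` header lines and walks each CIGAR string as described in the SAM specification.

```python
import re
from collections import defaultdict

# CIGAR operations that advance the position on the reference (SAM specification).
REF_CONSUMING = set("MDN=X")
# Of those, the operations that actually place read bases on the reference.
COVERING = set("M=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def coverage_stats(sam_lines):
    """Return {reference_name: stats} from an iterable of SAM text lines.

    Hypothetical helper for illustration; not part of weeSAM.
    """
    ref_lengths = {}
    depth = defaultdict(lambda: defaultdict(int))  # reference -> position -> depth

    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header line: only @SQ records carry reference names and lengths.
            if line.startswith("@SQ"):
                tags = dict(f.split(":", 1) for f in line.split("\t")[1:])
                ref_lengths[tags["SN"]] = int(tags["LN"])
            continue

        fields = line.split("\t")
        flag, ref, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 0x4 or ref == "*" or cigar == "*":
            continue  # skip unmapped reads

        cursor = pos  # 1-based leftmost mapping position
        for length, op in CIGAR_RE.findall(cigar):
            length = int(length)
            if op in COVERING:
                for p in range(cursor, cursor + length):
                    depth[ref][p] += 1
            if op in REF_CONSUMING:
                cursor += length

    stats = {}
    for ref, length in ref_lengths.items():
        per_pos = depth.get(ref, {})
        covered = len(per_pos)
        stats[ref] = {
            "length": length,
            "covered_bases": covered,
            "breadth": covered / length,
            "mean_depth": sum(per_pos.values()) / length,
            "max_depth": max(per_pos.values(), default=0),
        }
    return stats
```

Passing an open SAM file handle (or any list of SAM lines) to `coverage_stats` returns one dictionary per reference sequence. A BAM file would first need converting to SAM text, for example with `samtools view -h`.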

Extensive but not comprehensive compilation of de-novo assemblers

This figure is an update of Figure 1 in “A practical comparison of de novo genome assembly software tools for next-generation sequencing technologies”, published by Zhang et al. (2011). The figure was produced as SVG, so you should be able to click on the name of an assembler, which should take you straight to the […]

Exploring the FAST5 format

The FAST5 format from Oxford Nanopore (ONT) is in fact HDF5, a very flexible data model, library, and file format for storing and managing data. It can store an unlimited variety of datatypes. A number of tools for handling HDF5 are available from here. The most useful are: hdfview, a […]

Update Kraken databases

Kraken is a really good k-mer-based classification tool. I frequently use it for viral signal detection in metagenomic samples. A number of useful scripts, such as those for updating Kraken databases, are provided with the tool. Since the NCBI updated the FTP website structure and decided to phase out GenBank Identifiers (GIs), the default Kraken database update scripts […]

How to generate a Sample Sheet from sample/index data in BaseSpace

If you are using BaseSpace for sample entry but demultiplexing your data manually, you may have been frustrated that there is no facility to download your sample names and index tag data from BaseSpace as a sample sheet. This means you have to enter the same data twice – with the possibility of errors creeping […]
