weeSAM version 1.5.

What is weeSAM? weeSAM is a Python script that produces coverage statistics and coverage plots from an input SAM or BAM file. Figures and stats are written to HTML so users can easily view the coverage for their reference assembly. weeSAM is simple to run, and the steps below give an illustration. What’s new […]

Exploring the FAST5 format

The FAST5 format from Oxford Nanopore (ONT) is in fact HDF5, a very flexible data model, library, and file format for storing and managing data. It can store an unlimited variety of datatypes. A number of tools for handling HDF5 have been developed and are available from here. The most useful are: hdfview, a […]

NGS Data Formats and Analyses

Here are my slides from a session on NGS data formats and analyses that I gave as part of the EPIZONE Workshop on Next Generation Sequencing applications and Bioinformatics in Brussels in April 2016. It covers file formats such as FASTA, FASTQ, SAM, BAM, and VCF, and also goes over IUPAC nucleotide ambiguity codes, read names, quality […]

Java CIGAR Parser for SAM format

Sequence Alignment/Map (SAM) format is a well-known bioinformatics format designed to store information about reads mapped against a large reference sequence. The SAM file is split into two sections: a header section and an alignment section. Header lines start with ‘@’ and contain information such as the name and length of the reference sequence. […]
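As a rough illustration of the two-section layout described above, here is a minimal Python sketch (not the Java parser from the post itself). It assumes tab-separated SAM text, reads reference names and lengths from the `SN` and `LN` tags of `@SQ` header lines, and splits a CIGAR string into (length, operation) pairs. The function names and the toy records are hypothetical, made up purely for this example.

```python
import re

# One CIGAR element: a run length followed by an operation code
# (the nine operation letters defined in the SAM specification).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def split_sam(lines):
    """Split SAM text lines into (header_lines, alignment_lines)."""
    header, alignments = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Header lines start with '@'; everything else is an alignment record.
        (header if line.startswith("@") else alignments).append(line)
    return header, alignments


def reference_lengths(header):
    """Map reference name (SN) to length (LN) using the @SQ header lines."""
    refs = {}
    for line in header:
        fields = line.split("\t")
        if fields[0] != "@SQ":
            continue
        tags = dict(field.split(":", 1) for field in fields[1:])
        refs[tags["SN"]] = int(tags["LN"])
    return refs


def parse_cigar(cigar):
    """Turn a CIGAR string such as '5M1I4M' into [(5, 'M'), (1, 'I'), (4, 'M')]."""
    if cigar == "*":  # '*' means no CIGAR is available for this record
        return []
    return [(int(length), op) for length, op in CIGAR_RE.findall(cigar)]


# Toy input, invented for illustration only.
toy_sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:ref1\tLN:1000",
    "read1\t0\tref1\t1\t60\t5M1I4M\t*\t0\t0\tACGTACGTAC\t*",
]

header, alignments = split_sam(toy_sam)
refs = reference_lengths(header)
cigar_ops = parse_cigar(alignments[0].split("\t")[5])  # CIGAR is the 6th column
```

The sketch does no validation, so a malformed CIGAR string would simply yield fewer pairs than expected; a real parser would want to check that the whole string was consumed.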

The dark arts of Ion Torrent Sequencing

All technologies have their advantages and disadvantages, and next generation sequencing (NGS) is no different. For the time being, the dominant forces in NGS are Illumina, which is based upon detecting a flash of light as fluorescent nucleotides are incorporated, and Ion Torrent, where nucleotide incorporation is detected in the form of a change […]

Some key factors for number of significant DE genes

Across different RNA-Seq experiments, we have found that some yield more significant DE (differentially expressed) genes than others. Besides the experimental design, which is the most important factor, a number of other factors may influence the number of DE genes. Here, we list the different experiments that we have carried out the investigation […]
