Java CIGAR Parser for SAM format

Sequence Alignment/Map (SAM) format is a well-known bioinformatics format designed to store  information about reads mapping against large reference sequence.  The SAM file is split into two sections: a header section and an alignment section. The header section starts with ‘@’ and it contains information such as the name and length of the reference sequence. […]

Read More

Trie Data Structure

In Computer Science, a trie is a data structure that is also known as a digital search tree or a prefix tree. It can be used for fast retrieval on large data sets such as looking up words in a dictionary. The term trie was invented from the phrase ‘Information Retrieval’ by Fredkin(1960). As a […]

Read More