Update Kraken databases

Kraken is a really good k-mer-based classification tool, and I frequently use it for viral signal detection in metagenomic samples. It ships with a number of useful scripts, including ones for updating Kraken databases. Since NCBI restructured its FTP site and decided to phase out GenBank Identifiers (GIs), the default Kraken database update scripts […]

NCBI Entrez Direct UNIX E-utilities

I use the NCBI Entrez Direct UNIX E-utilities regularly for sequence and data retrieval from NCBI. These utilities can be combined with any other UNIX commands. They are available to download from the NCBI website: ftp://ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/ Here are a few useful examples of the NCBI EDirect utilities. Download a sequence in FASTA format from NCBI using an accession number: esearch -db […]

Top tips to keep your home folder on a server tidy

Every bioinformatics server user and administrator knows how easy it is to fill up a home directory with huge amounts of data, especially when analysing deep sequencing data on a daily basis. Here is a list of a few useful commands and tips that can help keep your home directory tidy. […]

Convert NCBI Protein GI to Genome Accession

A few days ago I posted a question on BioStars about getting genome accession numbers for a list of protein GIs. I had a long list of protein GIs and wanted the genome accession number for each one (if one exists in the NCBI databases), but without downloading files for each protein GI […]

Setting up automatic BLAST database updates on Linux servers

The Basic Local Alignment Search Tool (BLAST) is one of the most commonly used programs for classifying sequences by similarity search. Standalone BLAST can be set up easily on a local server. More information about how to set it up on a local Linux server can be found here: http://www.ncbi.nlm.nih.gov/books/NBK52640/ In our lab, all our servers run […]

Adding a new assembler in MetAMOS

MetAMOS is a modular metagenomic analysis pipeline that can be used to automate metagenomic assembly, annotation and scaffolding. Further information and the published paper on MetAMOS can be found here, and documentation for the pipeline can be found here. One of the biggest advantages of the MetAMOS framework is its modularity. […]
