Setting up automatic BLAST database updates on Linux servers

Basic Local Alignment Search Tool (BLAST) is one of the most commonly used programs for sequence classification using similarity search.

Standalone BLAST can be set up easily on a local server. More info about how to set it up on a local Linux server can be found here:

In our lab, all our servers run the BioLinux operating system, and BLAST comes pre-installed on it. With local BLAST, it is important to update the local BLAST databases regularly so that they include new sequences submitted to NCBI. However, installing and regularly updating these databases can be a bit tricky.

Here is a small tutorial about how to set up local BLAST databases and update them regularly.

In BioLinux, the BLASTDB environment variable is usually set to /var/lib/blastdb, and it is defined in a file in /etc/profile.d/

The standard file looks like this:

#Added by package bio-linux-blast
# Ages ago we had a directory called /home/db/blastdb but new users don't want that.
# /var makes most sense, as it is more likely to be a local disk and suitable for "variable" data.

if [ -e /home/db/blastdb ] ; then
    #customised BLASTDB location
    export BLASTDB=/home/db/blastdb
elif [ -e /var/lib/blastdb ] ; then
    #default BLASTDB location
    export BLASTDB=/var/lib/blastdb
fi

The BLASTDB path can be changed to /your/blastdb/location by editing the “if” statement in this file.

The following example shows how I changed the location to a customized blastdb directory in my home directory, /home/sejalmodha/blastdb:

if [ -e /home/sejalmodha/blastdb ] ; then
    export BLASTDB=/home/sejalmodha/blastdb
else
    export BLASTDB=/home/db/blastdb
fi
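The same idea can be generalized a little. The sketch below sets BLASTDB to the first directory in a list that actually exists, so one profile file could serve several servers. Note that pick_blastdb is a hypothetical helper name used for illustration only; it is not part of BioLinux or BLAST+. The candidate paths are simply the ones mentioned above.

```shell
# pick_blastdb is a hypothetical helper, not part of BioLinux or BLAST+.
# It prints the first argument that is an existing directory.
pick_blastdb() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    # None of the candidates exist
    return 1
}

# Candidate locations from the examples above, in order of preference
if found=$(pick_blastdb /home/sejalmodha/blastdb /home/db/blastdb /var/lib/blastdb); then
    export BLASTDB="$found"
fi
```

If none of the directories exist, BLASTDB is left untouched rather than being set to an empty value.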

On a standard Linux server you can set the BLASTDB variable in /etc/bash.bashrc or in your own ~/.bashrc. Note that the shell does not allow spaces around the "=" sign:

BLASTDB=/home/sejalmodha/blastdb
export BLASTDB
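After opening a new shell, it can be worth confirming that the setting took effect. The following is a minimal sketch; check_blastdb is a hypothetical helper name chosen for this example, not a command shipped with BLAST.

```shell
# check_blastdb is a hypothetical helper, not a BLAST+ command.
# It succeeds only if BLASTDB is set and points at an existing directory.
check_blastdb() {
    if [ -z "${BLASTDB:-}" ]; then
        echo "BLASTDB is not set" >&2
        return 1
    fi
    if [ ! -d "$BLASTDB" ]; then
        echo "BLASTDB=$BLASTDB is not a directory" >&2
        return 1
    fi
    echo "BLASTDB=$BLASTDB"
}
```

Running check_blastdb in a fresh login shell prints the current path, or an error message on stderr if the variable is missing or points somewhere that does not exist.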

To update these databases regularly on the server, use NCBI’s update_blastdb script and run it from a cron job.

I have a script that downloads the nr, nt and refseq_protein databases from the NCBI website and then changes the ownership and permissions of the files so that all users can use them.

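# Work inside the BLAST database directory; stop if it is missing
cd /home/sejalmodha/blastdb/ || exit 1
# Download and unpack the latest nr, nt and refseq_protein volumes
update_blastdb --passive --decompress nr nt refseq_protein
# Hand the files to root:users and make them readable by all users
chown root *
chgrp users *
chmod 755 *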

To schedule the download monthly, put the script in a crontab file called blast_cronrun and redirect its output to a download.log file.

@monthly       /home/sejalmodha/blastdb/ > /home/sejalmodha/blastdb/download.log 2>&1

The last step is to install the crontab file using the crontab command. Keep in mind that installing a file this way replaces any crontab the user already has.

crontab blast_cronrun
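If the account already has other cron jobs, one way to keep them is to merge the existing crontab with the new file before installing it. The sketch below assumes a hypothetical helper named merge_crontab (not a standard command); it relies only on the standard "crontab -l" and "crontab -" forms.

```shell
# merge_crontab is a hypothetical helper, not a standard command.
# It prints the user's current crontab (if there is one) followed by
# the entries in the file given as the first argument.
merge_crontab() {
    # "crontab -l" fails when no crontab exists yet; ignore that case
    crontab -l 2>/dev/null || true
    cat "$1"
}

# Usage (left commented out so nothing is installed by accident):
# merge_crontab blast_cronrun | crontab -
# crontab -l    # list the installed entries to confirm
```

Running the merge more than once would add the same entry again, so it is worth checking the output of "crontab -l" afterwards.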