Setting up automatic BLAST database updates on Linux servers

Basic Local Alignment Search Tool (BLAST) is one of the most commonly used programs for sequence classification using similarity search.

Standalone BLAST can be set up easily on a local server. More information on setting it up on a local Linux server can be found here:

http://www.ncbi.nlm.nih.gov/books/NBK52640/

In our lab, all servers run the BioLinux operating system, which comes with BLAST pre-installed. With local BLAST, it is important to update the local databases regularly so that they include new sequences submitted to NCBI. However, installing and regularly updating these databases can be a bit tricky.

Here is a small tutorial on how to set up local BLAST databases and update them regularly.

In BioLinux, the BLASTDB variable is usually set to /var/lib/blastdb in the file /etc/profile.d/blast_environment.sh

The standard blast_environment.sh file looks like this.

The BLASTDB path can be changed to /your/blastdb/location by editing the “if” statement in this file.

For example, I changed the location to a custom blastdb directory in my home directory, /home/sejalmodha/blastdb
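As a rough illustration of the kind of edit involved, a profile script could choose the database location along the lines below. This is only a sketch and not the actual BioLinux file: the variable name CUSTOM_BLASTDB and the $HOME/blastdb location are assumptions made for the example.

```shell
# Illustrative sketch only, not the real blast_environment.sh.
# CUSTOM_BLASTDB is a hypothetical name for the custom location.
CUSTOM_BLASTDB="${CUSTOM_BLASTDB:-$HOME/blastdb}"

if [ -d "$CUSTOM_BLASTDB" ]; then
    # Use the custom directory when it exists
    BLASTDB="$CUSTOM_BLASTDB"
else
    # Otherwise fall back to the BioLinux default location
    BLASTDB="/var/lib/blastdb"
fi
export BLASTDB
```

With this layout, creating the custom directory is enough to switch new login shells over to it, and removing it falls back to the system-wide databases.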

On a standard Linux server, you can set the BLASTDB variable in /etc/bash.bashrc (for all users) or in your own ~/.bashrc
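A minimal sketch of the line to add to the bashrc file is shown here. The directory $HOME/blastdb is an assumed example location, so substitute the path where your databases actually live.

```shell
# Line to add to ~/.bashrc (or /etc/bash.bashrc for all users).
# The directory is an assumed example location; use your own.
export BLASTDB="$HOME/blastdb"
```

After opening a new shell or running source ~/.bashrc, echo "$BLASTDB" should print the chosen path.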

To update these databases regularly on the server, use NCBI’s update_blastdb.pl script, which ships with BLAST+, and run it from a cron job.

I have an update_db.sh script that downloads the nr, nt and refseq_protein databases from NCBI and then changes the permissions of the downloaded files so that all users can read them.
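A script of this kind might look roughly like the sketch below, which writes an update_db.sh file into the current directory. It is not the author's original script: the fallback directory $HOME/blastdb is an assumption, and the script relies on update_blastdb.pl being on the PATH.

```shell
# Create a sketch of update_db.sh in the current directory
cat > update_db.sh <<'EOF'
#!/bin/bash
# Sketch of an update script; the fallback directory is an assumed location.
DB_DIR="${BLASTDB:-$HOME/blastdb}"

mkdir -p "$DB_DIR"
cd "$DB_DIR" || exit 1

# Download and unpack the three databases into the current directory
update_blastdb.pl --decompress nr nt refseq_protein
echo "update_blastdb.pl exited with status $?"

# Make the database files readable by all users
chmod a+r "$DB_DIR"/*
EOF

# Make the script executable
chmod +x update_db.sh
```

The exit status is only logged rather than used to stop the script, so the permissions are fixed even when just some of the databases were refreshed.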

To download these databases monthly, put the script in a cron file called blast_cronrun and redirect its output to a download.log file.
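As a sketch, the cron file could be generated as follows. The schedule and the locations of update_db.sh and download.log under $HOME are assumptions chosen for the example, not the author's actual settings.

```shell
# Write a cron file with one monthly entry; the paths are assumed locations.
SCRIPT="$HOME/update_db.sh"
LOG="$HOME/download.log"

cat > blast_cronrun <<EOF
# min hour day-of-month month day-of-week command
0 2 1 * * $SCRIPT >> $LOG 2>&1
EOF

# Show the resulting file
cat blast_cronrun
```

The five fields 0 2 1 * * mean 02:00 on the first day of every month, and the redirection appends both standard output and errors to the log.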

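The last step is to install the cron job with the crontab command, for example by running crontab blast_cronrun. Note that loading a file this way replaces any existing crontab entries for that user.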

 

Bioinformatician at CVR.
http://bioinformatics.cvr.ac.uk/sejal.php
  • Alaa Alarfaj

    Do you have some idea how to do it in windows?

  • Sejal Modha

    Hi, you can download the BLAST executables for Windows from the NCBI website and set the BLASTDB environment variable to define the location of the local BLAST databases. Although I have not tried running update_blastdb.pl on a Windows system, I think the script should work as long as you have all the required Perl modules installed.
    I am not sure about a cron-equivalent utility for Windows though.