ViCTree

A Virus Taxonomy Classification Framework

Installation

ViCTree can be installed on any UNIX/Linux machine. The pipeline has been tested on multi-core Ubuntu machines.

Fork the repository to add a copy to your own GitHub account. This is required for the visualisation, because the tree data is fetched from the user's GitHub account. For more information, see "Setting up GitHub repository for visualisation" on this page.

ViCTree can be downloaded from GitHub using git clone.

A sample command to clone ViCTree is:

  git clone --recursive https://github.com/yourUserName/ViCTree.git

Note: Replace the URL with that of your own ViCTree repository to enable automated GitHub data upload.

Prerequisites

The following programs must be available on your $PATH.

Binary executables for a number of these tools, in macOS and Linux versions, are available to download from the GitHub repository together with their corresponding licenses.

Note: The program versions described above have been tested and work with the ViCTree pipeline. Later versions of these programs should also work with ViCTree. Please contact the developers about any version compatibility issues.

Download the ViCTree package or clone ViCTree from this page, and make sure all scripts are available on your $PATH.
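
As a quick sanity check before the first run, the sketch below reports any program that cannot be found on $PATH. The check_tools function is a hypothetical helper, not part of ViCTree, and the tool names passed to it (blastp, cd-hit, clustalo, raxmlHPC) are assumptions based on the tools mentioned elsewhere on this page; substitute the executables your installation actually uses.

```shell
# Hypothetical helper (not part of ViCTree): report any named program
# that cannot be found on $PATH.
# Returns 0 if every program is found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# The tool names below are assumptions; adjust them to your setup.
check_tools blastp cd-hit clustalo raxmlHPC ViCTree.sh \
    || echo "Install the missing programs or add them to PATH before running ViCTree."
```

The function only tests whether each name resolves to an executable; it does not check versions.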

Running ViCTree

  Usage: ViCTree [OPTIONS]
		
		-t Taxa ID - INT(Required) 
		-s Seedset in fasta format (Required) 
		-l Hit Length for BLAST - INT(Required) 
		-c Coverage for BLAST -INT(Required) 
		-h This helpful message
		-m Specify model for RAxML (Default is PROTGAMMAJTT)
		-i Identity for clustering sequences using cdhit 
		-n Output name of the virus family or sub-family (Required e.g. Densovirinae)  
		-p Number of threads
		-u User-defined list of accession numbers to be set as cluster representatives
 

To run the ViCTree pipeline, you must supply the options marked Required in the usage message above: -t, -s, -l, -c and -n.

The following command will launch the ViCTree analysis pipeline for the Densovirinae sub-family:

ViCTree.sh -t 40120 -s txid40120_seeds.fa -l 100 -c 50 -p 10 -i 1.0 -n Densovirinae
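
Because the seed set passed with -s must be a FASTA file, it can be worth checking it before launching a long run. The sketch below is a minimal pre-flight check under that assumption; check_seed_file is a hypothetical helper, not something ViCTree provides, and it only verifies that the file is non-empty and begins with a FASTA header line.

```shell
# Hypothetical pre-flight check (not part of ViCTree): succeed only if the
# seed file exists, is non-empty, and its first line is a FASTA header.
check_seed_file() {
    seeds=$1
    if [ ! -s "$seeds" ]; then
        echo "seed file missing or empty: $seeds" >&2
        return 1
    fi
    # A FASTA record starts with a '>' header line.
    case "$(head -n 1 "$seeds")" in
        '>'*) return 0 ;;
        *)    echo "seed file does not start with a FASTA header: $seeds" >&2
              return 1 ;;
    esac
}

# Example use (commented out so that this sketch has no side effects):
# check_seed_file txid40120_seeds.fa &&
#     ViCTree.sh -t 40120 -s txid40120_seeds.fa -l 100 -c 50 -p 10 -i 1.0 -n Densovirinae
```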

Setting up GitHub repository for visualisation

In the file index.html in the ViCTreeView sub-directory, update the user_id, repo_name and branch parameters in the following lines so that they point to your forked repository.

var user_id = "josephhughes"
var repo_name = "ViCTree"
var branch = "master"
var dir = "ViCTreeView/data"

Note: If you forked the repository without renaming it, you only need to change user_id to your GitHub username in this file.

Output

When the pipeline is run for the first time, a folder named after the taxID is created, and all output files generated by the pipeline are saved in it.

The main output files generated by the pipeline include:

Filename                     Contents
taxID.fa                     All protein sequences downloaded from NCBI for the specified taxID
taxID_final_set.fa           The final set of sequences used for Multiple Sequence Alignment
taxID_tree.nhx               The final tree generated and rerooted by RAxML
taxID_clustalo_dist_mat.csv  The pairwise distance matrix file
taxID_metadata               The metadata file for each sequence from taxID_final_set.fa that exists in NCBI
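
After a run, it can be useful to confirm that all of the main output files listed above were produced. The sketch below checks for each of them in a given folder. check_outputs is a hypothetical helper, and both the folder and the filename prefix are passed in as arguments, because how "taxID" is expanded in the real filenames is not spelled out here.

```shell
# Hypothetical helper (not part of ViCTree): check that each main output
# file exists and is non-empty.
#   $1 = output folder, $2 = filename prefix (whatever stands in for "taxID")
# Returns 0 if all files are present, 1 otherwise.
check_outputs() {
    dir=$1
    prefix=$2
    missing=0
    for suffix in .fa _final_set.fa _tree.nhx _clustalo_dist_mat.csv _metadata; do
        f="$dir/$prefix$suffix"
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use, assuming the folder and prefix are both the numeric taxID
# (commented out so that this sketch has no side effects):
# check_outputs 40120 40120
```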

Visualisation

ViCTreeView is the visualisation plugin for ViCTree: a customised phylogenetic tree viewer developed using D3.js. It reads input files from the data directory of this repository and displays the phylogenetic tree.

ViCTreeView has a scroll bar at the top that corresponds to the pairwise distances between the nodes of the tree. Users can select any value between 0 and 100, and clusters within the tree are highlighted based on that value.

Contact

The ViCTree framework is developed by:

Sejal Modha (@sejmodha), Anil Thanki (@anilthanki) and Joseph Hughes (@josephhughes).