DisCVR: Rapid viral diagnosis from NGS data
DisCVR is a computer program that allows diagnosticians to detect known human viruses in clinical samples from Next Generation Sequencing (NGS) data. It works by creating a database of short nucleotide sequences, called k-mers, which are extracted from viral genomes. K-mers of 31 bases in length are generated by sliding a window along a sequence, 1 nucleotide base at a time. Only unique k-mers from a set of viruses are included in the database and assigned taxonomic labels. To investigate a patient sample sequenced using NGS, k-mers from the sample are queried against the database to find exact matches. The output lists all viruses found in the sample. In addition, reference-based assembly can be used to assess the significance of the matches.
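The k-mer workflow described above can be illustrated with a minimal Python sketch. This is not DisCVR's own code: the function names `kmers`, `build_database` and `query` are hypothetical, and "unique" is read here as "occurring under exactly one taxonomic label", which is an assumption about the text. The sketch also ignores practical details such as reverse complements and ambiguous bases.

```python
from collections import defaultdict

K = 31  # k-mer length stated in the text


def kmers(sequence, k=K):
    """Yield every k-mer by sliding a window along the sequence, one base at a time."""
    sequence = sequence.upper()
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def build_database(genomes, k=K):
    """Map each k-mer to a label, keeping only k-mers seen under a single label.

    `genomes` is a dict of {label: sequence}. Treating "unique" as
    "specific to one label" is an assumption made for this sketch.
    """
    labels = defaultdict(set)
    for label, sequence in genomes.items():
        for kmer in kmers(sequence, k):
            labels[kmer].add(label)
    return {kmer: next(iter(found))
            for kmer, found in labels.items() if len(found) == 1}


def query(database, reads, k=K):
    """Count exact k-mer matches between sample reads and the database, per label."""
    hits = defaultdict(int)
    for read in reads:
        for kmer in kmers(read, k):
            label = database.get(kmer)
            if label is not None:
                hits[label] += 1
    return dict(hits)


# Toy example with k=4 so the sequences stay readable (real use would keep k=31).
toy_genomes = {"virus_A": "ACGTACGGA", "virus_B": "TTGACGTAC"}
toy_db = build_database(toy_genomes, k=4)
print(query(toy_db, ["TACGGA"], k=4))  # prints {'virus_A': 3}
```

In the toy example, the k-mers ACGT, CGTA and GTAC occur in both genomes and are therefore dropped from the database, so only the three k-mers specific to virus_A produce hits for the read.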
DisCVR is a fast and accurate tool designed to analyse NGS data and validate the results interactively on computers with limited resources. At present, DisCVR is a human viral diagnostic tool, but it could be extended to include non-viral human pathogens as well as pathogens of other hosts.