About this website

This website aims to provide a stable repository for the virus fossil record. We provide data, along with bioinformatic tools for working with it. We also post short articles that describe our research and discuss recent advances in the field of viral evolution.

The website is allied to a bioinformatic 'pipeline' for screening genome sequences in silico. Our goal is to provide a comprehensive fossil record for the reference endogenous viral elements (EVEs) in our libraries by creating multiple sequence alignments (MSAs) that incorporate all homologous loci identified by screening.

The EVE reference sequence libraries

We provide paleovirus reference sequences for retroviruses here. Reference sequences for non-retroviral paleoviruses will be provided in the near future. Our reference sequences are available both in our native format (GLUE) and as FASTA sequences. The current version of the endogenous retrovirus (ERV) reference sequence library (2013-05-25) is a development version that is being consolidated during April 2013. It includes previously published consensus sequences, as well as 'reference' loci/isolates selected to represent particular retroviral lineages. Our ultimate goal is to provide estimated ancestral sequences for ERV lineages that have sufficiently high copy number.

Representing the fossil history of EVE lineages

We organize the sequence data that comprise the virus fossil record using a framework developed in our lab, called GLUE (Genes Linked by their Underlying Evolution). In the GLUE framework, multiple sequence alignments (MSAs) are generated against standard 'reference sequences'. Insertions relative to the reference sequence are spliced out of the alignment, and their positions and sequences are stored in tables. Because every aligned sequence then shares the reference's coordinates, this approach expedites the progressive assembly of paleovirus MSAs and circumvents many of the problems encountered when MSAs contain pseudogenes and large indels.
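
The idea of splicing insertions out of a reference-constrained alignment can be illustrated with a minimal Python sketch. This is not the GLUE implementation: the function name `splice_insertions`, the use of `-` as the gap character, and the (position, sequence) layout of the insertion table are all assumptions made for illustration. It takes one pairwise alignment of a reference and a query, returns the query in reference coordinates, and records each insertion separately.

```python
from typing import List, Tuple


def splice_insertions(
    ref_aln: str, query_aln: str, gap: str = "-"
) -> Tuple[str, List[Tuple[int, str]]]:
    """Hypothetical helper, not part of GLUE.

    Returns the query constrained to reference coordinates, plus a table
    of insertions. Each table row is (ref_pos, inserted_sequence), where
    ref_pos is the number of reference residues preceding the insertion.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")

    constrained: List[str] = []            # query residues at reference columns
    insertions: List[Tuple[int, str]] = []  # the insertion table
    current: List[str] = []                # insertion currently being collected
    ref_pos = 0                            # reference residues consumed so far

    for r, q in zip(ref_aln, query_aln):
        if r == gap:
            # Column absent from the reference: a query residue here is
            # part of an insertion and is kept out of the constrained row.
            if q != gap:
                current.append(q)
        else:
            # Reference column: close any open insertion, then keep the
            # query character (a gap here marks a deletion in the query).
            if current:
                insertions.append((ref_pos, "".join(current)))
                current = []
            constrained.append(q)
            ref_pos += 1

    if current:  # insertion at the very end of the alignment
        insertions.append((ref_pos, "".join(current)))

    return "".join(constrained), insertions


# Toy alignment (made-up sequences, for illustration only).
row, table = splice_insertions("ATG--CGT-A", "ATGTTCG-CA")
print(row)    # ATGCG-A
print(table)  # [(3, 'TT'), (6, 'C')]
```

In this sketch the constrained row always has the same length as the ungapped reference, so rows from many loci can be stacked into one MSA without adding columns, while the insertion table preserves the sequence that was removed.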