Viral taxonomy derived from evolutionary genome relationships.

08:00 EDT 14th August 2019 | BioPortfolio

Summary of "Viral taxonomy derived from evolutionary genome relationships."

We describe a new genome alignment-based model for understanding the diversity of viruses based on evolutionary genetic relationships. This approach uses information theory and a physical model to determine the information shared by the genes in two genomes. Pairwise comparisons of genes from the viruses are created from alignments using NCBI BLAST, and their match scores are combined to produce a metric between genomes, which is in turn used to determine a global classification using the 5,817 viruses on RefSeq. In cases where there is no measurable alignment between any genes, the method falls back to a coarser measure of genome relationship: the mutual information of 4-mer frequency. This results in a principled model which depends only on the genome sequence, which captures many interesting relationships between viral families, and which creates clusters which correlate well with both the Baltimore and ICTV classifications. The incremental computational cost of classifying a novel virus is low and therefore newly discovered viruses can be quickly identified and classified. The model goes beyond alignment-free classifications by producing a full phylogeny similar to those constructed by virologists using qualitative features, while relying only on objective genes. These results bolster the case for mathematical models in microbiology which can characterize organisms using only their genetic material and provide an independent check for phylogenies constructed by humans, considerably faster and more cheaply than less modern approaches.
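The summary does not spell out how the "mutual information of 4-mer frequency" fallback is computed. As a rough illustration only, the sketch below takes one possible reading: pool the 4-mer counts of two genomes and measure the mutual information between the genome label and the 4-mer identity, which is equivalent to a weighted Jensen-Shannon divergence between the two 4-mer distributions. Under this reading, 0 bits means the profiles are identical and 1 bit means two equally sized genomes share no 4-mers. The function names `kmer_counts` and `kmer_mutual_information` are hypothetical, and this is not the authors' implementation.

```python
from collections import Counter
from math import log2


def kmer_counts(seq, k=4):
    """Count overlapping k-mers, skipping any window with a non-ACGT symbol."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_mutual_information(seq_a, seq_b, k=4):
    """Mutual information (bits) between genome label and k-mer identity.

    Assumption: this is one plausible interpretation of the fallback
    measure named in the summary, not the paper's definition.
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    if n_a == 0 or n_b == 0:
        raise ValueError("each sequence must contain at least one valid k-mer")
    total = n_a + n_b

    mi = 0.0
    for kmer in set(counts_a) | set(counts_b):
        # Marginal probability of this k-mer in the pooled sample.
        p_kmer = (counts_a[kmer] + counts_b[kmer]) / total
        for count, n in ((counts_a[kmer], n_a), (counts_b[kmer], n_b)):
            if count:
                p_joint = count / total   # P(genome, k-mer)
                p_genome = n / total      # P(genome)
                mi += p_joint * log2(p_joint / (p_genome * p_kmer))
    return mi
```

A lower value indicates more similar 4-mer usage, so a quantity like this could act as a coarse distance when no gene-level alignment is found. Real genomes would be read from FASTA records rather than passed as short literals.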


Journal Details

This article was published in the following journal.

Name: PloS one
ISSN: 1932-6203
Pages: e0220440


PubMed Articles [13853 Associated PubMed Articles listed on BioPortfolio]

Nontuberculous mycobacteria: Insights on taxonomy and evolution.

Seventy years have passed since Ernest H. Runyon presented a phenotypic classification approach for nontuberculous mycobacteria (NTM), primarily as a starting point in trying to understand their clini...

Phylogenetic inference in section Archerythroxylum informs taxonomy, biogeography, and the domestication of coca (Erythroxylum species).

This investigation establishes the first DNA-sequence-based phylogenetic hypothesis of species relationships in the coca family (Erythroxylaceae) and presents its implications for the intrageneric tax...

Minimal standards for the description of new genera and species of rhizobia and agrobacteria.

Herein the members of the Subcommittee on Taxonomy of Rhizobia and Agrobacteria of the International Committee on Systematics of Prokaryotes review recent developments in rhizobial and agrobacterial t...

Genomes from bacteria associated with the canine oral cavity: A test case for automated genome-based taxonomic assignment.

Taxonomy for bacterial isolates is commonly assigned via sequence analysis. However, the most common sequence-based approaches (e.g. 16S rRNA gene-based phylogeny or whole genome comparisons) are stil...

Viral genome integration of canine papillomavirus 16.

Papillomaviruses infect humans and animals, most often causing benign proliferations on skin or mucosal surfaces. Rarely, these infections persist and progress to cancer. In humans, this transformatio...

Clinical Trials [3595 Associated Clinical Trials listed on BioPortfolio]

Latent Viral Infection of Lymphoid Cells in Idiopathic Nephrotic Syndrome

The primary purpose of the study is to evaluate the association of a latent infection of lymphoid cells during the first manifestation of steroid sensitive nephrotic syndrome. The thirty ni...

Donor-Derived Viral Specific T-cells (VSTs) for Prophylaxis Against Viral Infections After Allogeneic Stem Cell Transplant

The purpose of this research study is to learn more about the use of viral specific T-lymphocytes (VSTs) to prevent viral infections that may happen after allogeneic stem cell transplant. ...

HIV Sequencing After Treatment Interruption to Identify the Clinically Relevant Anatomical Reservoir

The main goal of this study is to identify and characterise the anatomical component of the replication competent HIV-1 (Human Immunodeficiency Virus-1) reservoir. The investigators hypot...

Investigating the Feasibility and Implementation of Whole Genome Sequencing in Patients With Suspected Genetic Disorder

The study "Investigating the Feasibility and Implementation of Whole Genome Sequencing in Patients With Suspected Genetic Disorder" is a research study that aims to explore the use of whol...

Viral & Host Factors Associated With Hepatitis B Virus-related Hepatocellular Carcinoma

Adult liver cancer is the third leading cause of cancer deaths worldwide. The major risk factor for liver cancer is hepatitis B virus (HBV) infection. The purpose of the study is to sequen...

Medical and Biotech [MESH] Definitions

Viral proteins that are components of the mature assembled VIRUS PARTICLES. They may include nucleocapsid core proteins (gag proteins), enzymes packaged within the virus particle (pol proteins), and membrane components (env proteins). These do not include the proteins encoded in the VIRAL GENOME that are produced in infected cells but which are not packaged in the mature virus particle, i.e. the so-called non-structural proteins (VIRAL NONSTRUCTURAL PROTEINS).

Proteins encoded by a VIRAL GENOME that are produced in the organisms they infect, but not packaged into the VIRUS PARTICLES. Some of these proteins may play roles within the infected cell during VIRUS REPLICATION or act in regulation of virus replication or VIRUS ASSEMBLY.

The complete genetic complement contained in a DNA or RNA molecule in a virus.

Regulatory sequences important for viral replication that are located on each end of the HIV genome. The LTR includes the HIV ENHANCER, promoter, and other sequences. Specific regions in the LTR include the negative regulatory element (NRE), NF-kappa B binding sites, Sp1 binding sites, TATA BOX, and trans-acting responsive element (TAR). The binding of both cellular and viral proteins to these regions regulates HIV transcription.

Component of the NATIONAL INSTITUTES OF HEALTH. It conducts and supports research into the mapping of the human genome and other organism genomes. The National Center for Human Genome Research was established in 1989 and re-named the National Human Genome Research Institute in 1997.

Relevant Topics

Bioinformatics is the application of computer software and hardware to the management of biological data to create useful information. Computers are used to gather, store, analyze and integrate biological and genetic information which can then be applied...

Microbiology (from Greek μῑκρος, mīkros, "small"; βίος, bios, "life"; and -λογία, -logia) is the study of microscopic organisms, either unicellular (singl...
