We describe a new genome alignment-based model for understanding the diversity of viruses based on evolutionary genetic relationships. This approach uses information theory and a physical model to determine the information shared by the genes in two genomes. Pairwise comparisons of genes from the viruses are created from alignments using NCBI BLAST, and their match scores are combined to produce a metric between genomes, which is in turn used to determine a global classification using the 5,817 viruses on RefSeq. When no genes in two genomes align measurably, the method falls back to a coarser measure of genome relationship: the mutual information of 4-mer frequency. The result is a principled model that depends only on the genome sequence, captures many interesting relationships between viral families, and creates clusters that correlate well with both the Baltimore and ICTV classifications. Because the incremental computational cost of classifying a novel virus is low, newly discovered viruses can be quickly identified and classified. The model goes beyond alignment-free classifications by producing a full phylogeny similar to those constructed by virologists using qualitative features, while relying only on the genes themselves as objective input. These results bolster the case for mathematical models in microbiology that can characterize organisms using only their genetic material and provide an independent check on phylogenies constructed by humans, considerably faster and more cheaply than older approaches.
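The abstract does not spell out how the 4-mer fallback is computed, so the following is only a rough sketch of one possible reading, not the authors' method. It assumes the two genomes are given equal weight and measures the mutual information (in bits) between "which genome" and "which 4-mer", which is the same quantity as the Jensen-Shannon divergence of the two 4-mer frequency profiles. The function names `kmer_counts` and `label_kmer_mutual_information` are hypothetical and chosen for illustration.

```python
from collections import Counter
from math import log2


def kmer_counts(seq, k=4):
    """Count overlapping k-mers, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def label_kmer_mutual_information(seq_a, seq_b, k=4):
    """Mutual information (bits) between genome label and k-mer identity.

    ASSUMPTION: each genome's k-mer profile is normalized and the two
    genomes are weighted equally; this is an illustrative definition,
    not necessarily the one used in the paper. The value is 0 when the
    profiles are identical and 1 when they share no k-mers.
    """
    ca, cb = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
    na, nb = sum(ca.values()), sum(cb.values())
    if na == 0 or nb == 0:
        raise ValueError("each sequence needs at least one valid k-mer")
    mi = 0.0
    for kmer in set(ca) | set(cb):
        pa, pb = ca[kmer] / na, cb[kmer] / nb
        pm = 0.5 * (pa + pb)  # frequency of this k-mer in the equal mixture
        for p in (pa, pb):
            if p > 0:
                mi += 0.5 * p * log2(p / pm)
    return mi
```

Under this reading, a small value indicates similar 4-mer usage, so the quantity behaves like a coarse distance between genomes that can be computed even when no genes align.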
This article was published in the following journal.
Name: PloS one
Seventy years have passed since Ernest H. Runyon presented a phenotypic classification approach for nontuberculous mycobacteria (NTM), primarily as a starting point in trying to understand their clini...
This investigation establishes the first DNA-sequence-based phylogenetic hypothesis of species relationships in the coca family (Erythroxylaceae) and presents its implications for the intrageneric tax...
Herein the members of the Subcommittee on Taxonomy of Rhizobia and Agrobacteria of the International Committee on Systematics of Prokaryotes review recent developments in rhizobial and agrobacterial t...
Taxonomy for bacterial isolates is commonly assigned via sequence analysis. However, the most common sequence-based approaches (e.g. 16S rRNA gene-based phylogeny or whole genome comparisons) are stil...
Papillomaviruses infect humans and animals, most often causing benign proliferations on skin or mucosal surfaces. Rarely, these infections persist and progress to cancer. In humans, this transformatio...
The primary purpose of the study is to evaluate the association of a latent infection of lymphoid cells during the first manifestation of steroid sensitive nephrotic syndrome. The thirty ni...
The purpose of this research study is to learn more about the use of viral specific T-lymphocytes (VSTs) to prevent viral infections that may happen after allogeneic stem cell transplant. ...
The main goal of this study is to identify and characterise the anatomical component of the replication competent HIV-1 (Human Immunodeficiency Virus-1) reservoir. The investigators hypot...
The study "Investigating the Feasibility and Implementation of Whole Genome Sequencing in Patients With Suspected Genetic Disorder" is a research study that aims to explore the use of whol...
Adult liver cancer is the third leading cause of cancer deaths worldwide. The major risk factor for liver cancer is hepatitis B virus (HBV) infection. The purpose of the study is to sequen...
Viral proteins that are components of the mature assembled VIRUS PARTICLES. They may include nucleocapsid core proteins (gag proteins), enzymes packaged within the virus particle (pol proteins), and membrane components (env proteins). These do not include the proteins encoded in the VIRAL GENOME that are produced in infected cells but which are not packaged in the mature virus particle, i.e., the so-called non-structural proteins (VIRAL NONSTRUCTURAL PROTEINS).
Proteins encoded by a VIRAL GENOME that are produced in the organisms they infect, but not packaged into the VIRUS PARTICLES. Some of these proteins may play roles within the infected cell during VIRUS REPLICATION or act in regulation of virus replication or VIRUS ASSEMBLY.
The complete genetic complement contained in a DNA or RNA molecule in a virus.
Regulatory sequences important for viral replication that are located on each end of the HIV genome. The LTR includes the HIV ENHANCER, promoter, and other sequences. Specific regions in the LTR include the negative regulatory element (NRE), NF-kappa B binding sites, Sp1 binding sites, TATA BOX, and trans-acting responsive element (TAR). The binding of both cellular and viral proteins to these regions regulates HIV transcription.
Component of the NATIONAL INSTITUTES OF HEALTH. It conducts and supports research into the mapping of the human genome and other organism genomes. The National Center for Human Genome Research was established in 1989 and renamed the National Human Genome Research Institute in 1997.
Bioinformatics is the application of computer software and hardware to the management of biological data to create useful information. Computers are used to gather, store, analyze and integrate biological and genetic information which can then be applied...
Microbiology (from Greek μῑκρος, mīkros, "small"; βίος, bios, "life"; and -λογία, -logia) is the study of microscopic organisms, either unicellular (singl...