Sequence Alignment on Directed Graphs.

08:00 EDT 8th September 2018 | BioPortfolio

Summary of "Sequence Alignment on Directed Graphs."

Genomic variations in a reference collection are naturally represented as genome variation graphs. Such graphs encode common subsequences as vertices and capture the variations using additional vertices and directed edges. The resulting graphs are directed graphs, possibly with cycles. Existing algorithms for aligning sequences on such graphs make use of partial order alignment (POA) techniques that work on directed acyclic graphs (DAGs). To achieve this, acyclic extensions of the input graphs are first constructed through expensive loop unrolling steps (DAGification). Furthermore, such graph extensions can blow up considerably in size, and in the worst case the blow-up factor is proportional to the input sequence length. We provide a novel alignment algorithm, V-ALIGN, that aligns the input sequence directly on the input graph while avoiding such expensive DAGification steps. V-ALIGN is based on a novel dynamic programming (DP) formulation that allows gapped alignment directly on the input graph. It supports affine and linear gaps. We also propose refinements to V-ALIGN for better performance in practice. With the proposed refinements, the time to fill the DP table has linear dependence on the sizes of the sequence, the graph, and its feedback vertex set. We conducted experiments to compare the proposed algorithm against the existing POA-based techniques. We also performed alignment experiments on the genome variation graphs constructed from the 1000 Genomes data. For aligning short sequences, standard approaches restrict the expensive gapped alignment to small filtered subgraphs having high similarity to the input sequence. In such cases, the performance of V-ALIGN for gapped alignment on the filtered subgraph depends on the subgraph sizes.
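
To make the idea of filling a DP table directly on a cyclic graph more concrete, here is a minimal Python sketch. It is not V-ALIGN: the summary does not give the recurrence or the feedback-vertex-set refinement, so the sketch swaps in a simpler, well-known scheme that computes unit-cost edit distance (linear gaps only) and resolves the cyclic dependencies within each DP row by repeated relaxation. The function name `graph_edit_distance`, the single-character vertex labels, and the free choice of start and end vertex are all assumptions made for illustration.

```python
from typing import List, Tuple


def graph_edit_distance(seq: str, labels: str, edges: List[Tuple[int, int]]) -> int:
    """Unit-cost edit distance between seq and the best path in a directed graph.

    Illustrative assumptions (not taken from the summary above):
    vertex v carries the single character labels[v]; the path may start
    and end at any vertex; the graph may contain cycles.
    """
    n_vertices = len(labels)
    preds: List[List[int]] = [[] for _ in range(n_vertices)]
    for u, v in edges:
        preds[v].append(u)

    # Row 0: a path consisting of v alone, with v deleted, costs 1.
    prev = [1] * n_vertices

    for i in range(1, len(seq) + 1):
        c = seq[i - 1]
        row = []
        for v in range(n_vertices):
            # Either the path starts at v (the first i-1 characters are
            # insertions) or it continues from a predecessor of v.
            best_before = min([i - 1] + [prev[u] for u in preds[v]])
            aligned = best_before + (0 if c == labels[v] else 1)
            inserted = prev[v] + 1  # seq[i-1] aligned to no vertex
            row.append(min(aligned, inserted))

        # Deleting a vertex keeps us in the same row, so on a cyclic graph
        # the row depends on itself. Relax along edges until nothing
        # changes; values are non-negative integers that only decrease,
        # so the loop terminates.
        changed = True
        while changed:
            changed = False
            for u, v in edges:
                if row[u] + 1 < row[v]:
                    row[v] = row[u] + 1
                    changed = True
        prev = row

    # The empty path (every character inserted) is also a candidate.
    return min([len(seq)] + prev)


# Toy graph A -> C -> G -> T with a back edge G -> C (a cycle).
toy_labels = "ACGT"
toy_edges = [(0, 1), (1, 2), (2, 1), (2, 3)]
print(graph_edit_distance("ACGCGT", toy_labels, toy_edges))
```

In the toy graph the sequence ACGCGT follows the cycle once, so it is handled without unrolling the graph into a DAG. The repeated relaxation is the naive part of this sketch; according to the summary, V-ALIGN instead bounds the work using the graph's feedback vertex set.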


Journal Details

This article was published in the following journal.

Name: Journal of computational biology : a journal of computational molecular cell biology
ISSN: 1557-8666



PubMed Articles [5527 Associated PubMed Articles listed on BioPortfolio]

GfaViz: Flexible and interactive visualization of GFA sequence graphs.

The Graphical Fragment Assembly (GFA) formats are emerging standard formats for the representation of sequence graphs. While GFA 1 was primarily targeting assembly graphs, the newer GFA 2 format intro...

Variation graph toolkit improves read mapping by representing genetic variation in the reference.

Reference genomes guide our interpretation of DNA sequence data. However, conventional linear references represent only one version of each locus, ignoring variation in the population. Poor representa...

Aligning Optical Maps to De Bruijn Graphs.

Optical maps are high resolution restriction maps that give a unique numeric representation to a genome. Used in concert with sequence reads, they provide a useful tool for genome assembly and for dis...

Incorporating alignment uncertainty into Felsenstein's phylogenetic bootstrap to improve its reliability.

Most evolutionary analyses are based on pre-estimated multiple sequence alignment. Wong et al. established the existence of an uncertainty induced by multiple sequence alignment when reconstructing ph...

Motif-Aware PRALINE: Improving the alignment of motif regions.

Protein or DNA motifs are sequence regions which possess biological importance. These regions are often highly conserved among homologous sequences. The generation of multiple sequence alignments (MSA...

Clinical Trials [1534 Associated Clinical Trials listed on BioPortfolio]

Matched Pair Study - Kinematic vs Mechanical Alignment

The aim of this study is to evaluate postoperative knee function after total knee arthroplasty performed according to the anatomical alignment and compare these results to those of a match...

Comparison of In Vivo Alignment With TruMatch™ Personalized Solutions Compared to Conventional Instrumentation in Total Knee Replacement (TKA)

This investigation is intended to provide clinical information about alignment using TruMatch™ and to compare the results to a conventional total knee replacement. TruMatch™ will be c...

A Study to Investigate Drug-Drug Interaction Between D326, D337 and CKD-828 in Healthy Subjects

To evaluate pharmacokinetic properties and drug interactions between D326 and D337 co-administered groups, the CKD-828 alone and the total co-administered groups.

Traditional Versus Alternative Alignment in TKR

As many as 20% of patients are unhappy with the results of total knee replacement (TKR). Various changes to surgical technique have tried to address this but have not led to a significant ...

Hindfoot Alignment in Total Knee Replacement

When carrying out a knee replacement operation one of the goals is to correct any deformity of the leg (bowlegged or knock kneed). The ideal alignment is the mechanical axis, which is a li...

Medical and Biotech [MESH] Definitions

An isothermal in-vitro nucleotide amplification process. The process involves the concomitant action of a RNA-DIRECTED DNA POLYMERASE, a ribonuclease (RIBONUCLEASES), and DNA-DIRECTED RNA POLYMERASES to synthesize large quantities of sequence-specific RNA and DNA molecules.

The first nucleotide of a transcribed DNA sequence where RNA polymerase (DNA-DIRECTED RNA POLYMERASE) begins synthesizing the RNA transcript.

Information presented in graphic form, for example, graphs or diagrams.

The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms.

Graphs representing sets of measurable, non-covalent physical contacts with specific PROTEINS in living organisms or in cells.


Relevant Topic

Bioinformatics is the application of computer software and hardware to the management of biological data to create useful information. Computers are used to gather, store, analyze and integrate biological and genetic information which can then be applied...
