Profiling the genome-wide landscape of tandem repeat expansions.

08:00 EDT 13th June 2019 | BioPortfolio

Summary of "Profiling the genome-wide landscape of tandem repeat expansions."

Tandem repeat (TR) expansions have been implicated in dozens of genetic diseases, including Huntington's Disease, Fragile X Syndrome, and hereditary ataxias. Furthermore, TRs have recently been implicated in a range of complex traits, including gene expression and cancer risk. While the human genome harbors hundreds of thousands of TRs, analysis of TR expansions has been mainly limited to known pathogenic loci. A major challenge is that expanded repeats are beyond the read length of most next-generation sequencing (NGS) datasets and are not profiled by existing genome-wide tools. We present GangSTR, a novel algorithm for genome-wide genotyping of both short and expanded TRs. GangSTR extracts information from paired-end reads into a unified model to estimate maximum likelihood TR lengths. We validate GangSTR on real and simulated data and show that GangSTR outperforms alternative methods in both accuracy and speed. We apply GangSTR to a deeply sequenced trio to profile the landscape of TR expansions in a healthy family and validate novel expansions using orthogonal technologies. Our analysis reveals that healthy individuals harbor dozens of long TR alleles not captured by current genome-wide methods. GangSTR will likely enable discovery of novel disease-associated variants not currently accessible from NGS.
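The idea of estimating a maximum likelihood TR length from paired-end reads can be illustrated with a toy sketch. This is not GangSTR's actual model, which the summary does not detail. It assumes a single allele, normally distributed fragment lengths, and only read pairs that span the repeat, so that an expansion relative to the reference shortens the apparent mapped distance between mates. All function and parameter names are hypothetical.

```python
import math


def log_likelihood(n_copies, observed_spans, ref_copies, motif_len,
                   frag_mean, frag_sd):
    """Log-likelihood of a candidate copy number under a toy Gaussian model.

    Assumption: each observed mapped span equals the true fragment length
    minus the extra repeat sequence absent from the reference.
    """
    shift = (n_copies - ref_copies) * motif_len
    expected = frag_mean - shift
    norm = math.log(frag_sd * math.sqrt(2.0 * math.pi))
    total = 0.0
    for span in observed_spans:
        z = (span - expected) / frag_sd
        total += -0.5 * z * z - norm
    return total


def ml_copy_number(observed_spans, ref_copies, motif_len,
                   frag_mean, frag_sd, max_copies=200):
    """Grid search over integer copy numbers for the highest likelihood."""
    candidates = range(0, max_copies + 1)
    return max(
        candidates,
        key=lambda n: log_likelihood(n, observed_spans, ref_copies,
                                     motif_len, frag_mean, frag_sd),
    )


# Hypothetical data: a 3 bp motif, 10 copies in the reference,
# and mapped spans that are about 60 bp shorter than the library mean.
spans = [438, 442, 441, 439, 440]
estimate = ml_copy_number(spans, ref_copies=10, motif_len=3,
                          frag_mean=500.0, frag_sd=50.0)
print(estimate)
```

A grid search is used here only because the candidate space is small and integer valued. A realistic model would also need to handle two alleles and reads that do not span the repeat.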


Journal Details

This article was published in the following journal.

Name: Nucleic acids research
ISSN: 1362-4962



PubMed Articles [10557 Associated PubMed Articles listed on BioPortfolio]

Phenotypic variability and neuropsychological findings associated with C9orf72 repeat expansions in a Bulgarian dementia cohort.

The GGGGCC repeat expansion in the C9orf72 gene was recently identified as a major cause of amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) in several European populations. The o...

Repeat Interruptions Modify Age at Onset in Myotonic Dystrophy Type 1 by Stabilizing Expansions in Somatic Cells.

CTG expansions in gene, causing myotonic dystrophy type 1 (DM1), are characterized by pronounced somatic instability. A large proportion of variability of somatic instability is explained by expansio...

Glutaminase Deficiency Caused by Short Tandem Repeat Expansion in .

We report an inborn error of metabolism caused by an expansion of a GCA-repeat tract in the 5' untranslated region of the gene encoding glutaminase () that was identified through detailed clinical and...

A Pipeline to Assess Disease-Associated Haplotypes in Repeat Expansion Disorders: The Example of MJD/SCA3.

At least 40 human diseases are associated with repeat expansions; yet, the mutational origin and instability mechanisms remain unknown for most of them. Previously, genetic epidemiology and predisposi...

TaDa! Analysing cell type-specific chromatin in vivo with Targeted DamID.

The emergence of neuronal diversity during development of the nervous system relies on dynamic changes in the epigenetic landscape of neural stem cells and their progeny. Targeted DamID (TaDa) is prov...

Clinical Trials [2255 Associated Clinical Trials listed on BioPortfolio]

Comprehensive DNA Methylation Profiling in Crohn's Disease

Previous studies have indicated that abnormal DNA methylation frequently occurs in the mucosa in Crohn's disease. Comprehensive DNA methylation profiling of the inflamed and non-inflamed i...

Genome-Wide Association Study in Patients With Nontuberculous Mycobacterial Lung Disease

The aim of this study was to elucidate genetic susceptibility of patients with nontuberculous mycobacterial lung disease using genome-wide association study.

Fetal Genome Profiling Via Trophoblast Cells

The objective of this study is to utilize trophoblast cells accumulating in the endocervical canal at the beginning of pregnancy for non-invasive prenatal testing. If we are able to valida...

Tumor Landscape Pathological Diagnosis by Large Tissue Sections

The aim of this study is to establish large tissue sections for 10 kinds of tumors in order to observe the tumor landscape under the microscope. The tumors include esophageal carcinoma, gastric...

Next Generation pErsonalized tX(Therapy) With mulTi-omics and Preclinical Model

The next generation of personalized medical treatment, tailored to an individual's genetic information, is evolving rapidly. Genome analysis needs systematic infrastructure and database ba...

Medical and Biotech [MESH] Definitions

Tandem arrays of moderately repetitive (5-50 repeats) short (10-60 bases) DNA sequences found dispersed throughout the genome and clustered near telomeres. Their degree of repetition is two to several hundred at each locus. Loci number in the thousands but each locus shows a distinctive repeat unit. Minisatellite repeats are often called variable number of tandem repeats.

Copies of DNA sequences which lie adjacent to each other in the same orientation (direct tandem repeats) or in the opposite direction to each other (INVERTED TANDEM REPEATS).

Sequences of DNA or RNA that occur in multiple copies. There are several types: INTERSPERSED REPETITIVE SEQUENCES are copies of transposable elements (DNA TRANSPOSABLE ELEMENTS or RETROELEMENTS) dispersed throughout the genome. TERMINAL REPEAT SEQUENCES flank both ends of another sequence, for example, the long terminal repeats (LTRs) on RETROVIRUSES. Variations may be direct repeats, those occurring in the same direction, or inverted repeats, those opposite to each other in direction. TANDEM REPEAT SEQUENCES are copies which lie adjacent to each other, direct or inverted (INVERTED REPEAT SEQUENCES).

Membrane glycosylphosphatidylinositol-anchored glycoproteins that may aggregate into rod-like structures. The prion protein (PRNP) gene is characterized by five TANDEM REPEAT SEQUENCES that encode a highly unstable protein region of five octapeptide repeats. Mutations in the repeat region and elsewhere in this gene are associated with CREUTZFELDT-JAKOB DISEASE; FATAL FAMILIAL INSOMNIA; GERSTMANN-STRAUSSLER DISEASE; Huntington disease-like 1, and KURU.

Copies of nucleic acid sequence that are arranged in opposing orientation. They may lie adjacent to each other (tandem) or be separated by some sequence that is not part of the repeat (hyphenated). They may be true palindromic repeats, i.e. read the same backwards as forwards, or complementary, reading as the base complement in the opposite orientation. Complementary inverted repeats have the potential to form hairpin-loop or stem-loop structures, which result in cruciform structures (such as CRUCIFORM DNA) when the complementary inverted repeats occur in double-stranded regions.
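The distinction drawn in the definitions above between direct tandem repeats and complementary inverted repeats can be made concrete with a short sketch. The helper names are hypothetical and the sequences are illustrative only; the code assumes plain uppercase or lowercase A, C, G, T input.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def longest_tandem_run(seq, motif):
    """Return the largest number of back-to-back copies of motif in seq."""
    if not motif:
        raise ValueError("motif must be non-empty")
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    best = 0
    # Try every start position so overlapping candidate runs are not missed.
    for start in range(len(seq) - k + 1):
        run = 0
        pos = start
        while seq.startswith(motif, pos):
            run += 1
            pos += k
        best = max(best, run)
    return best


def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_complementary_inverted_repeat(seq):
    """True when seq reads as its own base complement in the opposite orientation."""
    return seq.upper() == reverse_complement(seq)


print(longest_tandem_run("TTCAGCAGCAGTT", "CAG"))
print(is_complementary_inverted_repeat("GAATTC"))
print(is_complementary_inverted_repeat("CAGCAG"))
```

The first call counts direct tandem copies of a motif, while the other two check whether a sequence matches its own reverse complement, which is the property that lets complementary inverted repeats pair with themselves.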


Relevant Topics

Bioinformatics
Bioinformatics is the application of computer software and hardware to the management of biological data to create useful information. Computers are used to gather, store, analyze and integrate biological and genetic information which can then be applied...

Huntington's Disease
Huntington's disease is a hereditary disease caused by a defect in a single gene on Chromosome 4 that is inherited in an autosomal dominant fashion. The defect causes a part of DNA, called a CAG repeat, to occur many more times than it is supposed to...

Gene Expression
The process of gene expression is used by eukaryotes, prokaryotes, and viruses to generate the macromolecular machinery for life. Steps in the gene expression process may be modulated, including the transcription, RNA splicing, translation, and post-tran...
