A promoter is a fundamental DNA element located around the transcription start site (TSS) that regulates gene transcription. Promoter recognition is important for determining transcription units, studying gene structure, analyzing gene regulation mechanisms, and annotating gene function. Many models have been proposed to predict promoters, but their performance still needs to be improved. In this work, we combined pseudo k-tuple nucleotide composition (PseKNC) with a position-correlation scoring function (PCSF) to represent promoter sequences of Homo sapiens (H. sapiens), Drosophila melanogaster (D. melanogaster), Caenorhabditis elegans (C. elegans), Bacillus subtilis (B. subtilis), and Escherichia coli (E. coli). The minimum redundancy maximum relevance (mRMR) algorithm and an incremental feature selection strategy were then used to find optimal feature subsets. A support vector machine (SVM) was used to distinguish promoters from non-promoters. In the 10-fold cross-validation test, accuracies of 93.3%, 93.9%, 95.7%, 95.2%, and 93.1% were obtained for H. sapiens, D. melanogaster, C. elegans, B. subtilis, and E. coli, with areas under the receiver operating characteristic curve (AUCs) of 0.974, 0.975, 0.981, 0.988, and 0.976, respectively. Comparative results demonstrated that our method outperforms existing methods for identifying promoters. A freely accessible online web server was established (http://lin-group.cn/server/iProEP/).
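To make the ingredients of this pipeline more concrete, the following is a minimal Python sketch of three of its building blocks: a plain k-tuple nucleotide composition, a 10-fold split, and an AUC calculation. It is an illustration under stated assumptions, not the authors' implementation. It omits the pseudo (physicochemical) components of PseKNC, the PCSF features, mRMR feature selection, and the SVM itself. The function names `kmer_composition`, `k_fold_indices`, and `auc` are hypothetical and do not come from the paper or the iProEP server.

```python
from itertools import product


def kmer_composition(seq, k=2):
    """Normalized frequencies of all 4**k k-tuples, in lexicographic order.

    Simplification (assumption): this is plain k-tuple composition only,
    without the pseudo components that PseKNC adds.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def k_fold_indices(n_samples, n_folds=10):
    """Yield (train, test) index lists; sample i is assigned to fold i % n_folds."""
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, test


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a fuller pipeline, each sequence would be converted to a feature vector, a classifier would be trained on the `train` indices of each fold, and accuracy and AUC would be computed from the predictions on the held-out `test` indices. Shuffling samples before the split would normally be advisable; it is left out here to keep the sketch deterministic.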
This article was published in the following journal.
Name: Molecular therapy. Nucleic acids
Computational identification of promoters is notoriously difficult as human genes often have unique promoter sequences that provide regulation of transcription and interaction with transcription initi...
Accurate identification of intrinsically disordered proteins/regions (IDPs/IDRs) is critical for predicting protein structure and function. Previous studies have shown that IDRs of different lengths h...
The promoter methylation status of the O6-methylguanine-DNA methyltransferase (MGMT) gene has been described as the most important predictor of chemotherapeutic response and patients' survival in gliob...
Specificity is one of the most important and complex properties that is central to both natural antibody function and therapeutic antibody efficacy. However, it has proven extremely challenging to def...
It is widely acknowledged that the predictive performance of clinical prediction models should be studied in patients that were not part of the data in which the model was derived. Out-of-sample perfo...
Computational simulation will be performed to represent motion of knees with a dislocating kneecap. Common surgical treatment methods will be simulated and anatomical parameters commonly a...
CENTRIC is a Phase III clinical trial assessing efficacy and safety of the investigational integrin inhibitor, cilengitide, in combination with standard treatment versus standard treatment...
A two-part molecular epidemiological study will be conducted to comprehensively assess the association between miR expression and miR promoter methylation and the response to therapy and p...
The aim of this study is to explore possible predicting factors associated with physical activity (PA) level change in a 6-month period of physical activity on prescription (PAP) treatment...
CORE is a Phase II clinical trial in newly diagnosed glioblastoma multiforme (GBM) in patients with an unmethylated promoter of the methylguanine-DNA methyltransferase (MGMT) gene in the t...
DNA sequences which are recognized (directly or indirectly) and bound by a DNA-dependent RNA polymerase during the initiation of transcription. Highly conserved sequences within the promoter include the Pribnow box in bacteria and the TATA BOX in eukaryotes.
Genes whose expression is easily detectable and therefore used to study promoter activity at many positions in a target genome. In recombinant DNA technology, these genes may be attached to a promoter region of interest.
Promoter-specific RNA polymerase II transcription factor that binds to the GC box, one of the upstream promoter elements, in mammalian cells. The binding of Sp1 is necessary for the initiation of transcription in the promoters of a variety of cellular and viral GENES.
Models connecting initiating events at the cellular and molecular level to population-wide impacts. Computational models may be at levels relating toxicology to adverse effects.
Comparison of the BLOOD PRESSURE between the BRACHIAL ARTERY and the POSTERIOR TIBIAL ARTERY. It is a predictor of PERIPHERAL ARTERIAL DISEASE.
Bioinformatics is the application of computer software and hardware to the management of biological data to create useful information. Computers are used to gather, store, analyze and integrate biological and genetic information which can then be applied...