Exploration and Evaluation of Machine Learning-Based Models for Predicting Enzymatic Reactions.

07:00 EST 13th February 2020 | BioPortfolio

Summary of "Exploration and Evaluation of Machine Learning-Based Models for Predicting Enzymatic Reactions."

The number of un-annotated gene sequences in databases is growing as sequencing technology advances, so computational methods for predicting the functions of un-annotated genes are needed. The search for novel enzymes for metabolic engineering applications further encourages the annotation of sequences. Here, enzyme functions are predicted using two general approaches, each including several machine learning algorithms. First, Enzyme-models (E-models) predict Enzyme Commission (EC) numbers from amino acid sequence information. Second, Substrate-Enzyme models (SE-models) are built to predict the substrates of enzymatic reactions together with EC numbers, and Substrate-Enzyme-Product models (SEP-models) are built to predict substrates, products and EC numbers. While the accuracy of E-models is not optimal, SE-models and SEP-models predict EC numbers and reactions with high accuracy using all tested machine learning-based methods. For example, a single Random Forests-based SEP-model predicts EC first digits with an average AUC score of over 0.94. Various metrics indicate that the strategy of combining sequence and chemical structure information is effective at improving enzyme reaction prediction.
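The idea of combining sequence and chemical structure information, and of scoring a multi-class EC first-digit prediction by an average AUC, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: amino acid composition stands in for the sequence features, short bit lists stand in for substrate and product fingerprints, the function names `aa_composition`, `sep_features`, `binary_auc` and `average_auc` are hypothetical, and the average AUC is computed as an unweighted mean of one-vs-rest AUCs, which may differ from the averaging the authors used.

```python
# Hypothetical sketch: feature construction and average AUC for an SEP-style model.
# The featurization and the averaging scheme are assumptions, not the article's method.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def aa_composition(seq):
    """Return the fraction of each standard amino acid in a protein sequence."""
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    total = len(residues) or 1  # avoid division by zero for empty input
    return [residues.count(aa) / total for aa in AMINO_ACIDS]


def sep_features(seq, substrate_bits, product_bits):
    """Concatenate sequence features with substrate and product bit vectors."""
    return aa_composition(seq) + list(substrate_bits) + list(product_bits)


def binary_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute an AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_auc(true_classes, class_scores):
    """Unweighted mean of one-vs-rest AUCs, one per class in class_scores."""
    aucs = []
    for cls, scores in class_scores.items():
        labels = [1 if t == cls else 0 for t in true_classes]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    # Toy inputs only; real fingerprints and classifier scores would come from
    # a cheminformatics toolkit and a trained model such as a random forest.
    features = sep_features("AAC", [1, 0], [0, 1])
    print(len(features))
    true_ec_first_digit = [1, 1, 2, 2]
    scores = {1: [0.9, 0.4, 0.5, 0.1], 2: [0.1, 0.2, 0.8, 0.7]}
    print(average_auc(true_ec_first_digit, scores))
```

In this sketch an E-style feature vector would be the output of `aa_composition` by itself, while the SE- and SEP-style vectors append chemical structure bits to it, which is the kind of combination the summary credits with improving prediction.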


Journal Details

This article was published in the following journal.

Name: Journal of chemical information and modeling
ISSN: 1549-960X



PubMed Articles [29973 Associated PubMed Articles listed on BioPortfolio]

Machine learning in predicting graft failure following kidney transplantation: A systematic review of published predictive models.

Machine learning has been increasingly used to develop predictive models to diagnose different disease conditions. The heterogeneity of the kidney transplant population makes predicting graft outcomes...

The exploration of feature extraction and machine learning for predicting bone density from simple spine X-ray images in a Korean population.

Osteoporosis is hard to detect before it manifests symptoms and complications. In this study, we evaluated machine learning models for identifying individuals with abnormal bone mineral density (BMD) ...

Blending Machine Learning and Interaction Design in Audio Explorer.

The results of machine learning models can often be difficult to interpret, especially for domain experts. Audio Explorer, the winning entry of the 2018 VAST Challenge, is an interactive data explorat...

Development and validation of machine learning models to predict gastrointestinal leak and venous thromboembolism after weight loss surgery: an analysis of the MBSAQIP database.

Postoperative gastrointestinal leak and venous thromboembolism (VTE) are devastating complications of bariatric surgery. The performance of currently available predictive models for these complication...

Machine Learning Models for Accurate Prediction of Kinase Inhibitors with Different Binding Modes.

Noncovalent inhibitors of protein kinases have different modes of action. They bind to the active or inactive form of kinases, compete with ATP, stabilize inactive kinase conformations, or act through...

Clinical Trials [10182 Associated Clinical Trials listed on BioPortfolio]

Machine Learning From Fetal Flow Waveforms to Predict Adverse Perinatal Outcomes

The aim of this study is to get a proof of concept for using a computational model of fetal haemodynamics, combined with machine learning based on Doppler patterns of the fetal cardiovascu...

Identification of Patients Admitted With COPD Exacerbations and Predicting Readmission Risk Using Machine Learning

Patients with Chronic Obstructive Pulmonary Disease (COPD) who are admitted to hospital are at high risk of readmission. While therapies have improved and there are evidence-based guidelin...

Personalizing Mediterranean Diet in Children.

Investigating glucose response to Mediterranean and regular diets in healthy children in order to develop specific pediatric machine-learning for predicting the personalized glucose respon...

Prediction of Kidney Injury After Hyperthermic Intraperitoneal Chemotherapy (HIPEC)* With Machine Learning

Patients undergoing cytoreductive surgery with hyperthermic intraoperative chemotherapy (CRS with HIPEC) are prone to postoperative kidney dysfunction. Previous models predicting kidney in...

Physiological Validation of Current Machine Learning Models for Hemodynamic Instability in Humans

This study will be collecting data on participants undergoing lower body negative pressure (LBNP) to simulate progressive blood loss. The goal of the study is to collect data to allow for ...

Medical and Biotech [MESH] Definitions

Supervised Machine Learning
A MACHINE LEARNING paradigm used to make predictions about future instances based on a given set of labeled paired input-output training (sample) data.

Unsupervised Machine Learning
A MACHINE LEARNING paradigm used to make predictions about future instances based on a given set of unlabeled paired input-output training (sample) data.

Support Vector Machine
SUPERVISED MACHINE LEARNING algorithm which learns to assign labels to objects from a set of training examples. Examples are learning to recognize fraudulent credit card activity by examining hundreds or thousands of fraudulent and non-fraudulent credit card activity, or learning to make disease diagnosis or prognosis based on automatic classification of microarray gene expression profiles drawn from hundreds or thousands of samples.

Probability Learning
Usually refers to the use of mathematical models in the prediction of learning to perform tasks based on the theory of probability applied to responses; it may also refer to the frequency of occurrence of the responses observed in the particular study.

Machine Learning
A type of ARTIFICIAL INTELLIGENCE that enables COMPUTERS to independently initiate and execute LEARNING when exposed to new data.

Relevant Topics

Bioinformatics
Bioinformatics is the application of computer software and hardware to the management of biological data to create useful information. Computers are used to gather, store, analyze and integrate biological and genetic information which can then be applied...

DNA sequencing
DNA sequencing is the process of determining the precise order of nucleotides within a DNA molecule. During DNA sequencing, the bases of a small fragment of DNA are sequentially identified from signals emitted as each fragment is re-synthesized from a ...

Enzymes
Enzymes are proteins that catalyze (i.e., increase the rates of) chemical reactions. In enzymatic reactions, the molecules at the beginning of the process, called substrates, are converted into different molecules, called products. Almost all chemical re...
