Development of new methods needs proper evaluation - benchmarking sets for machine learning experiments for class A GPCRs.

08:00 EDT 11th October 2019 | BioPortfolio

Summary of "Development of new methods needs proper evaluation - benchmarking sets for machine learning experiments for class A GPCRs."

New computational approaches for virtual screening applications are constantly being developed. However, before a particular tool is used to search for new active compounds, its effectiveness in that type of task must be examined. In this study, we conducted a detailed analysis of various aspects of preparing the datasets used for such an evaluation. We propose a protocol for fetching data from the ChEMBL database, examine various compound representations in terms of the possible bias resulting from the way they are generated, and define a new metric for comparing the structural similarity of compounds, which is in line with chemical intuition. The newly developed metric is also used to evaluate various approaches for dividing a dataset into training and test sets, which are themselves examined in detail as a possible source of bias in the results. Finally, machine learning methods are applied in cross-validation studies of the datasets constructed within the paper, which constitute benchmarks for the assessment of computational methods developed for virtual screening tasks. Additionally, analogous datasets for class A G protein-coupled receptors (the 100 targets with the highest number of records) were prepared. They are available at, together with a script enabling reproduction of all results available at
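The abstract gives no details of the new similarity metric or of the splitting approaches that were compared, so the following is only a minimal sketch of the general idea that the way a dataset is divided can bias an evaluation. It assumes compounds are represented as sets of "on" fingerprint bits and uses the standard Tanimoto coefficient as a stand-in for the paper's metric. The function names, the toy fingerprints, and the greedy clustering are all hypothetical and are not taken from the paper or its scripts.

```python
import random

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two sets of 'on' fingerprint bits.
    Stand-in only: the paper defines its own metric, not reproduced here."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def nearest_train_similarity(train, test):
    """For each test compound, similarity to its closest training compound."""
    return [max(tanimoto(t, tr) for tr in train) for t in test]

def random_split(items, test_fraction=0.25, seed=0):
    """Shuffle and cut: close analogues can land on both sides of the split."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def cluster_split(items, threshold=0.5, test_fraction=0.25):
    """Greedy leader clustering; whole clusters are assigned to one side."""
    clusters = []
    for fp in items:
        for cluster in clusters:
            if tanimoto(fp, cluster[0]) >= threshold:
                cluster.append(fp)
                break
        else:
            clusters.append([fp])
    n_test = max(1, int(len(items) * test_fraction))
    train, test = [], []
    for cluster in sorted(clusters, key=len):  # smallest clusters fill the test set
        if len(test) < n_test:
            test.extend(cluster)
        else:
            train.extend(cluster)
    return train, test

# Toy data (hypothetical): two structural families that share no bits.
family_a = [frozenset(range(1, 9)) | {100 + i} for i in range(6)]
family_b = [frozenset(range(20, 28)) | {200 + i} for i in range(2)]
items = family_a + family_b

train_r, test_r = random_split(items)
train_c, test_c = cluster_split(items)

print("random split :", nearest_train_similarity(train_r, test_r))
print("cluster split:", nearest_train_similarity(train_c, test_c))
```

In this toy set the cluster-based split holds out the whole second family, so no test compound has a close analogue in the training set, whereas a random split can leave near-duplicates on both sides and make a model look better than it is.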


Journal Details

This article was published in the following journal.

Name: Journal of chemical information and modeling
ISSN: 1549-960X

