Binding prediction between targets and drug-like compounds through Deep Neural Networks has generated promising results in recent years, outperforming traditional machine learning-based methods. However, the generalization capability of these classification models is still an issue to be addressed. In this work, we explored how different cross-validation strategies applied to data from different molecular databases affect the performance of proteochemometrics binding prediction models. These strategies are: (1) random splitting, (2) splitting based on K-means clustering (of both actives and inactives), (3) splitting based on source database and (4) splitting based on both the clustering and the source database. These schemas are applied to a Deep Learning proteochemometrics model and to a simple logistic regression model used as a baseline. Additionally, two different ways of describing molecules in the model are tested: (1) by their SMILES and (2) by three fingerprints. The classification performance of our Deep Learning-based proteochemometrics model is comparable to the state of the art. Our results show that the lack of generalization of these models is due to a bias in public molecular databases and that a restrictive cross-validation schema based on compound clustering leads to worse but more robust and credible results. Our results also show better performance when representing molecules by their fingerprints.
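The difference between random splitting and the grouped splitting strategies can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy labels and the greedy balancing heuristic are all assumptions made for illustration, and the cluster labels are taken as already computed (for example by K-means) rather than derived here. Using the (cluster, source) pair as a group key is only one possible reading of strategy (4).

```python
import random
from collections import defaultdict


def random_folds(n_samples, n_folds, seed=0):
    """Strategy 1: shuffle sample indices and deal them into folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [[] for _ in range(n_folds)]
    for position, index in enumerate(indices):
        folds[position % n_folds].append(index)
    return folds


def grouped_folds(groups, n_folds):
    """Strategies 2-4: samples sharing a group label never span two folds.

    Groups are placed largest-first into the currently smallest fold.
    This greedy balancing is an assumption of the sketch, not something
    stated in the abstract.
    """
    members = defaultdict(list)
    for index, group in enumerate(groups):
        members[group].append(index)
    folds = [[] for _ in range(n_folds)]
    ordered = sorted(members, key=lambda g: (-len(members[g]), str(g)))
    for group in ordered:
        min(folds, key=len).extend(members[group])
    return folds


# Hypothetical toy data: one entry per compound-target pair.
clusters = [0, 0, 1, 1, 2, 2, 3, 3]                 # precomputed cluster ids
sources = ["A", "A", "A", "B", "B", "B", "C", "C"]  # made-up database names

rand = random_folds(len(clusters), n_folds=2)       # strategy 1
by_cluster = grouped_folds(clusters, n_folds=2)     # strategy 2
by_source = grouped_folds(sources, n_folds=3)       # strategy 3
# One possible reading of strategy 4: group on the (cluster, source) pair.
by_both = grouped_folds(list(zip(clusters, sources)), n_folds=2)

for name, folds in [("random", rand), ("cluster", by_cluster),
                    ("source", by_source), ("both", by_both)]:
    print(name, folds)
```

With grouped folds, a model is always evaluated on clusters or databases it has not seen during training, which is why such schemas tend to give lower but more credible scores than a random split.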
This article was published in the following journal.
Name: Journal of Chemical Information and Modeling