Sign-constrained linear regression for prediction of microbe concentration based on water quality datasets.

08:00 EDT 1st June 2019 | BioPortfolio

Summary of "Sign-constrained linear regression for prediction of microbe concentration based on water quality datasets."

This study presents a novel methodology for estimating the concentration of environmental pollutants in water, such as pathogens, from environmental parameters. Its scientific novelty lies in preventing overfitting (excess conformity of the model to the data) by applying domain knowledge, that is, the accumulated scientific knowledge about the correlations between response and explanatory variables. Sign constraints were used to express this domain knowledge, and their effect on prediction performance with censored datasets was investigated. The results confirmed that sign constraints made predictions more accurate than conventional sign-free approaches. The most notable technical contribution of this study is the finding that sign constraints can be incorporated into the estimation of the correlation coefficient in Tobit analysis. We developed effective and numerically stable algorithms for fitting a model to datasets under the sign constraints. These algorithms are applicable to a wide range of pollutant contamination level predictions, including pathogen concentrations in water.
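The idea of a sign constraint can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses ordinary least squares with projected gradient descent and does not handle censored observations or Tobit likelihoods. The function name `fit_sign_constrained`, the synthetic data, and the step size are assumptions made for illustration only.

```python
import random


def fit_sign_constrained(X, y, signs, steps=2000, lr=0.2):
    """Least squares with sign constraints via projected gradient descent.

    signs[j] = +1 forces coefficient j >= 0, -1 forces it <= 0,
    and 0 leaves it unconstrained. The intercept is always free.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(steps):
        # Residuals of the current linear model.
        resid = [b + sum(wj * xj for wj, xj in zip(w, row)) - yi
                 for row, yi in zip(X, y)]
        # Gradient of the mean squared error (up to a factor of 2).
        grad_b = sum(resid) / n
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n
                  for j in range(p)]
        b -= lr * grad_b
        for j in range(p):
            wj = w[j] - lr * grad_w[j]
            # Projection step: clip the coefficient back to its allowed sign.
            if signs[j] > 0:
                wj = max(wj, 0.0)
            elif signs[j] < 0:
                wj = min(wj, 0.0)
            w[j] = wj
    return b, w


# Synthetic data (hypothetical): the second predictor has a negative
# effect in the sample, while the assumed domain knowledge says its
# coefficient must not be negative.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(100)]
y = [1.0 + 2.0 * x1 - 0.5 * x2 + random.gauss(0.0, 0.1) for x1, x2 in X]

b, w = fit_sign_constrained(X, y, signs=[+1, +1])
free_b, free_w = fit_sign_constrained(X, y, signs=[0, 0])

print("sign-constrained coefficients:", w)
print("sign-free coefficients:", free_w)
```

In the constrained fit, the coefficient that would otherwise turn negative is held at zero, which is how a sign constraint keeps the model from following a pattern in the data that contradicts prior knowledge.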


Journal Details

This article was published in the following journal.

Name: Journal of water and health
ISSN: 1477-8920
Pages: 404-415


PubMed Articles [25471 Associated PubMed Articles listed on BioPortfolio]

Drug sensitivity prediction with high-dimensional mixture regression.

This paper proposes a mixture regression model-based method for drug sensitivity prediction. The proposed method explicitly addresses two fundamental issues in drug sensitivity prediction, namely, pop...

Exploring the use of machine learning for risk adjustment: A comparison of standard and penalized linear regression models in predicting health care costs in older adults.

Payers and providers still primarily use ordinary least squares (OLS) to estimate expected economic and clinical outcomes for risk adjustment purposes. Penalized linear regression represents a practic...

Survival outcome prediction in cervical cancer: Cox models versus deep-learning model.

Historically, the Cox proportional hazard regression (CPH) model has been the mainstay for survival analyses in oncologic research. The CPH model is generally utilized based upon an assumption of line...

Predictive performance of regression models to estimate Chlorophyll-a concentration based on Landsat imagery.

Chlorophyll-a (Chl-a) concentration is a key parameter to describe water quality in marine and freshwater environments. Nowadays, several products with Chl-a have derived from satellite imagery, but t...

On the characterization of novel biologically active steroids: Selection of lipophilicity models of newly synthesized steroidal derivatives by classical and non-parametric ranking approaches.

In this paper, the guidelines for the interpretation of the results of quantitative structure-retention relationship (QSRR) modeling, comparison and assessment of the established models, as well as th...

Clinical Trials [6194 Associated Clinical Trials listed on BioPortfolio]

Primary Constrained Condylar Knee Arthroplasty Without Stem Extensions: Prevalence and Risk Factors

While performing a primary TKA in consecutive patients, a constrained insert may be necessary when adequate stability and soft tissue balance are not obtained. In this retrospective study,...

Validity of Aortic Pulse Wave Velocity in Predicting the 6- Minute Walking Test Before Major Non-cardiac Surgery

Methods: Prospective observational study in adult patients requiring preoperative evaluation Objectives: To determine the correlation between the aortic pulse wave velocity (AoPWV) and th...

Probiotic Effects on the Microbe-brain-gut Interaction and Brain Activity During Stress Tasks in Healthy Subjects

The aim of this study is to determine if and how the "Probiotic Product" affects functional brain responses in healthy subjects during an emotional- and arithmetic stress task, respectivel...

Free Text Prediction Algorithm for Appendicitis

Computer-aided diagnostic software has been used to assist physicians in various ways. Text-based prediction algorithms have been trained on past medical records through data mining and fe...

BRUSH Sign: Radiolographic Marker of Cerebral Infarctus Prognosis

Today the treatment of ischemic stroke in acute phase is based on medicinal or endovascular revascularization. Cerebral MRI sequences help the diagnostic. This procedure uses deoxyhemoglob...

Medical and Biotech [MESH] Definitions

Procedures for finding the mathematical function which best describes the relationship between a dependent variable and one or more independent variables. In linear regression (see LINEAR MODELS) the relationship is constrained to be a straight line and LEAST-SQUARES ANALYSIS is used to determine the best fit. In logistic regression (see LOGISTIC MODELS) the dependent variable is qualitative rather than continuously variable and LIKELIHOOD FUNCTIONS are used to find the best relationship. In multiple regression, the dependent variable is considered to depend on more than a single independent variable.

Statistical models in which the value of a parameter for a given value of a factor is assumed to be equal to a + bx, where a and b are constants. The models predict a linear regression.

A method where a culturing surface inoculated with microbe is exposed to small disks containing known amounts of a chemical agent resulting in a zone of inhibition (usually in millimeters) of growth of the microbe corresponding to the susceptibility of the strain to the agent.

A prediction of the probable outcome of a disease based on an individual's condition and the usual course of the disease as seen in similar situations.

The statistical manipulation of hierarchically and non-hierarchically nested data. It includes clustered data, such as a sample of subjects within a group of schools. Prevalent in the social, behavioral, and biomedical sciences, both linear and nonlinear regression models are applied.

