Equivalence between dropout and data augmentation: A mathematical check.

08:00 EDT 27th March 2019 | BioPortfolio

Summary of "Equivalence between dropout and data augmentation: A mathematical check."

The great achievements of deep learning can be attributed to its tremendous power of feature representation, which comes from nonlinear activation functions and a large number of network nodes. However, deep neural networks suffer from serious issues such as slow convergence, and dropout is an outstanding method for improving a network's generalization ability and test performance. Many explanations have been given for why dropout works so well; among them, the equivalence between dropout and data augmentation is a recently proposed and stimulating one. In this article, we discuss the exact conditions under which this equivalence holds. Our main result guarantees that the equivalence almost surely holds if the dimension of the input space is equal to or higher than that of the output space. Furthermore, if the commonly used rectified linear unit (ReLU) activation function is replaced by some newly proposed activation function whose values lie in R, our results extend to multilayer neural networks. For comparison, some counterexamples are given for the inequivalent case. Finally, a series of experiments on the MNIST dataset is conducted to illustrate and help understand the theoretical results.
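
The dimension condition in the summary can be illustrated with a small numerical sketch. It is not the paper's construction. It assumes a single ReLU layer with a random weight matrix `W` and bias `b`, reads "output space" as the layer's output (hidden) dimension, uses dropout without rescaling, and introduces the helper name `augmented_input` purely for illustration. When the input dimension is at least the hidden dimension, a random `W` has full row rank almost surely, so a perturbed input can be solved for that reproduces the dropped-out activations.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def augmented_input(W, b, x, mask):
    """Return x_star with relu(W @ x_star + b) == mask * relu(W @ x + b).

    Hypothetical helper for illustration. Assumes W has full row rank,
    which requires input dimension >= hidden dimension.
    """
    z = W @ x + b
    # Kept units keep their pre-activation; dropped units are moved to 0,
    # which ReLU maps to 0.
    z_target = mask * z
    # Minimum-norm input correction satisfying W @ delta = z_target - z.
    delta = np.linalg.pinv(W) @ (z_target - z)
    return x + delta


rng = np.random.default_rng(0)
d_in, d_hidden = 8, 5  # assumed sizes with d_in >= d_hidden
W = rng.standard_normal((d_hidden, d_in))
b = rng.standard_normal(d_hidden)
x = rng.standard_normal(d_in)
mask = rng.integers(0, 2, size=d_hidden).astype(float)

dropped = mask * relu(W @ x + b)     # dropout applied to the activations
x_star = augmented_input(W, b, x, mask)
recovered = relu(W @ x_star + b)     # same layer, modified input, no dropout
print(np.allclose(dropped, recovered))
```

Under the full-row-rank assumption the two activation vectors should agree up to floating-point error. If `d_in` is smaller than `d_hidden`, the linear system is overdetermined and an exact solution need not exist, which is the setting the summary's counterexamples refer to.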


Journal Details

This article was published in the following journal.

Name: Neural networks : the official journal of the International Neural Network Society
ISSN: 1879-2782
Pages: 82-89


PubMed Articles [11125 Associated PubMed Articles listed on BioPortfolio]

Dynamic estimation in the extended marginal Rasch model with an application to mathematical computer-adaptive practice.

We introduce a general response model that allows for several simple restrictions, resulting in other models such as the extended Rasch model. For the extended Rasch model, a dynamic Bayesian estimati...

Equivalence limit scaled differences for untargeted safety assessments: Comparative analyses to guard against unintended effects on the environment or human health of genetically modified maize.

Safety assessments guard against unintended effects for human health and the environment. When new products are compared with accepted reference products by broad arrays of measurements, statistical a...

Identification of dropout predictors to a community-based physical activity programme that uses motivational interviewing.

Participant dropout reduces intervention effectiveness. Predicting dropout has been investigated for Exercise Referral Schemes, but not physical activity (PA) interventions with Motivational Interview...

A Kantian account of mathematical modelling and the rationality of scientific theory change: The role of the equivalence principle in the development of general relativity.

Integrating Clinical and Epidemiological Data on Allergic Diseases Across Birth Cohorts: a MeDALL Harmonization Study.

International collaborations among birth cohorts to better understand asthma and allergies have increased in the last years. However, differences in definitions and methods preclude direct pooling of ...

Clinical Trials [3332 Associated Clinical Trials listed on BioPortfolio]

Substantial Equivalence Study for Kai Sensors RSpot Non-Contact Respiratory Rate Spot Check

The purpose of this study is to determine whether the respiratory rate provided by the Kai Sensors RSpot 100 Non-Contact Respiratory Rate Spot Check is as accurate as that provided by the ...

China Health Big Data

In this nationwide multi center study the investigators combine the low dose chest CT scan data with QCT technology, to measure the BMD of spine, VAT and liver fat in the health check subj...

Mathematical Modeling Analysis of Serum Prostate Specific Antigen After Radiotherapy

Clinical data (prostate-specific antigen [PSA] response after radiotherapy) are being used to build a mathematical model to describe the clinical results of radiotherapy for prostate cance...

Equivalence of Generic Clozapine to Orally Dissolving Clozapine in Schizophrenia or Schizoaffective Disorder

The purpose of this study is to obtain data on equivalence of generic clozapine to Fazaclo (orally disintegrating tablet). Generic clozapine is the most frequently used clozapine and such ...

Proof of Concept - Identification of Patient-specific Parameters for Bolus Calculators for Type 1 Diabetes

The adoption of bolus calculators has been limited by the slow speed of the current trial and error approach. The goal of this project is to automate the determination of patient specific ...

Medical and Biotech [MESH] Definitions

The determination of the concentration of a given component in solution (the analyte) by addition of a liquid reagent of known strength (the titrant) until an equivalence point is reached (when the reactants are present in stoichiometric proportions). Often an indicator is added to make the equivalence point visible (e.g., a change in color).

Signal and data processing method that uses decomposition of wavelets to approximate, estimate, or compress signals with finite time and frequency domains. It represents a signal or data in terms of a fast decaying wavelet series from the original prototype wavelet, called the mother wavelet. This mathematical algorithm has been adopted widely in biomedical disciplines for data and signal processing in noise removal and audio/image compression (e.g., EEG and MRI).

Information application based on a variety of coding methods to minimize the amount of data to be stored, retrieved, or transmitted. Data compression can be applied to various forms of data, such as images and signals. It is used to reduce costs and increase efficiency in the maintenance of large volumes of data.

Computer-assisted interpretation and analysis of various mathematical functions related to a particular problem.

Various units or machines that operate in combination or in conjunction with a computer but are not physically part of it. Peripheral devices typically display computer data, store data from the computer and return the data to the computer on demand, prepare data for human use, or acquire data from a source and convert it to a form usable by a computer. (Computer Dictionary, 4th ed.)
