A Deep Multi-Modal CNN for Multi-Instance Multi-Label Image Classification.

08:00 EDT 10th August 2018 | BioPortfolio

Summary of "A Deep Multi-Modal CNN for Multi-Instance Multi-Label Image Classification."

Deep Convolutional Neural Networks (CNNs) have shown superior performance on the task of single-label image classification. However, the applicability of CNNs to multi-label images remains an open problem, for two main reasons. First, each image is usually treated as an inseparable entity and represented as one instance, which mixes the visual information corresponding to different labels. Second, the correlations among labels are often overlooked. To address these limitations, we propose a deep Multi-Modal CNN for Multi-Instance Multi-Label image classification, called MMCNN-MIML. By combining CNNs with Multi-Instance Multi-Label (MIML) learning, our model represents each image as a bag of instances for image classification and inherits the merits of both CNNs and MIML. In particular, MMCNN-MIML has three appealing properties: (i) it automatically generates instance representations for MIML by exploiting the architecture of CNNs; (ii) it takes advantage of label correlations by grouping labels in its later layers; (iii) it incorporates the textual context of label groups to generate multi-modal instances, which are effective in discriminating visually similar objects belonging to different groups. Empirical studies on several benchmark multi-label image datasets show that MMCNN-MIML significantly outperforms state-of-the-art baselines on multi-label image classification tasks.
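
The "bag of instances" idea in the summary can be illustrated with a small, generic sketch. This is not the authors' MMCNN-MIML architecture: it only shows one common way to treat each spatial position of a CNN feature map as an instance, score every instance against every label, and max-pool those scores into image-level label probabilities. The function names, the toy feature map, and the weights are all assumptions made up for illustration, and the label grouping and textual context described above are not modeled.

```python
import math
from typing import List

Vector = List[float]


def feature_map_to_bag(feature_map: List[List[Vector]]) -> List[Vector]:
    """Flatten an H x W grid of C-dimensional feature vectors into a bag of H*W instances."""
    return [cell for row in feature_map for cell in row]


def instance_scores(bag: List[Vector], weights: List[Vector], biases: Vector) -> List[Vector]:
    """Score each instance against each label with a linear layer (hypothetical weights)."""
    return [
        [sum(w * x for w, x in zip(weights[k], inst)) + biases[k]
         for k in range(len(weights))]
        for inst in bag
    ]


def bag_probabilities(scores: List[Vector]) -> Vector:
    """Max-pool instance scores per label, then apply a sigmoid to each pooled score."""
    num_labels = len(scores[0])
    pooled = [max(s[k] for s in scores) for k in range(num_labels)]
    return [1.0 / (1.0 + math.exp(-z)) for z in pooled]


# Toy 2 x 2 feature map with 2 channels (assumed values, not real CNN output).
feature_map = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]
# One weight row and one bias per label; three hypothetical labels.
weights = [[4.0, 0.0], [0.0, 4.0], [-4.0, -4.0]]
biases = [-2.0, -2.0, -2.0]

bag = feature_map_to_bag(feature_map)
probs = bag_probabilities(instance_scores(bag, weights, biases))
predicted = [k for k, p in enumerate(probs) if p > 0.5]
print(predicted)
```

Max-pooling encodes the usual multi-instance assumption that an image carries a label if at least one of its instances does, which is why two different regions of the toy image can each contribute a different label.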


Journal Details

This article was published in the following journal.

Name: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
ISSN: 1941-0042





