Attributes are semantically meaningful characteristics whose applicability widely crosses category boundaries. They are particularly important in describing and recognizing concepts for which no explicit training example is given, as in zero-shot learning. Additionally, since attributes are human-describable, they can be used for efficient human-computer interaction. In this paper, we propose to employ semantic segmentation to improve person-related attribute prediction. The core idea lies in the fact that many attributes describe local properties. In other words, the probability of an attribute appearing in an image is far from uniform across the spatial domain. We build our attribute prediction model jointly with a deep semantic segmentation network. This harnesses the localization cues learned by the semantic segmentation network to guide the attention of the attribute prediction to the regions where different attributes naturally show up. As a result, in addition to predicting attributes, we are able to localize them despite having access only to image-level labels (weak supervision) during training. We first propose semantic segmentation-based pooling and gating, denoted SSP and SSG respectively. In SSP, the estimated segmentation masks are used to pool the final activations of the attribute prediction network from multiple semantically homogeneous regions. This is in contrast to global average pooling, which is agnostic to where in the spatial domain activations occur. In SSG, the same idea is applied to the intermediate layers of the network. Specifically, we create multiple copies of the internal activations. In each copy, only values that fall within a certain semantic region are preserved, while activations outside that region are suppressed. This mechanism prevents the pooling operation from blending activations that are associated with semantically different regions.
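The pooling and gating described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the array layout (channels, height, width), the use of soft masks in [0, 1], and the small `eps` term are all assumptions made for illustration.

```python
import numpy as np

def semantic_segmentation_pooling(activations, masks, eps=1e-8):
    """Illustrative SSP (hypothetical helper, not the paper's code).

    activations: (C, H, W) final activations of the attribute network.
    masks:       (K, H, W) estimated segmentation masks, values in [0, 1].
    Returns a (K, C) array: for each region, the mask-weighted spatial
    average of every channel, instead of one global average per channel.
    """
    weighted_sum = np.einsum("khw,chw->kc", masks, activations)
    area = masks.sum(axis=(1, 2))[:, None]  # (K, 1) mask area per region
    return weighted_sum / (area + eps)      # eps guards against empty masks

def semantic_segmentation_gating(activations, masks):
    """Illustrative SSG (hypothetical helper, not the paper's code).

    Returns a (K, C, H, W) array: one copy of the activations per region,
    with values outside that region suppressed by the mask.
    """
    return masks[:, None, :, :] * activations[None, :, :, :]
```

With C channels and K masks, the gated output holds K x C feature maps, which is the memory cost that grows with the number of semantic regions.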
SSP and SSG, while effective, are memory-intensive, since each channel of the activations is pooled or gated with all the semantic segmentation masks. To circumvent this, we propose Symbiotic Augmentation (SA), where we learn only one mask per activation channel. SA allows the model either to pick one semantic map or to combine multiple maps (weighted superposition) in order to generate the proper mask for each channel. SA simultaneously applies the same mechanism to the reverse problem by leveraging the output logits of attribute prediction to guide the semantic segmentation task. We evaluate our proposed methods for facial attributes on the CelebA and LFWA datasets, and for whole-body attributes on the WIDER Attribute and Berkeley Attributes of People datasets. Our proposed methods achieve superior results compared to previous work. Furthermore, we show that in the reverse problem, semantic face parsing improves significantly when it is jointly learned with facial attribute prediction through our proposed Symbiotic Augmentation. We confirm that when few training instances are available, image-level facial attribute labels can indeed serve as an effective source of weak supervision to improve semantic face parsing. This reaffirms the need to jointly model these two interconnected tasks.
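The per-channel masking idea behind Symbiotic Augmentation can be sketched as follows. The abstract does not say how the combination weights are parameterized, so this sketch assumes a learnable logit matrix `weights` of shape (C, K) passed through a softmax, which makes each channel's mask a convex combination of the K semantic maps. All names are hypothetical, and the reverse direction (attribute logits guiding segmentation) is not shown.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def channel_masks(seg_maps, weights):
    """Build one mask per activation channel (assumed parameterization).

    seg_maps: (K, H, W) semantic segmentation maps.
    weights:  (C, K) learnable logits, one row per channel (hypothetical).
    Returns (C, H, W): each channel's mask is a weighted superposition of
    the K maps. A sharply peaked row effectively picks a single map.
    """
    mix = softmax(weights, axis=1)                  # (C, K), rows sum to 1
    return np.einsum("ck,khw->chw", mix, seg_maps)

def gate_with_channel_masks(activations, seg_maps, weights):
    """Gate (C, H, W) activations with their per-channel masks."""
    return activations * channel_masks(seg_maps, weights)
```

Under these assumptions the gated output keeps the (C, H, W) shape of the input activations, rather than producing one copy per semantic region.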
This article was published in the following journal.
Name: IEEE transactions on pattern analysis and machine intelligence
Semantic image segmentation is an important yet unsolved problem. One of the major challenges is the large variability of the object scales. To tackle this scale problem, we propose a Scale-Adaptive N...
Deep neural network-based semantic segmentation generally requires large-scale cost extensive annotations for training to obtain better performance. To avoid pixel-wise segmentation annotations which ...
Automatic and accurate 3D segmentation of liver with severe diseases from computed tomography (CT) images is a challenging task. Fully convolutional networks (FCN) have emerged as powerful tools for a...
We propose Mask SSD, an efficient and effective approach to address the challenging instance segmentation task. Based on a single-shot detector, Mask SSD detects all instances in an image and marks th...
Semantic variant of primary progressive aphasia (svPPA) is a subtype of frontotemporal dementia characterized by asymmetric temporal atrophy.
Accurate segmentation of lung tumor is essential for treatment planning, as well as for monitoring response to therapy. It is well-known that segmentation of the lung tumour by different r...
Cone Beam Computed Tomography (CBCT) has been used to assess the volume of the maxillary sinus using the manual and semi-automatic segmentation. The majority of researches stressed on the ...
Limited pancreatic resections are increasingly performed, but the rate of postoperative fistula is higher than after classical resections. Pancreatic segmentation, anatomically and radiolo...
Bipolar disorder is a mental disease characterized by mood dysregulation and arises from manic, depressive or mixed episodes. The observations of the patient's speech and language behaviou...
Because the diagnostic criteria for prostate cancer are different in the peripheral and the transition zone, prostate segmentation is needed for any computer-aided diagnosis system aimed a...