PubMed Journal Database | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

00:06 EST 24th February 2020 | BioPortfolio

The US National Library of Medicine and National Institutes of Health manage PubMed.gov, which comprises more than 29 million records, papers and reports for biomedical literature, including MEDLINE, life science and medical journals, articles, reviews, reports and books.

BioPortfolio aims to cross-reference relevant information on published papers, clinical trials and news associated with selected topics or specialities.

For example, view all recent relevant publications on Epigenetics and associated publications and clinical trials.

Showing PubMed Articles 1–25 of 914 from IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

Multi-scale Temporal Cues Learning for Video Person Re-Identification.

Temporal cues embedded in videos provide important clues for person Re-Identification (ReID). To efficiently exploit temporal cues with a compact neural network, this work proposes a novel 3D convolution layer called Multi-scale 3D (M3D) convolution layer. The M3D layer is easy to implement and could be inserted into traditional 2D convolution networks to learn multi-scale temporal cues by end-to-end training. According to its inserted location, the M3D layer has two variants, i.e., local M3D layer and glob...

A Novel Saliency Detection Algorithm Based On Adversarial Learning Model.

The traditional salient object detection models can be divided into several classes based on the low-level features of images and contrast between the pixels. This paper proposes an adversarial learning model (ALM) that includes the generative model and discriminative model. The ALM uses the original image as an input of the generative model to extract the high-level features and forms an initial salient map. Then, the discriminative model is utilized to compare differences in the features between the initi...

Unsupervised Multi-Target Domain Adaptation: An Information Theoretic Approach.

Unsupervised domain adaptation (uDA) models focus on pairwise adaptation settings where there is a single, labeled, source and a single target domain. However, in many real-world settings one seeks to adapt to multiple, but somewhat similar, target domains. Applying pairwise adaptation approaches to this setting may be suboptimal, as they fail to leverage shared information among multiple domains. In this work, we propose an information theoretic approach for domain adaptation in the novel context of multip...

Modality Compensation Network: Cross-Modal Adaptation for Action Recognition.

With the prevalence of RGB-D cameras, multimodal video data have become more available for human action recognition. One main challenge for this task lies in how to effectively leverage their complementary information. In this work, we propose a Modality Compensation Network (MCN) to explore the relationships of different modalities, and boost the representations for human action recognition. We regard RGB/optical flow videos as source modalities and skeletons as the auxiliary modality. Our goal is to extract mor...

Second-order Spectral Transform Block for 3D Shape Classification and Retrieval.

In this paper, we propose a novel network block, dubbed as second-order spectral transform block, for 3D shape retrieval and classification. This network block generalizes the second-order pooling to 3D surface by designing a learnable non-linear transform on the spectrum of the pooled descriptor. The proposed block consists of the following two components. First, the second-order average (SO-Avr) and max-pooling (SOMax) operations are designed on 3D surface to aggregate local descriptors, which are shown to be...

Autonomous Selective Parts-Based Tracking.

Object tracking from videos is still a challenging task due to various changes throughout a video sequence including occlusions, motion blur, scale and other deformation changes. In this paper, we propose a selective parts-based approach, using correlation filters, that makes choices based on a consensus of the parts and global tracking. Moreover, we further enhance our parts-based approach by introducing a segmentation-assisted parts initialization. In addition, we present a genetic algorithm-based method t...

Dynamic Random Walk for Superpixel Segmentation.

In this paper, we propose a novel random walk model, called Dynamic Random Walk (DRW), which adds a new type of dynamic node to the original RW model and reduces redundant calculation by limiting the walk range. To solve the seed-lacking problem of the proposed DRW, we redefine the energy function of the original RW and use the first arrival probability among each node pair to avoid the interference for each partition. Relaxation of our DRW is performed with the help of a greedy strategy and the Weighted Ra...

Revisiting EmbodiedQA: A Simple Baseline and Beyond.

In Embodied Question Answering (EmbodiedQA), an agent interacts with an environment to gather necessary information for answering user questions. Existing works have laid a solid foundation towards solving this interesting problem. But the current performance, especially in navigation, suggests that EmbodiedQA might be too challenging for the contemporary approaches. In this paper, we empirically study this problem and introduce 1) a simple yet effective baseline that achieves promising performance; 2) an e...

OCTRexpert: A Feature-based 3D Registration Method for Retinal OCT Images.

Medical image registration can be used for studying longitudinal and cross-sectional data, quantitatively monitoring disease progression and guiding computer assisted diagnosis and treatments. However, deformable registration which enables more precise and quantitative comparison has not been well developed for retinal optical coherence tomography (OCT) images. This paper proposes a new 3D registration approach for retinal OCT data called OCTRexpert. To the best of our knowledge, the proposed algorithm is t...

Deep Video Super-Resolution using HR Optical Flow Estimation.

Video super-resolution (SR) aims at generating a sequence of high-resolution (HR) frames with plausible and temporally consistent details from their low-resolution (LR) counterparts. The key challenge for video SR lies in the effective exploitation of temporal dependency between consecutive frames. Existing deep learning based methods commonly estimate optical flows between LR frames to provide temporal dependency. However, the resolution conflict between LR optical flows and HR outputs hinders the recovery...

High Quality Light Field Extraction and Post-Processing for Raw Plenoptic Data.

Light field technology has reached a certain level of maturity in recent years, and its applications in both computer vision research and industry are offering new perspectives for cinematography and virtual reality. Several methods of capture exist, each with its own advantages and drawbacks. One of these methods involves the use of handheld plenoptic cameras. While these cameras offer freedom and ease of use, they also suffer from various visual artefacts and inconsistencies. We propose in this paper an a...

KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment.

Deep learning methods for image quality assessment (IQA) are limited due to the small size of existing datasets. Extensive datasets require substantial resources both for generating publishable content and annotating it accurately. We present a systematic and scalable approach to creating KonIQ-10k, the largest IQA dataset to date, consisting of 10,073 quality scored images. It is the first in-the-wild database aiming for ecological validity, concerning the authenticity of distortions, the diversity of cont...

Dynamic Receptive Field Generation for Full-Reference Image Quality Assessment.

Most full-reference image quality assessment (FR-IQA) methods advanced to date have been holistically designed without regard to the type of distortion impairing the image. However, the perception of distortion depends nonlinearly on the distortion type. Here we propose a novel FR-IQA framework that dynamically generates receptive fields responsive to distortion type. Our proposed method, the dynamic receptive field generation based image quality assessor (DRF-IQA), separates the process of FR-IQA into two stream...

Personality-assisted Multi-task Learning for Generic and Personalized Image Aesthetics Assessment.

Traditional image aesthetics assessment (IAA) approaches mainly predict the average aesthetic score of an image. However, people tend to have different tastes on image aesthetics, which is mainly determined by their subjective preferences. As an important subjective trait, personality is believed to be a key factor in modeling individual's subjective preference. In this paper, we present a personality-assisted multi-task deep learning framework for both generic and personalized image aesthetics assessment. ...

SITUP: Scale Invariant Tracking using Average Peak-to-Correlation Energy.

Robust and accurate scale estimation of a target object is a challenging task in visual object tracking. Most existing tracking methods cannot accommodate large scale variation in complex image sequences and thus result in inferior performance. In this paper, we propose to incorporate a novel criterion called the average peak-to-correlation energy into the multi-resolution translation filter framework to obtain robust and accurate scale estimation. The resulting system is named SITUP: Scale Invariant Tracki...

Deepzzle: Solving Visual Jigsaw Puzzles with Deep Learning and Shortest Path Optimization.

We tackle the image reassembly problem with wide space between the fragments, in such a way that the patterns and colors continuity is mostly unusable. The spacing emulates the erosion of which the archaeological fragments suffer. We crop-square the fragments borders to compel our algorithm to learn from the content of the fragments. We also complicate the image reassembly by removing fragments and adding pieces from other sources. We use a two-step method to obtain the reassemblies: 1) a neural network pre...

PWStableNet: Learning Pixel-wise Warping Maps for Video Stabilization.

As the videos captured by hand-held cameras are often perturbed by high-frequency jitters, stabilization of these videos is an essential task. Many video stabilization methods have been proposed to stabilize shaky videos. However, most methods estimate one global homography or several homographies based on fixed meshes to warp the shaky frames into their stabilized views. Due to the existence of parallax, a single homography or a few homographies cannot handle the depth variation well. In contrast to these traditi...

Connecting Image Denoising and High-Level Vision Tasks via Deep Learning.

Image denoising and high-level vision tasks are usually handled independently in the conventional practice of computer vision, and their connection is fragile. In this paper, we cope with the two jointly and explore the mutual influence between them with the focus on two questions, namely (1) how image denoising can help improve high-level vision tasks, and (2) how the semantic information from high-level vision tasks can be used to guide image denoising. First, for image denoising, we propose a convolution...

A spatio-temporal multi-scale binary descriptor.

Binary descriptors are widely used for multi-view matching and robotic navigation. However, their matching performance decreases considerably under severe scale and viewpoint changes in non-planar scenes. To overcome this problem, we propose to encode the varying appearance of selected 3D scene points tracked by a moving camera with compact spatio-temporal descriptors. To this end, we first track interest points and capture their temporal variations at multiple scales. Then, we validate feature tracks throu...

Learning Hybrid Representation by Robust Dictionary Learning in Factorized Compressed Space.

In this paper, we investigate the robust dictionary learning (DL) to discover the hybrid salient low-rank and sparse representation in a factorized compressed space. A Joint Robust Factorization and Projective Dictionary Learning (J-RFDL) model is presented. The setting of J-RFDL aims at improving the data representations by enhancing the robustness to outliers and noise in data, encoding the reconstruction error more accurately and obtaining hybrid salient coefficients with accurate reconstruction ability....

View-invariant Deep Architecture for Human Action Recognition using Two-stream Motion and Shape Temporal Dynamics.

Human action recognition for unknown views is a challenging task. We propose a deep view-invariant human action recognition framework, which is a novel integration of two important action cues: motion and shape temporal dynamics (STD). The motion stream encapsulates the motion content of action as RGB Dynamic Images (RGB-DIs), which are generated by Approximate Rank Pooling (ARP) and processed by using a fine-tuned InceptionV3 model. The STD stream learns long-term view-invariant shape dynamics of action usin...

Material Based Object Tracking in Hyperspectral Videos.

Traditional color images only depict color intensities in red, green and blue channels, often making object trackers fail in challenging scenarios, e.g., background clutter and rapid changes of target appearance. Alternatively, material information of targets contained in the large number of bands of hyperspectral images (HSI) is more robust to these difficult conditions. In this paper, we conduct a comprehensive study on how material information can be utilized to boost object tracking from three aspects: data...

Joint Coding of Local and Global Deep Features in Videos for Visual Search.

Practically, it is more feasible to collect compact visual features rather than the video streams from hundreds of thousands of cameras into the cloud for big data analysis and retrieval. Then the problem becomes which kinds of features should be extracted, compressed and transmitted so as to meet the requirements of various visual tasks. Recently, many studies have indicated that the activations from the convolutional layers in convolutional neural networks (CNNs) can be treated as local deep features desc...

Discriminative Multi-view Privileged Information Learning for Image Re-ranking.

Conventional multi-view re-ranking methods usually perform asymmetrical matching between the region of interest (ROI) in the query image and the whole target image for similarity computation. Due to the inconsistency in the visual appearance, this practice tends to degrade the retrieval accuracy particularly when the image ROI, which is usually interpreted as the image objectness, accounts for a smaller region in the image. Since Privileged Information (PI), which can be viewed as the image prior, is able t...

Semantic Segmentation with Context Encoding and Multi-Path Decoding.

Semantic image segmentation aims to classify every pixel of a scene image to one of many classes. It implicitly involves object recognition, localization, and boundary delineation. In this paper, we propose a segmentation network called CGBNet to enhance the parsing results by context encoding and multi-path decoding. We first propose a context encoding module that generates context contrasted local feature to make use of the informative context and the discriminative local information. This context encoding...

