Towards Categorization and Pose Estimation of Sets of Occluded Objects in Cluttered Scenes from Depth Data and Generic Object Models Using Joint Parsing

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



This paper presents a novel method for single target tracking in RGB images under conditions of extreme clutter and camouflage, including frequent occlusions by objects similar in appearance to the target. In contrast to conventional single target trackers, which maintain only the estimated target status, we propose a multi-level clustering-based robust estimation method for online detection and learning of multiple target-like regions, called distractors, when they appear near the true target. To distinguish the target from these distractors, we exploit a global dynamic constraint (derived from the target and the distractors) in a feedback loop, which improves single target tracking performance when the target is camouflaged in highly cluttered scenes. The proposed method successfully prevents the estimated target location from erroneously jumping to a distractor during occlusion or extreme camouflage interactions. To gain a deeper understanding of the evaluated trackers, we have augmented publicly available benchmark videos by proposing a new set of clutter and camouflage sub-attributes and annotating these sub-attributes for all frames in all sequences. Using this dataset, we first evaluate the effect of each key component of the tracker on overall performance. We then compare the proposed tracker with other highly ranked single target tracking algorithms from the literature. The experimental results show that applying the proposed global dynamic constraint in a feedback loop can improve single target tracker performance, and demonstrate that the overall algorithm significantly outperforms other state-of-the-art single target trackers in highly cluttered scenes.
Original language: English
Title of host publication: Computer Vision – ECCV 2016 Workshops Part III
Editors: Gang Hua, Hervé Jégou
ISBN (Electronic): 978-3-319-49409-8
ISBN (Print): 978-3-319-49408-1
Publication status: Published - 24 Nov 2016
Event: 14th European Conference on Computer Vision (ECCV 2016)
Duration: 15 Oct 2016 to 16 Oct 2016

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743


Conference: 14th European Conference on Computer Vision (ECCV 2016)


