Towards Categorization and Pose Estimation of Sets of Occluded Objects in Cluttered Scenes from Depth Data and Generic Object Models Using Joint Parsing

Hector Basevi; Ales Leonardis

doi:10.1007/978-3-319-49409-8

Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a novel method for single target tracking in RGB images under conditions of extreme clutter and camouflage, including frequent occlusions by objects with a similar appearance to the target. In contrast to conventional single target trackers, which only maintain the estimated target status, we propose a multi-level clustering-based robust estimation for online detection and learning of multiple target-like regions, called distractors, when they appear near the true target. To distinguish the target from these distractors, we exploit a global dynamic constraint (derived from the target and the distractors) in a feedback loop to improve single target tracking performance in situations where the target is camouflaged in highly cluttered scenes. Our proposed method successfully prevents the estimated target location from erroneously jumping to a distractor during occlusion or extreme camouflage interactions. To gain insight into the evaluated trackers, we have augmented publicly available benchmark videos by proposing a new set of clutter and camouflage sub-attributes and annotating these sub-attributes for all frames in all sequences. Using this dataset, we first evaluate the effect of each key component of the tracker on the overall performance. Then, the proposed tracker is compared to other highly ranked single target tracking algorithms in the literature. The experimental results show that applying the proposed global dynamic constraint in a feedback loop can improve single target tracker performance, and demonstrate that the overall algorithm significantly outperforms other state-of-the-art single target trackers in highly cluttered scenes.

Original language	English
Title of host publication	Computer Vision – ECCV 2016 Workshops Part III
Editors	Gang Hua, Hervé Jégou
Publisher	Springer
Pages	665-681
Volume	9915
ISBN (Electronic)	978-3-319-49409-8
ISBN (Print)	978-3-319-49408-1
DOIs	https://doi.org/10.1007/978-3-319-49409-8
Publication status	Published - 24 Nov 2016
Event	14th European Conference on Computer Vision (ECCV 2016) - Duration: 15 Oct 2016 → 16 Oct 2016

Publication series

Name	Lecture Notes in Computer Science
Publisher	Springer
Volume	9915
ISSN (Print)	0302-9743

Conference

Conference	14th European Conference on Computer Vision (ECCV 2016)
Period	15/10/16 → 16/10/16

Access to Document

10.1007/978-3-319-49409-8 (Licence: None: All rights reserved)

TowardsCategorisation
Checked for eligibility: 29/08/2017
Accepted author manuscript, 2.19 MB

http://www.springer.com/gb/book/9783319494081 (Licence: None: All rights reserved)

Cite this

Basevi, H., & Leonardis, A. (2016). Towards Categorization and Pose Estimation of Sets of Occluded Objects in Cluttered Scenes from Depth Data and Generic Object Models Using Joint Parsing. In G. Hua, & H. Jégou (Eds.), Computer Vision – ECCV 2016 Workshops Part III (Vol. 9915, pp. 665-681). (Lecture Notes in Computer Science; Vol. 9915). Springer. https://doi.org/10.1007/978-3-319-49409-8

@inproceedings{ce86b84f9ad64c5cb1ce88e8a7a158ad,
title = "Towards Categorization and Pose Estimation of Sets of Occluded Objects in Cluttered Scenes from Depth Data and Generic Object Models Using Joint Parsing",
abstract = "This paper presents a novel method for single target tracking in RGB images under conditions of extreme clutter and camouflage, including frequent occlusions by objects with a similar appearance to the target. In contrast to conventional single target trackers, which only maintain the estimated target status, we propose a multi-level clustering-based robust estimation for online detection and learning of multiple target-like regions, called distractors, when they appear near the true target. To distinguish the target from these distractors, we exploit a global dynamic constraint (derived from the target and the distractors) in a feedback loop to improve single target tracking performance in situations where the target is camouflaged in highly cluttered scenes. Our proposed method successfully prevents the estimated target location from erroneously jumping to a distractor during occlusion or extreme camouflage interactions. To gain insight into the evaluated trackers, we have augmented publicly available benchmark videos by proposing a new set of clutter and camouflage sub-attributes and annotating these sub-attributes for all frames in all sequences. Using this dataset, we first evaluate the effect of each key component of the tracker on the overall performance. Then, the proposed tracker is compared to other highly ranked single target tracking algorithms in the literature. The experimental results show that applying the proposed global dynamic constraint in a feedback loop can improve single target tracker performance, and demonstrate that the overall algorithm significantly outperforms other state-of-the-art single target trackers in highly cluttered scenes.",
author = "Hector Basevi and Ales Leonardis",
year = "2016",
month = nov,
day = "24",
doi = "10.1007/978-3-319-49409-8",
language = "English",
isbn = "978-3-319-49408-1",
volume = "9915",
series = "Lecture Notes in Computer Science",
publisher = "Springer",
pages = "665--681",
editor = "Gang Hua and Herv{\'e} J{\'e}gou",
booktitle = "Computer Vision – ECCV 2016 Workshops Part III",
note = "14th European Conference on Computer Vision (ECCV 2016) ; Conference date: 15-10-2016 Through 16-10-2016",
}

Basevi, H & Leonardis, A 2016, Towards Categorization and Pose Estimation of Sets of Occluded Objects in Cluttered Scenes from Depth Data and Generic Object Models Using Joint Parsing. in G Hua & H Jégou (eds), Computer Vision – ECCV 2016 Workshops Part III. vol. 9915, Lecture Notes in Computer Science, vol. 9915, Springer, pp. 665-681, 14th European Conference on Computer Vision (ECCV 2016), 15/10/16. https://doi.org/10.1007/978-3-319-49409-8

Towards Categorization and Pose Estimation of Sets of Occluded Objects in Cluttered Scenes from Depth Data and Generic Object Models Using Joint Parsing. / Basevi, Hector ; Leonardis, Ales.
Computer Vision – ECCV 2016 Workshops Part III. ed. / Gang Hua; Hervé Jégou. Vol. 9915 Springer, 2016. p. 665-681 (Lecture Notes in Computer Science; Vol. 9915).

TY - GEN
T1 - Towards Categorization and Pose Estimation of Sets of Occluded Objects in Cluttered Scenes from Depth Data and Generic Object Models Using Joint Parsing
AU - Basevi, Hector
AU - Leonardis, Ales
PY - 2016/11/24
Y1 - 2016/11/24
AB - This paper presents a novel method for single target tracking in RGB images under conditions of extreme clutter and camouflage, including frequent occlusions by objects with a similar appearance to the target. In contrast to conventional single target trackers, which only maintain the estimated target status, we propose a multi-level clustering-based robust estimation for online detection and learning of multiple target-like regions, called distractors, when they appear near the true target. To distinguish the target from these distractors, we exploit a global dynamic constraint (derived from the target and the distractors) in a feedback loop to improve single target tracking performance in situations where the target is camouflaged in highly cluttered scenes. Our proposed method successfully prevents the estimated target location from erroneously jumping to a distractor during occlusion or extreme camouflage interactions. To gain insight into the evaluated trackers, we have augmented publicly available benchmark videos by proposing a new set of clutter and camouflage sub-attributes and annotating these sub-attributes for all frames in all sequences. Using this dataset, we first evaluate the effect of each key component of the tracker on the overall performance. Then, the proposed tracker is compared to other highly ranked single target tracking algorithms in the literature. The experimental results show that applying the proposed global dynamic constraint in a feedback loop can improve single target tracker performance, and demonstrate that the overall algorithm significantly outperforms other state-of-the-art single target trackers in highly cluttered scenes.
U2 - 10.1007/978-3-319-49409-8
DO - 10.1007/978-3-319-49409-8
M3 - Conference contribution
SN - 978-3-319-49408-1
VL - 9915
T3 - Lecture Notes in Computer Science
SP - 665
EP - 681
BT - Computer Vision – ECCV 2016 Workshops Part III
A2 - Hua, Gang
A2 - Jégou, Hervé
PB - Springer
T2 - 14th European Conference on Computer Vision (ECCV 2016)
Y2 - 15 October 2016 through 16 October 2016
ER -
