Abstract
With the development of depth sensors in recent years, RGBD object tracking has received significant attention. Compared with traditional RGB object tracking, the additional depth modality can effectively reduce interference between the target and the background. However, some existing RGBD trackers process the two modalities separately and thus ignore particularly useful information shared between them. Other methods fuse the two modalities by treating them equally, which loses modality-specific features. To tackle these limitations, we propose a novel Dual-fused Modality-aware Tracker (termed DMTracker), which aims to learn informative and discriminative representations of the target objects for robust RGBD tracking. The first fusion module extracts the information shared between modalities using cross-modal attention. The second integrates the RGB-specific and depth-specific information to enhance the fused features. By fusing both the modality-shared and modality-specific information in a modality-aware scheme, DMTracker can learn discriminative representations in complex tracking scenes. Experiments show that the proposed tracker achieves very promising results on challenging RGBD benchmarks. Code is available at https://github.com/ShangGaoG/DMTracker.
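The two fusion stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes token features given as plain lists of floats, uses unparameterized scaled dot-product attention with RGB tokens as queries and depth tokens as keys and values, and stands in a simple element-wise sum for the second, modality-specific fusion stage. The function names `cross_attention` and `dual_fuse` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention without learned projections.

    Each query token (one modality) attends over the tokens of the
    other modality and returns a weighted average of their values.
    """
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        attended.append([sum(w * v[j] for w, v in zip(weights, values))
                         for j in range(len(values[0]))])
    return attended


def dual_fuse(rgb, depth):
    """Two-stage fusion sketch (hypothetical, for illustration only).

    Stage 1: modality-shared features via cross-modal attention.
    Stage 2: add the RGB-specific and depth-specific features back.
    The element-wise sum in stage 2 is an assumption of this sketch;
    it requires rgb and depth to have the same shape.
    """
    shared = cross_attention(rgb, depth, depth)
    return [[s + r + d for s, r, d in zip(s_row, r_row, d_row)]
            for s_row, r_row, d_row in zip(shared, rgb, depth)]


if __name__ == "__main__":
    rgb_tokens = [[1.0, 0.0], [0.0, 1.0]]
    depth_tokens = [[1.0, 0.0], [0.0, 1.0]]
    print(dual_fuse(rgb_tokens, depth_tokens))
```

A real tracker would add learned query, key, and value projections, multiple heads, and a learned second-stage fusion; the sketch only shows how shared information (attention output) and modality-specific information (the original features) can be combined in one representation.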
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2022 Workshops, Proceedings |
Editors | Leonid Karlinsky, Tomer Michaeli, Ko Nishino |
Publisher | Springer |
Pages | 478-494 |
Number of pages | 17 |
ISBN (Electronic) | 9783031250859 |
ISBN (Print) | 9783031250842 |
Publication status | Published - 12 Feb 2023 |
Event | 17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel (23 Oct 2022 → 27 Oct 2022) |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 13808 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 17th European Conference on Computer Vision, ECCV 2022 |
---|---|
Abbreviated title | ECCV 2022 |
Country/Territory | Israel |
City | Tel Aviv |
Period | 23/10/22 → 27/10/22 |
Bibliographical note
Funding Information: This work is supported by the National Natural Science Foundation of China under Grant Nos. 61972188 and 62122035.
Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Keywords
- Multi-modal learning
- Object tracking
- RGBD tracking
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science