Learning Dual-Fused Modality-Aware Representations for RGBD Tracking

Shang Gao, Jinyu Yang, Zhe Li, Feng Zheng*, Aleš Leonardis, Jingkuan Song

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

With the development of depth sensors in recent years, RGBD object tracking has received significant attention. Compared with traditional RGB object tracking, the additional depth modality can effectively mitigate interference between the target and the background. However, some existing RGBD trackers use the two modalities separately and thus ignore particularly useful information shared between them. Other methods attempt to fuse the two modalities by treating them equally, which loses modality-specific features. To tackle these limitations, we propose a novel Dual-fused Modality-aware Tracker (termed DMTracker), which aims to learn informative and discriminative representations of target objects for robust RGBD tracking. The first fusion module extracts the information shared between modalities using cross-modal attention. The second integrates RGB-specific and depth-specific information to enhance the fused features. By fusing both modality-shared and modality-specific information in a modality-aware scheme, DMTracker can learn discriminative representations in complex tracking scenes. Experiments show that the proposed tracker achieves very promising results on challenging RGBD benchmarks. Code is available at https://github.com/ShangGaoG/DMTracker.
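
The two-stage idea described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation (see the repository linked above for that). It assumes single-head scaled dot-product attention for the first stage and a plain element-wise sum for the second, and the names `cross_attention` and `dual_fuse` are hypothetical. It also assumes both modalities provide the same number of tokens with the same feature width.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query token attends over the
    tokens of the other modality (keys/values)."""
    scale = math.sqrt(len(queries[0]))
    attended = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(feature width).
        weights = softmax(
            [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        )
        # Weighted average of the value vectors, channel by channel.
        attended.append(
            [sum(w * v[c] for w, v in zip(weights, values))
             for c in range(len(values[0]))]
        )
    return attended


def dual_fuse(rgb_tokens, depth_tokens):
    """Hypothetical two-stage fusion (both stages are assumed forms)."""
    # Stage 1 (assumption): RGB tokens query depth tokens to gather
    # information shared across the two modalities.
    shared = cross_attention(rgb_tokens, depth_tokens, depth_tokens)
    # Stage 2 (assumption): add the modality-specific tokens back with an
    # element-wise sum to enhance the fused features.
    return [
        [s + r + d for s, r, d in zip(s_row, r_row, d_row)]
        for s_row, r_row, d_row in zip(shared, rgb_tokens, depth_tokens)
    ]


# Toy input: two tokens per modality, two channels per token.
rgb = [[1.0, 0.0], [0.0, 1.0]]
depth = [[0.5, 0.5], [1.0, -1.0]]
fused = dual_fuse(rgb, depth)
```

A real tracker would use learned query, key, and value projections and multi-head attention over backbone feature maps. The sketch only shows how a shared-information stage and a modality-specific stage can be composed.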

Original language: English
Title of host publication: Computer Vision – ECCV 2022 Workshops, Proceedings
Editors: Leonid Karlinsky, Tomer Michaeli, Ko Nishino
Publisher: Springer
Pages: 478-494
Number of pages: 17
ISBN (Electronic): 9783031250859
ISBN (Print): 9783031250842
Publication status: Published - 12 Feb 2023
Event: 17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel
Duration: 23 Oct 2022 to 27 Oct 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13808 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th European Conference on Computer Vision, ECCV 2022
Country/Territory: Israel
City: Tel Aviv
Period: 23/10/22 to 27/10/22

Bibliographical note

Funding Information:
Acknowledgements. This work is supported by the National Natural Science Foundation of China under Grant Nos. 61972188 and 62122035.

Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords

  • Multi-modal learning
  • Object tracking
  • RGBD tracking

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
