TY - GEN
T1 - Towards generic 3D tracking in RGBD videos
T2 - 17th European Conference on Computer Vision, ECCV 2022
AU - Yang, Jinyu
AU - Zhang, Zhongqun
AU - Li, Zhe
AU - Chang, Hyung Jin
AU - Leonardis, Ales
AU - Zheng, Feng
PY - 2022/10/23
Y1 - 2022/10/23
N2 - Tracking in 3D scenes is gaining momentum because of its numerous applications in robotics, autonomous driving, and scene understanding. Currently, 3D tracking is limited to specific model-based approaches involving point clouds, which impedes 3D trackers from applying in natural 3D scenes. RGBD sensors provide a more reasonable and acceptable solution for 3D object tracking due to their readily available synchronised color and depth information. Thus, in this paper, we investigate a novel problem: is it possible to track a generic (class-agnostic) 3D object in RGBD videos and predict 3D bounding boxes of the object of interest? To inspire research on this topic, we newly construct a standard benchmark for generic 3D object tracking, ‘Track-it-in-3D’, which contains 300 RGBD video sequences with dense 3D annotations and corresponding evaluation protocols. Furthermore, we propose an effective tracking baseline to estimate 3D bounding boxes for arbitrary objects in RGBD videos, by fusing appearance and spatial information effectively. Resources are available on https://github.com/yjybuaa/Track-it-in-3D.
AB - Tracking in 3D scenes is gaining momentum because of its numerous applications in robotics, autonomous driving, and scene understanding. Currently, 3D tracking is limited to specific model-based approaches involving point clouds, which impedes 3D trackers from applying in natural 3D scenes. RGBD sensors provide a more reasonable and acceptable solution for 3D object tracking due to their readily available synchronised color and depth information. Thus, in this paper, we investigate a novel problem: is it possible to track a generic (class-agnostic) 3D object in RGBD videos and predict 3D bounding boxes of the object of interest? To inspire research on this topic, we newly construct a standard benchmark for generic 3D object tracking, ‘Track-it-in-3D’, which contains 300 RGBD video sequences with dense 3D annotations and corresponding evaluation protocols. Furthermore, we propose an effective tracking baseline to estimate 3D bounding boxes for arbitrary objects in RGBD videos, by fusing appearance and spatial information effectively. Resources are available on https://github.com/yjybuaa/Track-it-in-3D.
U2 - 10.1007/978-3-031-20047-2_7
DO - 10.1007/978-3-031-20047-2_7
M3 - Conference contribution
SN - 9783031200465
T3 - Lecture Notes in Computer Science
SP - 112
EP - 128
BT - Computer Vision – ECCV 2022
A2 - Avidan, Shai
A2 - Brostow, Gabriel
A2 - Cissé, Moustapha
A2 - Farinella, Giovanni Maria
A2 - Hassner, Tal
PB - Springer
Y2 - 23 October 2022 through 27 October 2022
ER -