Spectral Graphormer: Spectral Graph-based Transformer for Egocentric Two-Hand Reconstruction using Multi-View Color Images

Tze Ho Elden Tse*, Franziska Mueller, Zhengyang Shen, Danhang Tang, Thabo Beeler, Mingsong Dou, Yinda Zhang, Sasa Petrovic, Hyung Jin Chang, Jonathan Taylor, Bardia Doosti

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a novel transformer-based framework that reconstructs two high-fidelity hands from multi-view RGB images. Unlike existing hand pose estimation methods, where one typically trains a deep network to regress hand model parameters from a single RGB image, we consider a more challenging problem setting where we directly regress the absolute root poses of two hands with extended forearms at high resolution from an egocentric view. As existing datasets are either infeasible for egocentric viewpoints or lack background variations, we create a large-scale synthetic dataset with diverse scenarios and collect a real dataset from a calibrated multi-camera setup to verify our proposed multi-view image feature fusion strategy. To make the reconstruction physically plausible, we propose two strategies: (i) a coarse-to-fine spectral graph convolution decoder to smooth the meshes during upsampling and (ii) an optimisation-based refinement stage at inference to prevent self-penetrations. Through extensive quantitative and qualitative evaluations, we show that our framework is able to produce realistic two-hand reconstructions and demonstrate the generalisation of synthetic-trained models to real data, as well as real-time AR/VR applications.
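
As an illustration of why a spectral view of the mesh graph relates to smoothness, the following is a minimal sketch, not the paper's decoder. It assumes a toy ring graph in place of a hand mesh and uses a fixed low-pass projection onto the lowest-frequency eigenvectors of the graph Laplacian, whereas the decoder described above is a learned network. All function names here are hypothetical.

```python
import numpy as np

def graph_laplacian(num_vertices, edges):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    A = np.zeros((num_vertices, num_vertices))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def spectral_lowpass(vertices, L, k):
    """Project vertex coordinates onto the k lowest-frequency eigenvectors of L."""
    _, U = np.linalg.eigh(L)  # eigenvalues ascending, eigenvectors orthonormal
    U_k = U[:, :k]
    return U_k @ (U_k.T @ vertices)

def roughness(vertices, L):
    """Dirichlet energy trace(V^T L V): sum of squared differences over edges."""
    return float(np.trace(vertices.T @ L @ vertices))

# Toy stand-in for a mesh (assumption): a noisy circle on a ring graph.
n = 32
edges = [(i, (i + 1) % n) for i in range(n)]
L = graph_laplacian(n, edges)
theta = 2 * np.pi * np.arange(n) / n
clean = np.stack([np.cos(theta), np.sin(theta), np.zeros(n)], axis=1)
rng = np.random.default_rng(0)
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

# Keep the constant mode plus the two lowest pairs of frequencies.
smooth = spectral_lowpass(noisy, L, k=5)
print(roughness(noisy, L), roughness(smooth, L))
```

Discarding high graph frequencies can only lower the Dirichlet energy, because each retained component is weighted by a non-negative eigenvalue; this is the sense in which filtering in the spectral domain smooths a mesh.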
Original language: English
Title of host publication: 2023 IEEE/CVF International Conference on Computer Vision (ICCV)
Publisher: IEEE
Pages: 14620-14631
Number of pages: 12
ISBN (Electronic): 9798350307184
ISBN (Print): 9798350307191 (PoD)
DOIs
Publication status: Published - 15 Jan 2024
Event: 2023 International Conference on Computer Vision - Paris Convention Centre, Paris, France
Duration: 2 Oct 2023 - 6 Oct 2023

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision
Publisher: IEEE
ISSN (Print): 1550-5499
ISSN (Electronic): 2380-7504

Conference

Conference: 2023 International Conference on Computer Vision
Abbreviated title: ICCV 2023
Country/Territory: France
City: Paris
Period: 2/10/23 - 6/10/23

Bibliographical note

Acknowledgments:
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-2020-0-01789), supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).

Keywords

  • Laplace equations
  • Image resolution
  • Pose estimation
  • Transformers
  • Robustness
  • Real-time systems
  • Graph theory
