Robust action recognition using local motion and group sparsity

Jungchan Cho, Minsik Lee, Hyung Jin Chang, Songhwai Oh

Research output: Contribution to journal › Article › peer-review

51 Citations (Scopus)

Abstract

Recognizing actions in a video is a critical step toward many vision-based applications and has attracted much attention recently. However, action recognition in video is challenging due to, among other factors, wide variations within an action, camera motion, cluttered backgrounds, and occlusions. While dense-sampling-based approaches currently achieve state-of-the-art performance in action recognition, they perform poorly on many realistic video sequences: by treating every motion found in a video equally, their discriminative power is reduced by clutter motions such as background changes and camera motion. In this paper, we robustly identify local motions of interest in an unsupervised manner by taking advantage of group sparsity. To robustly classify action types, we emphasize local motion by combining local motion descriptors with full motion descriptors and apply group sparsity to the emphasized motion features using the multiple kernel method. In experiments, we show that different types of actions can be recognized well using only a small number of selected local motion descriptors and that the proposed algorithm achieves state-of-the-art performance on popular benchmark datasets, outperforming existing methods. We also demonstrate that group sparse representation with the multiple kernel method can dramatically improve action recognition performance.
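For readers unfamiliar with the technique, group sparsity is commonly enforced with a group-lasso penalty; a standard generic formulation (a sketch of the general technique, not necessarily the exact objective used in this paper) is

\min_{w} \; \frac{1}{2} \| y - Xw \|_2^2 + \lambda \sum_{g \in \mathcal{G}} \| w_g \|_2,

where X is a dictionary of motion descriptors, w_g collects the coefficients belonging to group g, and the sum of unsquared \ell_2 norms drives entire groups of coefficients to zero together, so whole sets of clutter descriptors can be discarded at once rather than coefficient by coefficient.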
Original language: English
Pages (from-to): 1813-1825
Number of pages: 12
Journal: Pattern Recognition
Volume: 47
Issue number: 5
Early online date: 16 Dec 2013
DOIs
Publication status: Published - May 2014

Keywords

  • Action recognition
  • Motion descriptor
  • Sparse representation
  • Dynamic scene understanding
