Dynamic multi-level appearance models and adaptive clustered decision trees for single target tracking

Research output: Contribution to journal › Article › peer-review



This paper presents a tracking algorithm for arbitrary objects in challenging video sequences. Targets are modelled at three different levels of granularity (pixel, parts and bounding box levels), which are cross-constrained to enable robust model relearning. The main contribution is an adaptive clustered decision tree method which dynamically selects the minimum combination of features necessary to sufficiently represent each target part at each frame, thereby providing robustness with computational efficiency. The adaptive clustered decision tree is used in two separate ways: firstly, for parts-level matching between successive frames; secondly, to select the best candidate image regions for learning new parts of the target. We test the tracker using two different tracking benchmarks (the VOT2013-2014 and CVPR2013 tracking challenges), based on two different test methodologies, and show it to be more robust than the state-of-the-art methods from both of those tracking challenges, while also offering competitive tracking precision. Additionally, we evaluate the contribution of each key component of the tracker to overall performance; test the sensitivity of the tracker under different initialization conditions; investigate the effect of using features in different orders within the decision trees; and illustrate the flexibility of the method in handling arbitrary kinds of features by showing how it easily extends to handle RGB-D data.
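The core idea of the abstract — adding features one at a time until a target part is sufficiently distinguished, rather than always computing every feature — can be illustrated with a minimal sketch. This is not the paper's implementation: the feature names, similarity metric, and separation margin below are all invented for the example.

```python
# Illustrative sketch (not the paper's method): greedily accumulate
# features for one target part, stopping as soon as the matching
# candidate beats the best distractor region by a confidence margin.

def match_score(part_feats, cand_feats, feature):
    """Similarity in [0, 1] for one feature (hypothetical metric)."""
    return 1.0 - abs(part_feats[feature] - cand_feats[feature])

def select_minimal_features(part_feats, cand_feats, distractor_feats,
                            features, margin=0.2):
    """Add features in order until the candidate's mean score exceeds
    the best distractor's by `margin`; return the subset actually used."""
    chosen = []
    cand_total = dist_total = 0.0
    for f in features:
        chosen.append(f)
        cand_total += match_score(part_feats, cand_feats, f)
        dist_total += max(match_score(part_feats, d, f)
                          for d in distractor_feats)
        n = len(chosen)
        if (cand_total - dist_total) / n >= margin:
            break  # enough features to separate this part reliably
    return chosen, cand_total / len(chosen)

part = {"colour": 0.8, "gradient": 0.3, "texture": 0.6}
candidate = {"colour": 0.78, "gradient": 0.32, "texture": 0.55}
distractors = [{"colour": 0.75, "gradient": 0.9, "texture": 0.1}]
used, score = select_minimal_features(part, candidate, distractors,
                                      ["colour", "gradient", "texture"])
print(used, round(score, 3))  # colour alone is ambiguous; gradient resolves it
```

In this toy run, colour alone cannot separate the candidate from the distractor, so the gradient feature is added and the texture feature is never computed — the kind of per-part, per-frame economy the abstract attributes to the adaptive clustered decision tree.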
Original language: English
Pages (from-to): 169-183
Journal: Pattern Recognition
Early online date: 14 Apr 2017
Publication status: Published - 1 Sept 2017


  • Single target tracking
  • Adaptive clustered decision trees
  • Multi-level appearance models


