Visual stability prediction for robotic manipulation

Li Wenbin, Ales Leonardis, Mario Fritz

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

13 Citations (Scopus)
282 Downloads (Pure)


Understanding physical phenomena is a key competence that enables humans and animals to act and interact under uncertain perception in previously unseen environments containing novel objects and their configurations. Developmental psychology has shown that infants acquire such skills from observation at a very early stage. In this paper, we contrast a more traditional model-based approach, which relies on explicit 3D representations and physical simulation, with an end-to-end approach that directly predicts stability from appearance. We ask whether, and to what extent and quality, such a skill can be acquired directly in a data-driven way, bypassing the need for explicit simulation at run-time. We present a learning-based approach, built on simulated data, that predicts the stability of towers composed of wooden blocks under different conditions, as well as quantities related to the potential fall of the towers. We first evaluate the approach on synthetic data and compare the results to human judgments on the same stimuli. Further, we extend this approach to reason about future states of such towers, which in turn enables successful stacking.
Original language: English
Title of host publication: 2017 IEEE International Conference on Robotics and Automation (ICRA)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 8
ISBN (Electronic): 9781509046331
ISBN (Print): 9781509046348 (PoD)
Publication status: Published - 24 Jul 2017
Event: 2017 IEEE International Conference on Robotics and Automation (ICRA 2017) - Singapore
Duration: 29 May 2017 - 3 Jun 2017


Conference: 2017 IEEE International Conference on Robotics and Automation (ICRA 2017)


  • Stability analysis
  • Poles and towers
  • Visualization
  • Predictive models
  • Physics
  • Engines
  • Robots

