Boosting Performance of Visual Servoing Using Deep Reinforcement Learning From Multiple Demonstrations

Ali Aflakian, Alireza Rastegharpanah*, Rustam Stolkin

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

In this study, the knowledge of multiple controllers was combined with deep reinforcement learning (RL) to train a visual servoing (VS) technique. Deep RL algorithms have been successful at solving complicated control problems; however, they generally require a large amount of data before reaching acceptable performance. To address this data inefficiency, we developed a method that generates online hyper-volume action bounds from the demonstrations of multiple controllers (experts). The agent then explores within the created bounds to find more optimized solutions and gain higher rewards. This eliminates fruitless exploration, reducing training time and improving the performance of the trained policy. During training, we used domain randomization and domain adaptation to make the VS approach robust in the real world. As a result, we showed a 51% decrease in the training time needed to reach the desired level of performance, compared to using RL alone. The findings showed that the developed method outperformed baseline VS methods (image-based VS, position-based VS, and hybrid-decoupled VS) in terms of VS error convergence speed while maintaining higher manipulability.
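The abstract describes the action-bounding mechanism only at a high level. As a rough illustration, the sketch below shows one way online hyper-volume action bounds could be derived from several expert controllers and used to confine an RL agent's exploration. The function names, the margin parameter, and the mapping of the policy output into the bounds are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def hypervolume_action_bounds(expert_actions, margin=0.1):
    """Compute per-dimension action bounds from the actions that several
    expert controllers propose for the current state.

    expert_actions: array of shape (n_experts, action_dim), one row per
    expert (e.g. IBVS, PBVS, and hybrid-decoupled VS controllers).
    margin: fractional expansion of the bounds, so the agent can explore
    slightly beyond the experts' envelope. (Both are assumptions.)
    """
    lo = expert_actions.min(axis=0)
    hi = expert_actions.max(axis=0)
    span = np.maximum(hi - lo, 1e-6)      # avoid a zero-width interval
    return lo - margin * span, hi + margin * span

def bound_agent_action(raw_action, lo, hi):
    """Map a raw policy output in [-1, 1] into the expert-derived
    hyper-volume, so exploration stays near the demonstrated region."""
    return lo + 0.5 * (raw_action + 1.0) * (hi - lo)

# Example: three expert controllers suggest 6-DoF end-effector velocities.
experts = np.array([
    [0.02, -0.01, 0.00, 0.10, 0.00, -0.05],   # e.g. IBVS output
    [0.03,  0.00, 0.01, 0.08, 0.02, -0.04],   # e.g. PBVS output
    [0.02, -0.02, 0.01, 0.09, 0.01, -0.06],   # e.g. hybrid VS output
])
lo, hi = hypervolume_action_bounds(experts)
action = bound_agent_action(np.random.uniform(-1, 1, size=6), lo, hi)
```

Under these assumptions, rescaling the policy output into the experts' envelope is what prunes fruitless exploration: the agent only samples actions close to what at least one competent controller would command in the current state.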
Original language: English
Pages (from-to): 26512-26520
Journal: IEEE Access
Volume: 11
Early online date: 13 Mar 2023
DOIs
Publication status: Published - 21 Mar 2023

Keywords

  • Visual servoing
  • Reinforcement learning
  • Online action bounding
  • Reinforcement learning from demonstrations
  • Manipulability
