Abstract
Learning efficiency remains a critical impediment to the practical application of connected and automated vehicles (CAVs). This paper proposes a transferable model-based reinforcement learning (TMBRL) strategy to enhance the sample efficiency and learning rate of CAVs. Specifically, a surrogate policy model is established by capturing the state transitions between the actual environment and the vehicle within traffic scenarios. A model-based reinforcement learning (MBRL) approach is then built on the surrogate model and a soft actor-critic algorithm. To improve the learning efficiency of the platoon control algorithm, a transfer learning method is incorporated into the MBRL framework: the trained surrogate model of vehicles in the source domain is transferred to vehicles in the target domain, which then only need to update the surrogate model according to their individual dynamic characteristics and tasks. Finally, platoon experiments are conducted on a platform built with Prescan software. The experimental evaluation demonstrates that the TMBRL strategy significantly outperforms conventional reinforcement learning approaches, achieving a higher average cumulative reward of 47 and a 16% improvement in training success rate. Comparative analysis further reveals that the proposed TMBRL strategy is more robust in platoon tracking tasks, maintaining better trajectory tracking precision and stability under dynamic environmental conditions.
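The transfer step described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the linear speed model, the parameter values, and the helper names `simulate`, `fit`, and `mse` are all assumptions made for illustration, and the soft actor-critic policy is omitted entirely. The sketch only shows the idea that a surrogate transition model trained on a source vehicle may need far less target-vehicle data than one trained from scratch.

```python
import random

def simulate(a, b, n, seed):
    """Collect (speed, command, next_speed) samples from a toy vehicle v' = a*v + b*u.
    The linear dynamics and the coefficients are hypothetical, not taken from the paper."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        v = rng.uniform(0.0, 1.0)    # normalized speed
        u = rng.uniform(-1.0, 1.0)   # normalized acceleration command
        data.append((v, u, a * v + b * u))
    return data

def fit(data, params, epochs, lr=0.1):
    """Fit the linear surrogate v' ~= a_hat*v + b_hat*u by stochastic gradient descent,
    starting from the given initial parameters."""
    a_hat, b_hat = params
    for _ in range(epochs):
        for v, u, v_next in data:
            err = a_hat * v + b_hat * u - v_next
            a_hat -= lr * err * v
            b_hat -= lr * err * u
    return a_hat, b_hat

def mse(data, params):
    """Mean squared one-step prediction error of the surrogate on the given data."""
    a_hat, b_hat = params
    return sum((a_hat * v + b_hat * u - t) ** 2 for v, u, t in data) / len(data)

# Source vehicle: plenty of data. Target vehicle: slightly different dynamics, little data.
source = simulate(a=0.90, b=0.50, n=200, seed=0)
target = simulate(a=0.85, b=0.40, n=20, seed=1)
target_eval = simulate(a=0.85, b=0.40, n=200, seed=2)

source_model = fit(source, (0.0, 0.0), epochs=50)   # train surrogate in the source domain
transferred = fit(target, source_model, epochs=2)   # brief update in the target domain
scratch = fit(target, (0.0, 0.0), epochs=2)         # same budget without transfer

print("transferred surrogate MSE: ", mse(target_eval, transferred))
print("from-scratch surrogate MSE:", mse(target_eval, scratch))
```

In this toy setting both target models see the same 20 samples for the same two passes, so any gap in prediction error comes only from the initialization. In a full MBRL loop, the surrogate would additionally be used to generate rollouts for the policy learner, which this sketch does not attempt.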
| Original language | English |
|---|---|
| Article number | 11424319 |
| Number of pages | 10 |
| Journal | IEEE Transactions on Intelligent Transportation Systems |
| Early online date | 6 Mar 2026 |
| DOIs | |
| Publication status | E-pub ahead of print - 6 Mar 2026 |
Keywords
- Reinforcement learning
- Computational modeling
- Vehicle dynamics
- Transfer learning
- Adaptation models
- Training
- Data models
- Real-time systems
- Predictive models
- Computational efficiency
Fingerprint
Dive into the research topics of 'Transferable Model-Based Reinforcement Learning for Vehicular Platoon Control'. Together they form a unique fingerprint.