Verified probabilistic policies for deep reinforcement learning

Edoardo Bacci, David Parker

Research output: Chapter in Book/Report/Conference proceedingConference contribution

20 Downloads (Pure)

Abstract

Deep reinforcement learning is an increasingly popular technique for synthesising policies to control an agent’s interaction with its environment. There is also growing interest in formally verifying that such policies are correct and execute safely. Progress has been made in this area by building on existing work for verification of deep neural networks and of continuous-state dynamical systems. In this paper, we tackle the problem of verifying probabilistic policies for deep reinforcement learning, which are used to, for example, tackle adversarial environments, break symmetries and manage trade-offs. We propose an abstraction approach, based on interval Markov decision processes, that yields probabilistic guarantees on a policy’s execution, and present techniques to build and solve these models using abstract interpretation, mixed-integer linear programming, entropy-based refinement and probabilistic model checking. We implement our approach and illustrate its effectiveness on a selection of reinforcement learning benchmarks.
Original languageEnglish
Title of host publicationNASA Formal Methods - 14th International Symposium, NFM 2022, Proceedings
Subtitle of host publication14th International Symposium, NFM 2022, Pasadena, CA, USA, May 24–27, 2022, Proceedings
EditorsJyotirmoy V. Deshmukh, Klaus Havelund, Ivan Perez
Place of PublicationCham
PublisherSpringer
Pages193-212
Number of pages20
Volume13260
ISBN (Electronic)978-3-031-06773-0
ISBN (Print)978-3-031-06772-3
DOIs
Publication statusPublished - 20 May 2022
EventNASA Formal Methods 2022 - California Institute of Technology, Pasadena, United States
Duration: 24 May 202227 May 2022
https://nfm2022.caltech.edu/

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume13260 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

ConferenceNASA Formal Methods 2022
Abbreviated titleNFM 2022
Country/TerritoryUnited States
CityPasadena
Period24/05/2227/05/22
Internet address

Bibliographical note

Funding Information:
Acknowledgements. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 834115, FUN2MODEL).

Publisher Copyright:
© 2022, Springer Nature Switzerland AG.

Fingerprint

Dive into the research topics of 'Verified probabilistic policies for deep reinforcement learning'. Together they form a unique fingerprint.

Cite this