Verifying reinforcement learning up to infinity

Edoardo Bacci; Mirco Giacobbe; David Parker

doi:10.24963/ijcai.2021/297

Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Formally verifying that reinforcement learning systems act safely is increasingly important, but existing methods only verify over finite time. This is of limited use for dynamical systems that run indefinitely. We introduce the first method for verifying the time-unbounded safety of neural networks controlling dynamical systems. We develop a novel abstract interpretation method which, by constructing adaptable template-based polyhedra using MILP and interval arithmetic, yields sound (safe and invariant) overapproximations of the reach set. This provides stronger safety guarantees than previous time-bounded methods and shows whether the agent has generalised beyond the length of its training episodes. Our method supports ReLU activation functions and systems with linear, piecewise linear and non-linear dynamics defined with polynomial and transcendental functions. We demonstrate its efficacy on a range of benchmark control problems.
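
To make the idea of a "safe and invariant" overapproximation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a one-dimensional toy system, uses axis-aligned boxes in place of template polyhedra, and uses plain interval arithmetic in place of the MILP step. The controller weights, the dynamics x' = 0.5*x + u, and all function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: boxes stand in for template polyhedra and
# interval arithmetic stands in for MILP. All names are hypothetical.

def affine_bounds(lo, hi, weights, bias):
    """Interval image of x -> W x + b for every x in the box [lo, hi]."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        l = h = b
        for w, a, c in zip(row, lo, hi):
            if w >= 0:   # positive weight keeps the bound order
                l += w * a
                h += w * c
            else:        # negative weight swaps lower and upper bounds
                l += w * c
                h += w * a
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi


def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


# Assumed toy controller: u = -0.5*relu(x) + 0.5*relu(-x), i.e. u = -0.5*x.
W1, B1 = [[1.0], [-1.0]], [0.0, 0.0]
W2, B2 = [[-0.5, 0.5]], [0.0]


def step_bounds(lo, hi):
    """Box enclosing x' = 0.5*x + u(x) for every x in [lo, hi]."""
    h_lo, h_hi = relu_bounds(*affine_bounds(lo, hi, W1, B1))
    u_lo, u_hi = affine_bounds(h_lo, h_hi, W2, B2)
    # Stack state and action, then apply the assumed linear dynamics.
    return affine_bounds(lo + u_lo, hi + u_hi, [[0.5, 1.0]], [0.0])


def is_invariant(lo, hi):
    """True if one closed-loop step maps the box back into itself."""
    new_lo, new_hi = step_bounds(lo, hi)
    return (all(a <= n for a, n in zip(lo, new_lo))
            and all(n <= c for n, c in zip(new_hi, hi)))


print(is_invariant([-1.0], [1.0]))  # box around the origin
print(is_invariant([0.5], [1.0]))   # box that the system leaves
```

If a box is invariant in this sense, contains the initial states, and lies inside the safe region, then every reachable state stays safe for all time, with no bound on the number of steps. Interval arithmetic ignores the dependency between the state and the action, so the enclosure it computes is often too loose to prove invariance; this is one plausible reason to combine it with tighter encodings such as the MILP formulation the abstract mentions.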

Original language	English
Title of host publication	Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Subtitle of host publication	Montreal, 19-27 August 2021
Editors	Zhi-Hua Zhou
Publisher	International Joint Conferences on Artificial Intelligence Organization (IJCAI)
Pages	2154-2160
Number of pages	7
ISBN (Electronic)	9780999241196
DOIs	https://doi.org/10.24963/ijcai.2021/297
Publication status	Published - 27 Aug 2021
Event	30th International Joint Conference on Artificial Intelligence (IJCAI-21) - Duration: 21 Aug 2021 → 26 Aug 2021

Conference

Conference	30th International Joint Conference on Artificial Intelligence (IJCAI-21)
Abbreviated title	IJCAI-21
Period	21/08/21 → 26/08/21

Keywords

Machine Learning: Deep Reinforcement Learning
Multidisciplinary Topics and Applications: Validation and Verification
Robotics: Learning in Robotics

Access to Document

10.24963/ijcai.2021/297
Licence: Other

BacciE2021Verifying
Copyright © 2021, IJCAI
Accepted author manuscript, 1.1 MB
Licence: None (all rights reserved)

Cite this

@inproceedings{2b0675b2f13b4025aeffedd83fe83146,

title = "Verifying reinforcement learning up to infinity",

abstract = "Formally verifying that reinforcement learning systems act safely is increasingly important, but existing methods only verify over finite time. This is of limited use for dynamical systems that run indefinitely. We introduce the first method for verifying the time-unbounded safety of neural networks controlling dynamical systems. We develop a novel abstract interpretation method which, by constructing adaptable template-based polyhedra using MILP and interval arithmetic, yields sound---safe and invariant---overapproximations of the reach set. This provides stronger safety guarantees than previous time-bounded methods and shows whether the agent has generalised beyond the length of its training episodes. Our method supports ReLU activation functions and systems with linear, piecewise linear and non-linear dynamics defined with polynomial and transcendental functions. We demonstrate its efficacy on a range of benchmark control problems.",

keywords = "Machine Learning: Deep Reinforcement Learning, Multidisciplinary Topics and Applications: Validation and Verification, Robotics: Learning in Robotics",

author = "Edoardo Bacci and Mirco Giacobbe and David Parker",

year = "2021",

month = aug,

day = "27",

doi = "10.24963/ijcai.2021/297",

language = "English",

pages = "2154--2160",

editor = "Zhi-Hua Zhou",

booktitle = "Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence",

publisher = "International Joint Conferences on Artificial Intelligence Organization (IJCAI)",

note = "30th International Joint Conference on Artificial Intelligence (IJCAI-21), IJCAI-21 ; Conference date: 21-08-2021 Through 26-08-2021",

}

Bacci, E, Giacobbe, M & Parker, D 2021, Verifying reinforcement learning up to infinity. in Z-H Zhou (ed.), Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence: Montreal, 19-27 August 2021. International Joint Conferences on Artificial Intelligence Organization (IJCAI), pp. 2154-2160, 30th International Joint Conference on Artificial Intelligence (IJCAI-21), 21/08/21. https://doi.org/10.24963/ijcai.2021/297

Verifying reinforcement learning up to infinity. / Bacci, Edoardo; Giacobbe, Mirco; Parker, David.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence: Montreal, 19-27 August 2021. ed. / Zhi-Hua Zhou. International Joint Conferences on Artificial Intelligence Organization (IJCAI), 2021. p. 2154-2160.

TY - GEN

T1 - Verifying reinforcement learning up to infinity

AU - Bacci, Edoardo

AU - Giacobbe, Mirco

AU - Parker, David

PY - 2021/8/27

Y1 - 2021/8/27

AB - Formally verifying that reinforcement learning systems act safely is increasingly important, but existing methods only verify over finite time. This is of limited use for dynamical systems that run indefinitely. We introduce the first method for verifying the time-unbounded safety of neural networks controlling dynamical systems. We develop a novel abstract interpretation method which, by constructing adaptable template-based polyhedra using MILP and interval arithmetic, yields sound---safe and invariant---overapproximations of the reach set. This provides stronger safety guarantees than previous time-bounded methods and shows whether the agent has generalised beyond the length of its training episodes. Our method supports ReLU activation functions and systems with linear, piecewise linear and non-linear dynamics defined with polynomial and transcendental functions. We demonstrate its efficacy on a range of benchmark control problems.

KW - Machine Learning: Deep Reinforcement Learning

KW - Multidisciplinary Topics and Applications: Validation and Verification

KW - Robotics: Learning in Robotics

UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85123421535&partnerID=MN8TOARS

U2 - 10.24963/ijcai.2021/297

DO - 10.24963/ijcai.2021/297

M3 - Conference contribution

SP - 2154

EP - 2160

BT - Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence

A2 - Zhou, Zhi-Hua

PB - International Joint Conferences on Artificial Intelligence Organization (IJCAI)

T2 - 30th International Joint Conference on Artificial Intelligence (IJCAI-21)

Y2 - 21 August 2021 through 26 August 2021

ER -
