Joint Differentiable Optimization and Verification for Certified Reinforcement Learning

  • Additional Information
    • Publication Information:
      ACM, 2023.
    • Publication Date:
      2023
    • Abstract:
      In model-based reinforcement learning for safety-critical control systems, it is important to formally certify system properties (e.g., safety, stability) under the learned controller. However, because existing methods typically apply formal verification only after the controller has been learned, it is sometimes difficult to obtain any certificate, even after many iterations between learning and verification. To address this challenge, we propose a framework that jointly conducts reinforcement learning and formal verification by formulating and solving a novel bilevel optimization problem, which is made differentiable through the gradients of the value function and the certificates. Experiments on a variety of examples demonstrate the significant advantages of our framework over the model-based stochastic value gradient (SVG) method and the model-free proximal policy optimization (PPO) method in finding feasible controllers with barrier functions and Lyapunov functions that ensure system safety and stability.
      This paper was accepted to the International Conference on Cyber-Physical Systems (ICCPS); a rough illustrative sketch of the joint optimization idea follows this record.
    • DOI:
      10.1145/3576841.3585919
    • Rights:
      OPEN
    • Accession Number:
      edsair.doi.dedup.....bfc58e51528ac9a08bff4a6a18292712
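
The abstract describes treating policy learning and certificate search as one differentiable problem rather than alternating between them. Below is a minimal, hypothetical sketch of that general idea, not the authors' code or their bilevel formulation: a linear controller and a quadratic Lyapunov candidate are trained together by penalizing violations of the Lyapunov decrease condition on sampled states. The dynamics A and B, the gain K, the factor L, the margin, and the penalty weight are all illustrative assumptions.

```python
# Hypothetical sketch: jointly optimize a linear controller u = -K x and a
# quadratic Lyapunov candidate V(x) = x^T P x by gradient descent on one loss.
# All constants below are illustrative assumptions, not values from the paper.
import torch

torch.manual_seed(0)

# Assumed discrete-time linear dynamics x' = A x + B u (illustrative only).
A = torch.tensor([[1.0, 0.1], [0.0, 1.0]])
B = torch.tensor([[0.0], [0.1]])

K = torch.zeros(1, 2, requires_grad=True)   # controller gain, u = -K x
L = torch.eye(2, requires_grad=True)        # P = L L^T + eps*I keeps P PSD

def step(x, K):
    # Batched closed-loop step: x' = A x + B u = A x - B K x
    return x @ A.T - (x @ K.T) @ B.T

def V(x, L):
    P = L @ L.T + 1e-3 * torch.eye(2)
    return ((x @ P) * x).sum(dim=1)         # quadratic form x^T P x

opt = torch.optim.Adam([K, L], lr=1e-2)
for it in range(2000):
    x = 4.0 * (torch.rand(256, 2) - 0.5)    # sample states from a box
    x_next = step(x, K)
    # Task term: drive states toward the origin (a crude stand-in for the
    # value-function gradient term the paper uses).
    task = (x_next ** 2).sum(dim=1).mean()
    # Certificate term: penalize violations of the decrease condition
    # V(x') - V(x) <= -margin * ||x||^2 on the sampled states.
    decrease = V(x_next, L) - V(x, L) + 0.1 * (x ** 2).sum(dim=1)
    cert = torch.relu(decrease).mean()
    loss = task + 10.0 * cert               # one joint, differentiable loss
    opt.zero_grad()
    loss.backward()
    opt.step()

print("certificate violation on samples:", cert.item())
```

Note that penalizing violations on sampled states is not formal verification: a genuine certificate requires the decrease condition to hold over the entire region of interest, which is why the paper couples learning with a formal verifier rather than with sampled penalties alone.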