CMU-CS-18-125
Computer Science Department
School of Computer Science, Carnegie Mellon University




Verifiably Safe Autonomy for Cyber-Physical Systems

Nathan Fulton

Ph.D. Thesis

November 2018

CMU-CS-18-125.pdf


Keywords: Cyber-Physical Systems, Hybrid Systems, Autonomous Systems, Formal Verification, Differential Dynamic Logic, Automated Theorem Proving, Reinforcement Learning

Abstract:

This thesis demonstrates that autonomous cyber-physical systems that use machine learning for control are amenable to formal verification.

Cyber-physical systems, such as autonomous vehicles and medical devices, are increasingly common and increasingly autonomous. Designing safe cyber-physical systems is difficult because of the interaction between the discrete dynamics of control software and the continuous dynamics of the vehicle's physical movement. Designing safe autonomous cyber-physical systems is even more difficult because of the interaction between classical controls software and machine learning components. Formal methods capable of reasoning about these hybrid discrete-continuous dynamics can help engineers obtain strong safety guarantees for safety-critical control systems. Several recent successes in applying formal methods to hybrid dynamical systems demonstrate that these tools provide a promising foundation for establishing safety properties of planes, trains, and cars. However, existing theory and tooling do not explain how to obtain formal safety guarantees for systems that use reinforcement learning to discover efficient control policies from data. This gap is important because modern approaches to building cyber-physical systems combine machine learning with classical controls engineering to navigate in open environments.
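
As a hedged illustration of what such a hybrid-systems safety guarantee looks like, consider the following differential dynamic logic (dL) formula; it is a standard textbook-style braking example, not a result quoted from this report. Here p, v, and a are a car's position, velocity, and acceleration, b > 0 is its braking force, and S is a stop point it must not pass:

\[
v \ge 0 \,\wedge\, b > 0 \,\wedge\, p + \tfrac{v^2}{2b} \le S
\;\rightarrow\;
\bigl[\bigl(a := -b;\ \{p' = v,\ v' = a \ \&\ v \ge 0\}\bigr)^{*}\bigr]\; p \le S
\]

The formula asserts that if the car starts inside its braking envelope, then no matter how many times the control loop repeats, braking followed by the continuous motion p' = v, v' = a keeps the car at or before S. Tools such as KeYmaera X prove formulas of this shape by induction over the loop combined with differential invariants; in this example the quantity p + v^2/(2b) stays constant along the braking dynamics, which yields the postcondition.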

This thesis introduces KeYmaera X, a theorem prover for hybrid systems, and uses KeYmaera X to obtain verified safety guarantees for control policies generated by reinforcement learning algorithms. These contributions enable strong safety guarantees for optimized control policies when the underlying environment matches a first-principles model.

This thesis also introduces an approach to providing safety guarantees for learned control policies even when reality deviates from modeling assumptions. The core technical contribution is a new class of algorithms that blend learning and reasoning to update models in response to newly observed dynamics in the environment. When models are updated online, we leverage verification results about the original but incorrect model to ensure a systematic relationship between the optimization objective and the desired safety properties. When models are updated offline, formal verification results are preserved and explainable environmental models are synthesized. These contributions provide verifiable safety guarantees for systems controlled by policies obtained through reinforcement learning, justifying the use of reinforcement learning in safety-critical settings.
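
To make the runtime side of this approach concrete, the following Python sketch shows the general shape of monitored (shielded) action selection: a learned policy proposes an action, and a monitor derived from a verified model either accepts the proposal or substitutes a provably safe fallback. This is a minimal sketch under assumed dynamics, not code from the thesis or from KeYmaera X; the names is_safe, safe_fallback, shielded_action, and the braking and stop-point constants are hypothetical.

    import random

    def is_safe(state, action, dt=0.1, braking=5.0, stop_pos=100.0):
        # Hypothetical monitor: accept an action only if, after one control
        # step, the car could still brake to a stop before stop_pos.
        pos, vel = state
        next_pos = pos + vel * dt
        next_vel = max(0.0, vel + action * dt)
        stopping_distance = next_vel ** 2 / (2.0 * braking)
        return next_pos + stopping_distance <= stop_pos

    def safe_fallback(state, braking=5.0):
        # Fallback controller assumed to be covered by the verified model: brake.
        return -braking

    def shielded_action(policy, state):
        # Take the learned policy's action when the monitor accepts it,
        # otherwise take the provably safe fallback action.
        proposed = policy(state)
        return proposed if is_safe(state, proposed) else safe_fallback(state)

    def reckless_policy(state):
        # Stand-in for a learned policy that sometimes proposes unsafe actions.
        return random.uniform(-5.0, 3.0)

    if __name__ == "__main__":
        state = (90.0, 8.0)  # (position, velocity)
        print(shielded_action(reckless_policy, state))

The design point this sketch is meant to convey is that optimization remains free to improve performance, while the safety argument rests only on the monitor and the fallback, both of which can be justified against a verified hybrid-systems model.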

Thesis Committee:
André Platzer (Chair)
Jeremy Avigad
Goran Frehse (ENSTA ParisTech)
Zico Kolter
Stefan Mitsch

Srinivasan Seshan, Head, Computer Science Department
Tom M. Mitchell, Interim Dean, School of Computer Science


152 pages


