
NSF Org: | ECCS Division of Electrical, Communications and Cyber Systems |
Recipient: | |
Initial Amendment Date: | August 29, 2018 |
Latest Amendment Date: | July 20, 2021 |
Award Number: | 1836819 |
Award Instrument: | Standard Grant |
Program Manager: | Richard Nash, rnash@nsf.gov, (703) 292-5394, ECCS Division of Electrical, Communications and Cyber Systems, ENG Directorate for Engineering |
Start Date: | September 15, 2018 |
End Date: | August 31, 2022 (Estimated) |
Total Intended Award Amount: | $666,254.00 |
Total Awarded Amount to Date: | $682,254.00 |
Funds Obligated to Date: | FY 2021 = $16,000.00 |
History of Investigator: | |
Recipient Sponsored Research Office: | 4333 Brooklyn Ave NE, Seattle, WA, US 98195-1016, (206) 543-4043 |
Sponsor Congressional District: | |
Primary Place of Performance: | 4333 Brooklyn Ave NE, Seattle, WA, US 98195-0001 |
Primary Place of Performance Congressional District: | |
Unique Entity Identifier (UEI): | |
Parent UEI: | |
NSF Program(s): | EPCN-Energy-Power-Ctrl-Netwrks, CPS-Cyber-Physical Systems |
Primary Program Source: | 01002122DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): | |
Program Element Code(s): | |
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.041 |
ABSTRACT
We propose to generalize and certify the performance of reinforcement learning algorithms for control of cyber-physical systems (CPS). Broadly speaking, reinforcement learning applied to physical systems is concerned with making predictions from data to control the system to extremize a performance criterion. The project will particularly focus on developing theory and algorithms applicable to hybrid and multi-agent control systems, that is, systems with continuous and discrete elements and systems with multiple decision-making agents, which are ubiquitous in CPS across spatiotemporal scales and application domains. Reinforcement learning algorithms are not yet mature enough to guarantee performance when applied to control of CPS. In light of these limitations, this project aims to lay the theoretical and computational foundation to certify reinforcement learning algorithms so that they may be deployed in society with high confidence.
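To make the setting above concrete, the following minimal sketch applies tabular value iteration to a toy hybrid system: a state that moves through a discretized continuum but undergoes a discrete jump (a reset) at a boundary, the mixture of continuous and discrete transitions described above. This is an illustration only, not the project's actual algorithms; the state space, dynamics, and reward are invented for the example.

```python
# Toy hybrid system (hypothetical, for illustration): a point moving on
# discretized positions 0..9, with a discrete "reset" jump back to 0 when
# motion would cross the right boundary -- a crude stand-in for the mix of
# continuous flows and discrete transitions found in hybrid control systems.

N_STATES = 10         # discretized positions 0..9
ACTIONS = (-1, +1)    # move left / move right
GOAL = 7              # arriving at this cell yields a reward
GAMMA = 0.9           # discount factor

def step(s, a):
    """One transition: continuous-like motion plus a discrete boundary jump."""
    s_next = s + a
    if s_next >= N_STATES:   # hybrid reset: jump back to the left end
        s_next = 0
    s_next = max(s_next, 0)  # left boundary is absorbing for leftward moves
    reward = 1.0 if s_next == GOAL else 0.0
    return s_next, reward

def q(s, a, V):
    """One-step lookahead value of taking action a in state s."""
    s_next, reward = step(s, a)
    return reward + GAMMA * V[s_next]

# Tabular value iteration: the convergent dynamic-programming scheme that
# reinforcement-learning methods approximate from sampled data.
V = [0.0] * N_STATES
for _ in range(500):
    V_new = [max(q(s, a, V) for a in ACTIONS) for s in range(N_STATES)]
    if max(abs(x - y) for x, y in zip(V_new, V)) < 1e-9:
        V = V_new
        break
    V = V_new

# Greedy policy extracted from the converged value function.
policy = [max(ACTIONS, key=lambda a, s=s: q(s, a, V)) for s in range(N_STATES)]
print(policy[:GOAL])  # → [1, 1, 1, 1, 1, 1, 1]
```

Every state left of the goal learns to move right, despite the non-classical reset dynamics at the boundary; the certification questions studied in the project concern when such convergence and optimality guarantees carry over to learned, sampled-data versions of this computation.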
This project will certify reinforcement learning algorithms that compute optimal control policies in systems with non-classical dynamics and non-classical costs. To achieve this goal, we will generalize convergent algorithms originally designed for purely continuous systems to apply in hybrid control systems whose states undergo a mixture of discrete and continuous transitions. Moreover, we specifically aim to ensure this approach is applicable to societal-scale CPS in which multiple agents, some of which may be humans, interact directly with the CPS. These algorithms will be experimentally validated on three testbeds that represent a range of hybrid and multi-agent phenomena that arise in CPS. The first testbed will test the performance of our algorithms on societal-scale traffic flow networks via simulation. The second testbed will consider heterogeneous teams of aerial and terrestrial mobile robots collaborating with human partners to perform construction, inspection, and maintenance tasks on scale facsimiles of infrastructure like bridges and tunnels. The third testbed will study the closed-loop interaction between individual humans and remote, teleoperated robots that perform dynamic locomotion and manipulation behaviors. The project team will also co-organize an interdisciplinary workshop with technology policy experts, the results of which will form the basis for an interdisciplinary multi-campus graduate-level seminar run by the PIs.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
This project aimed to generalize and certify the performance of reinforcement learning algorithms for control of cyber-physical systems (CPS). In particular, outcomes of the project included new theory, metrics, and methods for certifying algorithms that compute optimal control policies in systems with non-classical dynamics and non-classical costs. Broadly speaking, reinforcement learning applied to physical systems is concerned with making predictions from data to control the system to extremize a performance criterion. In this context, this project particularly focused on developing theory and algorithms applicable to hybrid and multi-agent control systems, that is, systems with continuous and discrete elements and systems with multiple decision-making agents, which are ubiquitous in CPS across spatiotemporal scales and application domains. While reinforcement learning algorithms are arguably not yet mature enough to guarantee performance when applied to control of complex CPS with learning-enabled components, this project made great strides in laying the theoretical and computational foundation to certify reinforcement learning algorithms so that they may be deployed in society with high confidence.
To achieve this goal, the research team generalized convergent algorithms originally designed for purely continuous systems to apply in hybrid control systems whose states undergo a mixture of discrete and continuous transitions. Computationally efficient methods for ensuring additional desiderata such as safety were developed. Moreover, the team developed techniques that targeted societal-scale CPS in which multiple agents, some of which may be humans, interact directly with the CPS. The developed algorithms and theoretical models were experimentally validated on three testbeds that represent a range of hybrid and multi-agent phenomena arising in CPS. The first testbed focuses on the performance of our algorithms on societal-scale traffic flow networks via simulation. The second testbed concerns heterogeneous teams of aerial and terrestrial mobile robots collaborating with human partners to perform construction, inspection, and maintenance tasks on scale facsimiles of infrastructure like bridges and tunnels. The third testbed comprises both simulation and real-world experimental studies focused on the closed-loop interaction between individual humans and learning-based systems. The latter testbed is motivated by CPS such as teleoperated robots that perform dynamic locomotion and manipulation behaviors, human-in-the-loop societal-scale CPS such as intelligent transportation, and multi-agent reinforcement learning broadly. Overall, the project produced a wide range of results, from fundamental new theory on non-smooth dynamical systems and multi-agent interactions, to practically implementable algorithms, to experimental validation in a variety of CPS contexts.
The research team included not just the PIs and graduate students, but a number of undergraduate researchers funded through REUs who contributed significantly to the project, especially the simulation-based testbeds, which are now publicly available for the community to use. Finally, the team disseminated the results to a range of different communities including industry, policy makers, governing bodies, and academia.
Last Modified: 01/14/2023
Modified by: Lillian Ratliff
Please report errors in award information by writing to: awardsearch@nsf.gov.