
NSF Org: |
CNS Division Of Computer and Network Systems |
Recipient: |
|
Initial Amendment Date: | September 15, 2018 |
Latest Amendment Date: | July 14, 2021 |
Award Number: | 1838833 |
Award Instrument: | Continuing Grant |
Program Manager: |
Darleen Fisher
CNS Division Of Computer and Network Systems CSE Directorate for Computer and Information Science and Engineering |
Start Date: | October 1, 2018 |
End Date: | September 30, 2022 (Estimated) |
Total Intended Award Amount: | $2,000,000.00 |
Total Awarded Amount to Date: | $2,074,977.00 |
Funds Obligated to Date: |
FY 2019 = $408,310.00 FY 2021 = $333,333.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: |
1608 4TH ST STE 201 BERKELEY CA US 94710-1749 (510)643-3891 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
673 Soda Hall Berkeley CA US 94720-1776 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): |
Information Technology Researc, Special Projects - CNS, Networking Technology and Syst |
Primary Program Source: |
01001920DB NSF RESEARCH & RELATED ACTIVIT 01002021DB NSF RESEARCH & RELATED ACTIVIT 01002122DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
This project investigates new ways of structuring and securing information by using cryptographically hardened bundles of data, called DataCapsules. The need for a new approach stems from the proliferation of data-driven technology and cyber-physical systems that control physical devices, such as robots and manufacturing machines, and use information from widely disparate sources. Data exposure, breach, or corruption can lead to identity theft, property loss, or (increasingly) bodily harm. Unfortunately, common approaches to protecting information are ad hoc, buggy, and subject to a variety of attacks and failure modes. In contrast, the DataCapsule infrastructure provides a standardized approach to sequencing and securing every bit of information while also including explicit provenance (stating who generated it). DataCapsules may move freely from place to place in the network while retaining their integrity, thereby enabling secure computation at the edge of the network. Further, this project investigates techniques to ease the transition of application writers from current practice to use of the DataCapsule infrastructure. The benefits of standardization around DataCapsules are manifold, including (1) more uniform application of best practices for data security; (2) secure edge computing infrastructures that fluidly interact with authorized entities in the core of the network (cloud); and (3) an opportunity for new networking environments that respect information privacy and security while optimizing for performance and quality of service.
This project explores the use of DataCapsules to improve the security and performance of robotic and machine-learning applications operating in edge computing environments. DataCapsules are secured bundles of information with unique, self-certifying names that are transported over a data-centric 'narrow-waist' infrastructure called the Global Data Plane (GDP). This project investigates the design of DataCapsules as well as an architecture for the GDP that provides flat-address routing from authorized clients to DataCapsules, allowing DataCapsules to be replicated and reside anywhere within the GDP. DataCapsules consist of standardized metadata wrappers anchoring hash-chain-linked histories of transactions labeled by signatures. As universal 'ground-truths' for data storage applications, DataCapsules share some advantages of blockchains, including publicly verifiable integrity. Above the DataCapsule layer, application writers benefit from uniform security while continuing to utilize common storage access patterns, such as filesystems, databases, and key-value stores. The GDP partitions the network into Trust Domains (TDs) to allow clients to reason about the trustworthiness of hardware. This architecture includes overlay switches connected via a tunneling protocol and a scalable location resolution infrastructure. Each TD is responsible for a subset of the DataCapsules and provides data location facilities that serve 'location delegation' certificates (mapping names to network locations) for these DataCapsules. For scalability, this project investigates several name resolution mechanisms, including one based on distributed hash table (DHT) principles. This project also utilizes secure enclave technologies (e.g., Intel SGX) to provide secure computation at the edge of the network.
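The idea of a self-certifying name can be illustrated with a minimal sketch. It assumes, purely for illustration, that a name is the SHA-256 digest of a canonical JSON encoding of the metadata; the field names and the encoding are hypothetical and are not taken from the project's actual format.

```python
import hashlib
import json


def capsule_name(metadata: dict) -> str:
    """Derive a 256-bit name as the SHA-256 digest of canonical metadata.

    Assumption: the real system may hash a different encoding or field set.
    """
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_name(name: str, metadata: dict) -> bool:
    """Recompute the name from the metadata and compare.

    No trusted third party is needed: the name itself commits to the metadata.
    """
    return capsule_name(metadata) == name


# Hypothetical metadata fields, for illustration only.
metadata = {"owner_public_key": "example-key-bytes", "created": "2018-10-01"}
name = capsule_name(metadata)
```

Under this assumption, any change to the metadata yields a different name, so a client that knows the name can detect a substituted metadata wrapper by recomputing the digest.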
By promoting best-practice labeling and secure management of information, the DataCapsule infrastructure promises to lead to an overall reduction in data breaches and safer public and private cyberspace infrastructure. Further, it will allow application writers to trust the security of information at the edge of the network, thus leading to new and better application of data-driven techniques at the network edge while simultaneously improving privacy; this, in turn, will lead to better applications, such as robotics and smart manufacturing. Finally, in addition to educational activities, the project, in collaboration with the University of California, Berkeley's Lawrence Hall of Science, will produce open-access videos to raise awareness of information vulnerability and provenance with youth and the public at large.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
In this project, we developed an infrastructure for securely storing, moving, and manipulating information that is embedded in the network. Information is stored in cryptographically hardened bundles of data, called DataCapsules. DataCapsules operate on top of a data-centric network called the Global Data Plane (GDP). Authorized users may interact with their data from anywhere through secure routing over the GDP. The result is a federated storage system that permits owners of data to have full control over the physical location, level of replication, and degree to which third-party service providers may serve the data. See Figure 1.
An important use case for this infrastructure is what we called "Fog Robotics", namely support for robotic (cyber-physical) activities on the Edge of the network, where infrastructure has less physical security than in the Cloud. Such activities have a particularly compelling need to be able to verify the source of all information that they rely upon, to protect the integrity of this information, and to keep sensitive information private.
To support these needs, each bit of data within a DataCapsule is protected through cryptographic signatures, hashing, and encryption. In principle, if all data were placed in DataCapsules, we would always know where data came from (and, consequently, never use fake data by accident). DataCapsules have a superficial similarity to blockchains and can reside anywhere: in mobile devices, in Edge infrastructure, or in the Cloud. Figure 2 shows the contents of a typical DataCapsule. Conceptually, DataCapsules are like "shipping containers" on top of the GDP "shipping infrastructure." DataCapsules can be updated with an append-only interface and retain their identity when they are moved through the network.
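As a rough illustration of the append-only, hash-linked structure described above, the following sketch chains records with SHA-256 and checks the chain against a known head digest. It is a simplification under stated assumptions: the `Record` layout and the `AppendOnlyLog` and `verify_chain` names are hypothetical, and the signatures and encryption used by real DataCapsules are omitted.

```python
import hashlib
from dataclasses import dataclass, field

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


@dataclass(frozen=True)
class Record:
    seq: int
    payload: bytes
    prev_hash: str  # hex digest of the preceding record

    def digest(self) -> str:
        h = hashlib.sha256()
        h.update(self.seq.to_bytes(8, "big"))
        h.update(bytes.fromhex(self.prev_hash))
        h.update(self.payload)
        return h.hexdigest()


@dataclass
class AppendOnlyLog:
    records: list = field(default_factory=list)

    def head(self) -> str:
        """Digest of the newest record; it commits to the whole history."""
        return self.records[-1].digest() if self.records else GENESIS

    def append(self, payload: bytes) -> Record:
        rec = Record(len(self.records), payload, self.head())
        self.records.append(rec)
        return rec


def verify_chain(records, expected_head: str) -> bool:
    """Walk the chain from the start and compare against a trusted head."""
    prev = GENESIS
    for i, rec in enumerate(records):
        if rec.seq != i or rec.prev_hash != prev:
            return False
        prev = rec.digest()
    return prev == expected_head
```

A reader who trusts only the head digest (which, in a real system, would be covered by a signature) can fetch the records from any replica and detect modification, reordering, or truncation of the history.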
One of the important outcomes of this project is that we designed multiple prototypes of the GDP infrastructure, the last of which was a high-performance overlay network written in Rust. Fundamentally, the GDP is a data-centric network that allows direct, secure routing of packets to destinations. During the course of the project, we developed the cryptographic underpinnings of the GDP, including mechanisms for data owners to selectively delegate permission to serve their data to service providers of their choice. Figure 3 shows components of the GDP.
While DataCapsules can be utilized directly by applications, the challenge to application writers is two-fold:
(1) The DataCapsule interface is optimized for security, privacy, and mobility, rather than usability. Application writers would ideally prefer to use more normal storage "patterns" such as key-value stores, file systems, and databases to interact with their data.
(2) Application writers would like to be able to launch their computations into Trusted Execution Environments (TEEs) running within the infrastructure (e.g. the cloud or edge) to protect the contents and integrity of data at all times. Such execution is tricky in general.
One of the game-changing outcomes of this project was the introduction of the idea of Paranoid Stateful Lambdas (PSLs). The PSL model extends the so-called "serverless" execution model by (1) making it possible to easily launch parallel computations into TEEs in the infrastructure and (2) providing stateful execution through a secure key-value store backed by DataCapsules.
We produced a PSL prototype, shown in Figure 4, that provides secure, attested execution of client code in a parallel environment with an eventually-consistent key-value storage model and release-consistent locking. This prototype makes use of a new certification technology, developed in collaboration with VMware, that makes it easy to work with multiple trusted computing architectures. As part of PSL, we implemented a parallel key-value store that we call "CapsuleDB". Figure 5 shows a version of the data format utilized by our "CapsuleDB" prototype. CapsuleDB stores its data in replicated DataCapsules, allowing it to recover from crashes and network partitions.
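The general pattern of a key-value store that recovers its state from an append-only log can be sketched as follows. This is not CapsuleDB's actual data format or API: the `LogBackedKV` class, its JSON operation records, and the use of a plain Python list in place of a replicated DataCapsule are all assumptions made for illustration.

```python
import json


class LogBackedKV:
    """Key-value store whose only durable state is an append-only log."""

    def __init__(self, log=None):
        # The list stands in for a DataCapsule; entries are never rewritten.
        self.log = log if log is not None else []
        self.index = {}
        # Recovery: rebuild the in-memory index by replaying every entry.
        for entry in self.log:
            self._apply(json.loads(entry))

    def _apply(self, op: dict) -> None:
        if op["op"] == "put":
            self.index[op["key"]] = op["value"]
        elif op["op"] == "del":
            self.index.pop(op["key"], None)

    def _write(self, op: dict) -> None:
        # Append to the log first, then update the volatile index.
        self.log.append(json.dumps(op, sort_keys=True))
        self._apply(op)

    def put(self, key: str, value) -> None:
        self._write({"op": "put", "key": key, "value": value})

    def delete(self, key: str) -> None:
        self._write({"op": "del", "key": key})

    def get(self, key: str, default=None):
        return self.index.get(key, default)


store = LogBackedKV()
store.put("robot/1/pose", [0.0, 1.5])
store.put("robot/2/pose", [3.0, 4.0])
store.delete("robot/2/pose")

# Simulate a crash: discard the index and rebuild from a copy of the log.
recovered = LogBackedKV(list(store.log))
```

Because every mutation is appended before the index changes, a fresh instance handed any surviving replica of the log reconstructs the same key-value state.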
Another important outcome of this project is our interaction with the robotics community. This collaboration produced multiple versions of FogROS -- a plugin to the popular Robot Operating System (ROS/ROSv2). Through various versions of FogROS, we have made it easy for ROS users to build geographically distributed robotic systems (i.e., with components spanning multiple Edge environments and the Cloud). Figure 6 illustrates this idea, in which every robot and network service has a unique presence on the GDP represented by a securely-generated 256-bit destination name. Significantly, the second prototype, FogROS2, was accepted as part of the main-line release of ROSv2 -- thus ensuring wide dissemination of the artifacts from this project.
Finally, the outreach portion of the project involved a very successful collaboration with the Lawrence Hall of Science to develop a new animated musical series for kids called "Tuff Pupil". This series is in the style of Schoolhouse Rock and is about global STEM issues. The episodes feature music by the Grammy-winning Alphabet Rockers, and the first three episodes discuss data provenance, privacy, and protection. The series has received very good feedback, and additional episodes are in the planning stages.
Last Modified: 02/23/2023
Modified by: John D Kubiatowicz