
NSF Org: | CNS Division Of Computer and Network Systems |
Recipient: |
|
Initial Amendment Date: | March 26, 2018 |
Latest Amendment Date: | April 13, 2022 |
Award Number: | 1750024 |
Award Instrument: | Continuing Grant |
Program Manager: | Phillip Regalia; pregalia@nsf.gov; (703)292-2981; CNS Division Of Computer and Network Systems; CSE Directorate for Computer and Information Science and Engineering |
Start Date: | April 1, 2018 |
End Date: | March 31, 2024 (Estimated) |
Total Intended Award Amount: | $528,077.00 |
Total Awarded Amount to Date: | $528,077.00 |
Funds Obligated to Date: | FY 2019 = $101,897.00; FY 2020 = $105,468.00; FY 2021 = $109,188.00; FY 2022 = $113,056.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: |
506 S WRIGHT ST URBANA IL US 61801-3620 (217)333-2187 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
IL US 61820-7473 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Secure & Trustworthy Cyberspace |
Primary Program Source: | 01001920DB NSF RESEARCH & RELATED ACTIVIT; 01002021DB NSF RESEARCH & RELATED ACTIVIT; 01002122DB NSF RESEARCH & RELATED ACTIVIT; 01002223DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
System intrusions have become more subtle and complex. Attackers now covertly observe and probe systems for prolonged periods before launching devastating attacks. In such an environment, it has grown prohibitively difficult for system administrators to identify suspicious events, correlate these events into an attack pattern, and determine an appropriate response. Data provenance is a method of modeling a system's execution in the form of a causal relationship graph, allowing investigators to trace the ancestry of data objects and identify relationships between seemingly independent events. The goal of the proposed work is to develop techniques that enable the use of data provenance as an expressive and efficient monitoring tool in large distributed systems. These mechanisms will enable unprecedented capability to reason about system events, centrally monitor activities within data centers, and express fine-grained enforcement of security properties based on the historical flow of data. Research and software artifacts will be made available to the broader community through the Linux provenance web site.
The proposed work will examine central challenges related to expressivity and scalability that currently prevent the further proliferation of provenance-based auditing techniques. To address the semantic gap that has traditionally prevented system-layer auditing from explaining higher-level application behaviors, this project pursues the design of universal provenance mechanisms that leverage binary analysis to transparently identify siloed application-layer logging activities, extract their semantics, and graft the information onto a causal relationship graph that encodes the entire system's execution. Grammar induction techniques will be leveraged to overcome the tremendous storage burden of provenance and provide a scalable central monitoring framework for data centers. After enriching system-layer auditing and enabling the efficient communication of suspicious activities via provenance traces, data provenance will be integrated into enforcement mechanisms to address critical security challenges including regulatory compliance, information flow control, and fault attribution. Advancing the state of the art in provenance-based tracing and enforcement should establish a new baseline for reasoning about the flow of data in today's complex computing systems.
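As a rough illustration of the causal relationship graph described in the abstract, the sketch below records "was derived from" edges between system objects and walks them backwards to recover the ancestry of a data object. It is a minimal toy under stated assumptions, not the project's software: the `ProvenanceGraph` class, its methods, and all node names (processes, files, the socket address) are hypothetical and chosen only for the example.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Toy causal graph: an edge src -> dst means dst was influenced by src."""

    def __init__(self):
        # Maps each node to the set of nodes it was directly derived from.
        self._parents = defaultdict(set)

    def add_edge(self, src, dst):
        """Record that dst causally depends on src."""
        self._parents[dst].add(src)

    def ancestors(self, node):
        """Return every node that node transitively depends on."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            # .get() avoids creating empty entries for root nodes.
            for parent in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical event trace: a download that eventually touches a sensitive file,
# plus an unrelated cron job that should not appear in the ancestry.
graph = ProvenanceGraph()
graph.add_edge("socket:203.0.113.5:443", "process:curl")
graph.add_edge("process:curl", "file:/tmp/payload")
graph.add_edge("file:/tmp/payload", "process:sh")
graph.add_edge("process:sh", "file:/etc/passwd")
graph.add_edge("process:cron", "file:/var/log/cron.log")

for origin in sorted(graph.ancestors("file:/etc/passwd")):
    print(origin)
```

The backward traversal links the modified file to the remote socket even though the events involve different processes, while the unrelated cron activity is excluded. This is the sense in which a provenance graph relates seemingly independent events.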
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Data provenance is a promising cybersecurity technique that represents a series of computer events as a causal relationship graph that describes the history of interactions between computer objects like programs, files, and network connections. This award supported research that improved the precision with which provenance analysis techniques can describe suspicious events in computers, while also improving their speed and efficiency. The first major outcome of this program was the ability to transparently capture and represent events from different layers of a computing system in a single unified provenance graph ("Universal Provenance Framework" figure). The second major outcome was a set of techniques to efficiently monitor the provenance of many thousands of computers performing highly redundant tasks in data centers. By representing provenance graphs as formal grammars, we combined similar graphs and removed redundancies to create a single global representation of data center activity. This global representation still identified suspicious attack behaviors ("Winnower Graph" figure). The final major outcome was the application of data provenance analysis to access control, regulatory compliance, and attribution in modern computers. One example of this outcome was a set of methods for demonstrating compliance with privacy regulations, such as the EU’s General Data Protection Regulation (GDPR), in provenance form ("GDPR Provenance" figure). Results from these outcomes were published in academic venues, and software artifacts were made available to the broader security community. Over the course of the program, this award supported the studies of 8 PhD, 2 Master's, and 4 undergraduate students at the University of Illinois at Urbana-Champaign.
Last Modified: 08/07/2024
Modified by: Adam Bates