
NSF Org: |
CNS Division Of Computer and Network Systems |
Recipient: |
|
Initial Amendment Date: | September 9, 2013 |
Latest Amendment Date: | September 9, 2013 |
Award Number: | 1318955 |
Award Instrument: | Standard Grant |
Program Manager: |
Nina Amla
namla@nsf.gov (703)292-7991 CNS Division Of Computer and Network Systems CSE Directorate for Computer and Information Science and Engineering |
Start Date: | October 1, 2013 |
End Date: | September 30, 2017 (Estimated) |
Total Intended Award Amount: | $496,066.00 |
Total Awarded Amount to Date: | $496,066.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
845 N PARK AVE RM 538 TUCSON AZ US 85721 (520)626-6000 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
AZ US 85721-0077 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Secure & Trustworthy Cyberspace |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
The digital provenance of a digital object gives a history of its life cycle including its creation, update, and access. It thus provides meta-level information about the sequence of events that lead up to the current version of the object, as well as its chain of custody. Such provenance information can be used for a variety of purposes, such as identifying the origins of a document, assessing the quality or reliability of data, and detecting undesirable actions such as forgery or unauthorized alteration of data. However, all of these practical uses of provenance information presuppose that the provenance system is secure, i.e. that provenance data is collected, processed, and stored in a manner that ensures its confidentiality and integrity. Without such guarantees, users can get an incorrect impression of document authenticity, potentially with significant real-world consequences.
This project investigates the design of secure provenance collection systems where the collected meta-data can be relied upon even in light of realistic insider attack models. Security, however, is not sufficient; a practical system must also be efficient even when large amounts of fine-grained provenance data need to be stored and processed. The project is aimed at addressing both issues through the following three objectives. (1) Techniques for continuously updatable software tamperproofing to ensure the integrity of the system itself. (2) Techniques for robust, continuous marking, collusion-free, text fingerprinting to mitigate document leakage. (3) Techniques for anonymous storage on untrusted storage servers to allow for efficient storage and access of fine-grained provenance data.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
In this project we investigated the design of a system that employs fine-grained provenance capture, text watermarking, and software protection techniques to protect office documents against leakage, spoofing, and unauthorized modification. Our attack model assumed a powerful insider who edits documents on their local machine, who at times is disconnected from the network, and who has complete control over the hardware, operating system, and application binaries. Such a scenario is often termed the Man-At-The-End (MATE) scenario.
Concretely, the project designed and built a secure provenance system for digital documents (available for download from http://collberg.github.io/provenance/downloads.html), specifically text documents for the OpenOffice office suite. The project addressed an important aspect of secure provenance, namely how to securely trace the chain of custody of a document that leaks outside the system. To protect documents from such leakage attacks the project developed algorithms for robust, continuous marking, collusion-free, text fingerprinting.
In addition, the project found that advancements in software protection techniques (techniques to protect against a MATE attack) are necessary to build and evaluate secure provenance systems. Specifically, the research focused on building the software protection defense techniques necessary to support secure provenance systems, techniques for evaluating such defenses, benchmarks to be used in evaluations, and techniques for attacking protected systems.
Much of this work addressed symbolic analysis, since symbolic/concolic analysis is an important component of many reverse engineering (i.e. attack) techniques. As such, it is used both by adversaries wanting to understand or manipulate a program under their control and by computer security experts wanting to analyze malware samples. Our work shed new light on both sides of this coin: we designed reverse engineering algorithms based on symbolic analysis that are generic, i.e. not targeted at undoing a particular obfuscating transformation, and we designed new obfuscation algorithms that make symbolic analysis less effective. These insights will be important in the future cat-and-mouse game between software protection attackers and defenders.
With respect to software protection defense techniques, the project designed new obfuscation transformations that change program behavior in subtle yet acceptable ways, and we showed that they can render symbolic-execution based deobfuscation analysis ineffective in practice. The project also developed a generic dynamic obfuscation algorithm as part of our Tigress code obfuscation tool (available for download from tigress.cs.arizona.edu).
With respect to software protection evaluation techniques, the project proposed a method for evaluating the artificiality of protected code. The goal is to measure the stealth of the protected code, i.e. the degree to which protected code can be distinguished from unprotected code. The results show that static obfuscating transformations have little effect on artificiality. However, dynamic obfuscating transformations, or a technique that inserts junk code fragments into the program, tend to increase the artificiality. Based on these results, we developed a defense method which aims to improve the stealth of obfuscated code. The project also addressed the problem of characterizing the resilience of code obfuscation transformations against automated symbolic execution attacks. The results show that many existing obfuscation transformations, such as virtualization, stand little chance of withstanding symbolic execution based deobfuscation. Finally, the project developed a general framework for choosing the most relevant software features to estimate the effort of automated attacks. We showed that features such as the number of community structures in the graph-representation of symbolic path-constraints are far more relevant for predicting deobfuscation time than other features generally used to measure the potency of control-flow obfuscation (e.g. cyclomatic complexity).
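For readers unfamiliar with cyclomatic complexity, the conventional potency metric mentioned above, the following is a minimal sketch of how it is usually computed from a control-flow graph using the standard formula M = E - N + 2P (edges, nodes, connected components). The graph encoding and the function name cyclomatic_complexity are assumptions made for illustration; this is not the project's evaluation framework and it does not compute the path-constraint community-structure feature.

```python
def cyclomatic_complexity(cfg, components=1):
    """Return M = E - N + 2P for a control-flow graph.

    cfg maps each basic-block name to a list of successor block names
    (a hypothetical encoding chosen for this sketch).
    """
    nodes = set(cfg)
    for successors in cfg.values():
        nodes.update(successors)
    edges = sum(len(successors) for successors in cfg.values())
    return edges - len(nodes) + 2 * components


# An if/else diamond: one decision point.
if_else = {
    "entry": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}

# A while loop: the condition block either enters the body or leaves.
while_loop = {
    "entry": ["cond"],
    "cond": ["body", "exit"],
    "body": ["cond"],
    "exit": [],
}

print(cyclomatic_complexity(if_else))
print(cyclomatic_complexity(while_loop))
```

The metric counts only the shape of the control-flow graph, which is consistent with the report's observation that it can be a weak predictor of how long a symbolic-execution attack takes.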
With respect to software protection attack techniques, the project investigated a number of generic attacks against software protections. Specifically, the research developed a generic technique for deobfuscation based on semantics-based simplification of runtime execution traces. The approach is based on the view that a program defines a transformation from input values to output values, and that deobfuscation can be considered as the process of identifying and simplifying this transformation. The project also showed how symbolic execution of obfuscated code can be made more precise by an application of architecture-aware bit-precise taint analysis. Finally, the project developed an approach for identifying self-checksumming anti-tampering defenses.
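The view of a program as a transformation from input values to output values can be illustrated with a small sketch. It assumes a toy obfuscation based on the well-known identity x + y = (x XOR y) + 2 * (x AND y); the function names and the 8-bit word size are hypothetical choices for illustration, and the exhaustive comparison stands in for, but is not, the trace-simplification technique developed in the project.

```python
MASK = 0xFF  # 8-bit values, so every input pair can be checked


def obfuscated_add(x, y):
    # Mixed boolean-arithmetic encoding of addition.
    return ((x ^ y) + 2 * (x & y)) & MASK


def simplified_add(x, y):
    # The simpler program a deobfuscator would hope to recover.
    return (x + y) & MASK


def same_transformation(f, g, bits=8):
    """Check that f and g map every input pair to the same output."""
    limit = 1 << bits
    return all(f(x, y) == g(x, y) for x in range(limit) for y in range(limit))


print(same_transformation(obfuscated_add, simplified_add))
```

Both functions define the same input-to-output mapping, so replacing the first with the second is a valid simplification even though their instruction sequences differ.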
Last Modified: 01/04/2018
Modified by: Christian S Collberg