Award Abstract # 1318955
TWC TTP: Small: Mitigating Insider Attacks in Provenance Systems

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: UNIVERSITY OF ARIZONA
Initial Amendment Date: September 9, 2013
Latest Amendment Date: September 9, 2013
Award Number: 1318955
Award Instrument: Standard Grant
Program Manager: Nina Amla
namla@nsf.gov
(703)292-7991
CNS Division Of Computer and Network Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2013
End Date: September 30, 2017 (Estimated)
Total Intended Award Amount: $496,066.00
Total Awarded Amount to Date: $496,066.00
Funds Obligated to Date: FY 2013 = $496,066.00
History of Investigator:
  • Christian Collberg (Principal Investigator)
    collberg@cs.arizona.edu
  • Saumya Debray (Co-Principal Investigator)
  • Sudha Ram (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Arizona
845 N PARK AVE RM 538
TUCSON
AZ  US  85721
(520)626-6000
Sponsor Congressional District: 07
Primary Place of Performance: University of Arizona
AZ  US  85721-0077
Primary Place of Performance Congressional District: 07
Unique Entity Identifier (UEI): ED44Y3W6P7B9
Parent UEI:
NSF Program(s): Secure & Trustworthy Cyberspace
Primary Program Source: 01001314DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7434, 7923
Program Element Code(s): 806000
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The digital provenance of a digital object gives a history of its life cycle, including its creation, update, and access. It thus provides meta-level information about the sequence of events that led up to the current version of the object, as well as its chain of custody. Such provenance information can be used for a variety of purposes, such as identifying the origins of a document, assessing the quality or reliability of data, and detecting undesirable actions such as forgery or unauthorized alteration of data. However, all of these practical uses of provenance information presuppose that the provenance system is secure, i.e., that provenance data is collected, processed, and stored in a manner that ensures its confidentiality and integrity. Without such guarantees, users can get an incorrect impression of document authenticity, potentially with significant real-world consequences.
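To make the integrity requirement concrete, the following is a minimal sketch of a tamper-evident provenance log built as a hash chain. It is an illustration only, not the design used by this project: the record fields (actor, action, detail), the function names, and the all-zero genesis value are assumptions chosen for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


def record_hash(body):
    """SHA-256 over a canonical JSON encoding of a record body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_event(log, actor, action, detail):
    """Append an event whose hash also covers the previous record's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    body = {"actor": actor, "action": action, "detail": detail, "prev": prev}
    log.append({**body, "hash": record_hash(body)})


def verify(log):
    """Return True only if every record is intact and correctly linked."""
    prev = GENESIS
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev"] != prev or record["hash"] != record_hash(body):
            return False
        prev = record["hash"]
    return True


# Hypothetical life cycle of one document: creation, update, access.
log = []
append_event(log, "alice", "create", "draft v1")
append_event(log, "bob", "update", "edited section 2")
append_event(log, "carol", "access", "read-only view")
print(verify(log))  # True

log[1]["actor"] = "mallory"  # alter the recorded chain of custody
print(verify(log))  # False
```

A hash chain by itself only detects edits to individual records. An insider who controls the machine can recompute every hash after the change, so the code that writes and checks the log must itself be protected.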

This project investigates the design of secure provenance collection systems where the collected meta-data can be relied upon even under realistic insider attack models. Security, however, is not sufficient; a practical system must also be efficient even when large amounts of fine-grained provenance data need to be stored and processed. The project is aimed at addressing both issues through the following three objectives. (1) Techniques for continuously updatable software tamperproofing to ensure the integrity of the system itself. (2) Techniques for robust, continuous marking, collusion-free, text fingerprinting to mitigate document leakage. (3) Techniques for anonymous storage on untrusted storage servers to allow for efficient storage and access of fine-grained provenance data.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


(Showing: 1 - 10 of 11)
Clark Taylor and Christian Collberg "A Tool for Teaching Reverse Engineering" 2016 USENIX Workshop on Advances in Security Education (ASE '16), 2016 https://www.usenix.org/conference/ase16/workshop-program/presentation/taylor
Babak Yadegari and Saumya Debray "Bit-Level Taint Analysis" Proceedings of the 14th IEEE International Working Conference on Source Code Analysis and Manipulation, 2014
Babak Yadegari and Saumya Debray "Control Dependencies in Interpretive Systems" Proceedings of the 17th International Conference on Runtime Verification (RV 2017), 2017
Babak Yadegari and Saumya Debray "Symbolic Execution of Obfuscated Code" Proceedings of the 22nd ACM Conference on Computer and Communications Security (CCS), 2015
Babak Yadegari, Jon Stephens, and Saumya Debray "Analysis of Exception-Based Control Transfers" ACM Conference on Code and Data Security and Privacy (CODASPY), 2017
Sebastian Banescu, Christian Collberg, and Alexander Pretschner "Predicting the Resilience of Obfuscated Code Against Symbolic Execution Attacks via Machine Learning" USENIX SECURITY, 2017 https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/banescu
Sebastian Banescu, Vijay Ganesh, Zack Newsham, Alexander Pretschner, and Christian Collberg "Code Obfuscation Against Symbolic Execution Attacks" Annual Computer Security Applications Conference (ACSAC), 2016 10.1145/2991079.2991114
Yuichiro Kanzaki, Akito Monden, and Christian Collberg "Code Artificiality: A Metric for the Code Stealth Based on an N-Gram Model" 2015 IEEE/ACM 1st International Workshop on Software Protection, 2015 10.1109/SPRO.2015.14
Yuichiro Kanzaki, Clark Thomborson, Akito Monden, and Christian Collberg "Pinpointing and Hiding Surprising Fragments in an Obfuscated Program" In Proc. of the 5th Program Protection and Reverse Engineering Workshop (PPREW-5), 2015 10.1145/2843859.2843862

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

In this project we investigated the design of a system that employs fine-grained provenance capture, text watermarking, and software protection techniques to protect office documents against leakage, spoofing, and unauthorized modification. Our attack model assumed a powerful insider who edits documents on their local machine, who at times is disconnected from the network, and who is in complete control over hardware, operating system, and application binaries. Such a scenario is often termed the Man-At-The-End (MATE) scenario.


Concretely, the project designed and built a secure provenance system for digital documents (available for download from http://collberg.github.io/provenance/downloads.html), specifically text documents for the OpenOffice office suite. The project addressed an important aspect of secure provenance, namely how to securely trace the chain of custody of a document that leaks outside the system. To protect documents from such leakage attacks, the project developed algorithms for robust, continuous marking, collusion-free, text fingerprinting.


In addition, the project found that advancements in software protection techniques (techniques to protect against a MATE attack) are necessary to build and evaluate secure provenance systems. Specifically, the research focused on building the software protection defense techniques necessary to support secure provenance systems, techniques for evaluating such defenses, benchmarks to be used in evaluations, and techniques for attacking protected systems.


Much of this work addressed symbolic analysis, since symbolic/concolic analysis is an important component of many reverse engineering (i.e., attack) techniques. As such, it is used both by adversaries wanting to understand or manipulate a program under their control and by computer security experts wanting to analyze malware samples. Our work shed new light on both sides of this coin: we designed reverse engineering algorithms based on symbolic analysis that are generic, i.e., not targeted at undoing a particular obfuscating transformation, and we designed new obfuscation algorithms that make symbolic analysis less effective. These insights will be important in the future cat-and-mouse game between software protection attackers and defenders.


With respect to software protection defense techniques, the project designed new obfuscation transformations that change program behavior in subtle yet acceptable ways, and we showed that they can render symbolic-execution-based deobfuscation analysis ineffective in practice. The project also developed a generic dynamic obfuscation algorithm as part of our Tigress code obfuscation tool (available for download from tigress.cs.arizona.edu).

With respect to software protection evaluation techniques, the project proposed a method for evaluating the artificiality of protected code. The goal is to measure the stealth of the protected code, i.e., the degree to which protected code can be distinguished from unprotected code. The results show that static obfuscating transformations have little effect on artificiality. However, dynamic obfuscating transformations, or a technique that inserts junk code fragments into the program, tend to increase artificiality. Based on these results, we developed a defense method that aims to improve the stealth of obfuscated code. The project also addressed the problem of characterizing the resilience of code obfuscation transformations against automated symbolic execution attacks. The results show that many existing obfuscation transformations, such as virtualization, stand little chance of withstanding symbolic-execution-based deobfuscation. Finally, the project developed a general framework for choosing the most relevant software features to estimate the effort of automated attacks. We showed that features such as the number of community structures in the graph representation of symbolic path constraints are far more relevant for predicting deobfuscation time than other features generally used to measure the potency of control-flow obfuscation (e.g., cyclomatic complexity).
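The idea of scoring artificiality with an N-gram model (named in the publication list above) can be sketched as follows. This is a simplified illustration under stated assumptions, not the metric as defined in the paper: it uses a bigram model with add-one smoothing over opcode tokens, and the function names and toy instruction sequences are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sequences):
    """Count context tokens and bigrams over tokenized reference code."""
    contexts, bigrams = Counter(), Counter()
    vocab = set()
    for seq in sequences:
        tokens = ["<s>"] + list(seq)  # "<s>" marks the start of a sequence
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    # One extra vocabulary slot reserves probability mass for unseen tokens.
    return contexts, bigrams, len(vocab) + 1


def artificiality(seq, model):
    """Average negative log2 probability per token; higher means more surprising."""
    contexts, bigrams, vocab_size = model
    if not seq:
        return 0.0
    tokens = ["<s>"] + list(seq)
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one (Laplace) smoothing keeps unseen bigrams at non-zero probability.
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        total -= math.log2(p)
    return total / (len(tokens) - 1)


# Toy "unprotected" corpus: opcode sequences resembling ordinary compiled code.
reference = [
    ["push", "mov", "sub", "mov", "call", "leave", "ret"],
    ["push", "mov", "mov", "add", "pop", "ret"],
]
model = train_bigram(reference)

plain = ["push", "mov", "call", "leave", "ret"]
junk = ["push", "xor", "xor", "rol", "xor", "ret"]  # inserted junk-like fragment

print(artificiality(plain, model) < artificiality(junk, model))  # True
```

Under this sketch, a fragment whose instruction patterns rarely occur in the reference corpus scores higher, which matches the intuition that inserted junk code stands out from unprotected code.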

With respect to software protection attack techniques, the project investigated a number of generic attacks against software protections. Specifically, the research developed a generic technique for deobfuscation based on semantics-based simplification of runtime execution traces. The approach is based on the view that a program defines a transformation from input values to output values, and that deobfuscation can be considered as the process of identifying and simplifying this transformation. The project also showed how symbolic execution of obfuscated code can be made more precise by an application of architecture-aware bit-precise taint analysis. Finally, the project developed an approach for identifying self-checksumming anti-tampering defenses.
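The benefit of bit-precise taint tracking mentioned above can be illustrated with a small sketch. The propagation rules below are standard simplified rules for 8-bit values and are assumptions made for the example; they do not model the architecture-aware aspects of the project's analysis, and the names (Tainted, t_and, t_or, t_xor, t_shl, const) are hypothetical.

```python
from dataclasses import dataclass

WIDTH = 8
MASK = (1 << WIDTH) - 1


@dataclass(frozen=True)
class Tainted:
    value: int  # concrete value observed at runtime
    taint: int  # bit i set means bit i of value depends on tainted input


def const(v):
    """A constant carries no taint."""
    return Tainted(v & MASK, 0)


def t_and(a, b):
    # A tainted input bit reaches the result unless the other operand's
    # bit is an untainted 0, which forces the result bit to 0.
    taint = (a.taint & (b.taint | b.value)) | (b.taint & (a.taint | a.value))
    return Tainted(a.value & b.value & MASK, taint & MASK)


def t_or(a, b):
    # Dual rule: an untainted 1 in the other operand forces the result bit to 1.
    taint = (a.taint & (b.taint | ~b.value)) | (b.taint & (a.taint | ~a.value))
    return Tainted((a.value | b.value) & MASK, taint & MASK)


def t_xor(a, b):
    # XOR never masks an input bit, so taint is the union of both masks.
    return Tainted((a.value ^ b.value) & MASK, (a.taint | b.taint) & MASK)


def t_shl(a, n):
    # Shifting by a constant moves taint bits along with value bits.
    return Tainted((a.value << n) & MASK, (a.taint << n) & MASK)


x = Tainted(0b10110101, 0xFF)        # fully tainted input byte
low = t_and(x, const(0x0F))          # keep the low nibble
shifted = t_shl(low, 4)              # move it to the high nibble
cleared = t_and(shifted, const(0x0F))  # mask the high nibble away again

print(format(low.taint, "08b"))      # 00001111
print(format(shifted.taint, "08b"))  # 11110000
print(format(cleared.taint, "08b"))  # 00000000
```

A byte-level tracker would mark `cleared` as tainted because it was derived from `x`, even though its value is always 0. Tracking taint per bit avoids that over-approximation, which is the kind of precision gain that helps when obfuscated code mixes input bits with constants.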

Last Modified: 01/04/2018
Modified by: Christian S Collberg

Please report errors in award information by writing to: awardsearch@nsf.gov.
