
NSF Org: |
CNS Division Of Computer and Network Systems |
Recipient: |
|
Initial Amendment Date: | August 14, 2012 |
Latest Amendment Date: | August 14, 2012 |
Award Number: | 1228765 |
Award Instrument: | Standard Grant |
Program Manager: |
Nina Amla
namla@nsf.gov (703)292-7991 CNS Division Of Computer and Network Systems CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 1, 2012 |
End Date: | August 31, 2017 (Estimated) |
Total Intended Award Amount: | $397,708.00 |
Total Awarded Amount to Date: | $397,708.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
105 JESSUP HALL IOWA CITY IA US 52242-1316 (319)335-2123 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
14 MacLean Hall Iowa City IA US 52242-1419 |
Primary Place of
Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Secure &Trustworthy Cyberspace |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
The security of our software is critical for consumer confidence, the protection of privacy and valuable intellectual property, and of course national security. Because of our society's increased reliance on software, security breaches can lead to serious personal or corporate losses, and endanger the privacy, liberties, and even the lives of individuals. As threats to software security have become more sophisticated, so too have the techniques and analyses developed to improve it. Symbolic execution has emerged as a fundamental tool for security applications. Its main idea is to run a program using symbolic instead of concrete values: a set of symbols are assigned to the program inputs, and the outputs are expressed as a set of "verification conditions", logical formulas over the input symbols. A number of successful security analyses use symbolic execution and similar methods to recast security questions about programs as constraint satisfaction problems in some formal logic. Automatic reasoners for that logic can then be used to solve those problems. In the last few years, solvers based on Satisfiability Modulo Theories (SMT) techniques have become a natural choice in such approaches to security because of their superior performance and automation compared to more traditional theorem provers and their greater generality with respect to ad-hoc tools or propositional satisfiability solvers.
This collaborative project brings together experts in security and in SMT to pursue two complementary research goals: (i) harness the full power of SMT solvers to improve current security tools based on symbolic analysis; and (ii) design and develop new techniques to address the needs of anticipated future security applications. Specific activities addressing these goals include: collecting challenge benchmark problems from existing security analyses and developing targeted SMT optimizations for these benchmarks; developing appropriate security abstractions in the SMT language used to express security verification conditions; developing logical theories and algorithms for reasoning about character strings in such verification conditions; exposing a general framework for extending the verification condition language; and developing techniques for computing symbolic solution sets for SMT constraints. These activities are expected to (i) significantly increase the flexibility, performance, and reasoning capabilities of SMT solvers in support of security applications; (ii) improve the performance and scalability of current security analyses by leveraging the reasoning power of SMT solvers; and (iii) provide a foundation for new, more powerful, and more expressive security analyses. Overall, this project will help create more scalable and expressive security applications which could have a considerable impact on society as they enable the production of software much more resistant to security attacks.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
Note:
When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external
site maintained by the publisher. Some full text articles may not yet be available without a
charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from
this site.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The security of our software is critical for consumer confidence, protection of privacy, protection of valuable intellectual property, and national security. As threats to software security have become more sophisticated, so too have the techniques developed to ensure security. A fundamental technique is symbolic execution which enables several security analyses such as automatic exploit generation, malware analysis, and input filtering. In this technique, a program is executed with symbolic instead of concrete inputs, making it is possible to analyze many different program executions at the same time by using as reasoning backend a solver for symbolic constraints. Backend solvers based on Satisfiability Modulo Theories (SMT) methods are a natural choice in symbolic execution-based approaches to security because of: (i) their superior performance and automation compared to more traditional solvers; and (ii) their greater generality with respect to ad-hoc tools and propositional solvers.
In this project, we developed several new or improved subsolvers for SMT that are useful for security analyses, such as one for the theory of strings, and integrated them in the open-source SMT solver CVC4. In parallel with that, we developed and implemented a novel symbolic execution technique that leverages these new capabilities. The technique, which we dubbed Veritesting, increases the scalability of (dynamic) symbolic execution by combining it with (static) program verification. Its main idea is to verify only selected parts of a program and do so only in a specific dynamic context, the one currently being symbolically executed. This results in a dramatic reduction in the number of the generated constraints for the SMT solver with respect to traditional symbolic execution, which in turn yields a significant improvement in its scalability. Veritesting enabled during this project the discovery of over a thousand of security bugs in widely used Linux programs, which we reported to the Linux community. Additionally, we investigated the use of symbolic execution for the analysis of Python programs, with a focus on exploiting to the new string reasoning capabilities of CVC4. Building on previous work by others, we developed PyEx, an open-source symbolic execution tool with increased robustness and test coverage, and used it to study the discovery of security vulnerability in open source Python code, much of which includes string operations.
Of the new subsolvers for CVC4, the first specializes on the theory of bitvectors. This theory is at the heart of most program analyses as it can be used to reason about binary code at the level of machine operations. Our new solver combines two approaches (called lazy and eager) to achieve maximum performance. The implementation in CVC4 has ranked near the top in the annual SMT solver competitions for the last few years and is being used in security applications in industry (for example, at Google). The second subsolver is for a rich theory of character strings. String manipulation is at the heart of several high-profile security attacks, including SQL injection and cross-site scripting attacks. The subsolver can be used to automatically reason about code that might be vulnerable to such attacks. Thanks to it, CVC4 can solve a very large majority (> 93%) of benchmarks in a large suite consisting of problems generated from PyEx and other security tools. Finally, we developed a theory solver for finite sets with cardinality constraints and extended it to reason about finite relations as well. Sets and relations are a key mathematical building block in many abstractions and reasoning systems. By providing a native theory solver for them that can be combined with other theories (including bitvectors and strings), we made it is possible to model and reason about a wide variety of operations and abstractions.
CVC4, which is available online at http://cvc4.cs.stanford.edu, is being used in a large number of academic and industrial projects and applications, including security applications. The new capabilities developed in this project will directly benefit those efforts.
Last Modified: 12/09/2017
Modified by: Cesare Tinelli
Please report errors in award information by writing to: awardsearch@nsf.gov.