
NSF Org: CCF Division of Computing and Communication Foundations
Recipient:
Initial Amendment Date: July 5, 2018
Latest Amendment Date: June 2, 2020
Award Number: 1840934
Award Instrument: Standard Grant
Program Manager: Sol Greenspan, CCF Division of Computing and Communication Foundations, CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2018
End Date: September 30, 2022 (Estimated)
Total Intended Award Amount: $229,087.00
Total Awarded Amount to Date: $253,087.00
Funds Obligated to Date: FY 2019 = $16,000.00; FY 2020 = $8,000.00
History of Investigator:
Recipient Sponsored Research Office: 4000 CENTRAL FLORIDA BLVD, ORLANDO, FL US 32816-8005, (407)823-0387
Sponsor Congressional District:
Primary Place of Performance: Orlando, FL US 32816-8005
Primary Place of Performance Congressional District:
Unique Entity Identifier (UEI):
Parent UEI:
NSF Program(s): Software & Hardware Foundation
Primary Program Source: 01001920DB NSF RESEARCH & RELATED ACTIVIT; 01002021DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):
Program Element Code(s):
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Highly-configurable systems, e.g., the Linux kernel, form our most critical infrastructure, underpinning everything from high-performance computing clusters to IoT devices. Keeping these systems secure and reliable with automated tools is essential. However, tool support is lacking for such systems because of the complexity and scale of their configurability. This leaves some of the most critical software with some of the least tool support. The problem is that most software tools are not variability-aware; that is, they do not account for the many configurations of the software. Serious defects, including null pointer errors and buffer overflows, can and do appear in specific configurations, making them hard to find without accounting for variability. The goal of this project is to advance the state of the art for systems development and debugging, resulting in more secure and less error-prone systems, benefiting the millions who rely on highly-configurable software infrastructure.
To address these challenges, this project aims to develop the infrastructure, analysis techniques, and language support that software developers currently lack for debugging and maintaining configurable software systems written in C-family languages. The first part of the project is to develop a front-end infrastructure that captures the sources of variability in such systems in a new intermediate representation. Such reusable infrastructure is crucial to the development of state-of-the-art analyses. The second part seeks to create variability-aware versions of static analyses and propose new inter-procedural analyses that enable tradeoffs between scalability and precision. While static analysis has proven useful for detecting bugs, accounting for configurations increases the complexity of analysis. Systematic extensions to bug detection algorithms based on these new analyses can target previously obscured bugs. Since the C preprocessor has long been recognized as a source of problems, the third part of this project is to develop new language extensions to C, supplanting preprocessor usage and enabling compiler support for variability specifications. Translators to the new language based on our front-end analysis infrastructure will enable existing software to benefit from the new language. The PIs on this project will mentor graduate students and are committed to promoting female and under-represented minority participation. Artifacts developed in this project will be used in courses to introduce students to state-of-the-art software tool development.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Tool support is lacking for highly-configurable systems, such as the Linux kernel, because of the complexity and scale of their configurability. The problem is that most software tools do not account for the many configurations of the software; that is, they are variability-oblivious. However, bugs such as null pointer errors and buffer overflows can appear in arbitrary configurations, making them hard to find with existing tools. The main goal of this project is to advance the state of the art in program analysis for highly-configurable C systems software. The project includes the development of an analysis infrastructure that enables variability bug-finding. Achieving this requires overcoming the precision and scalability challenges of the underlying analysis algorithms.
We introduced a new framework for generating benchmarks of variability-aware bugs. This framework simulates variability-aware analysis using off-the-shelf bug detectors by running them on samples of configurations. Unlike prior techniques for developing variability bug datasets, our approach finds bugs known to be discoverable by state-of-the-art bug finding tools. Therefore, the results are applicable to evaluating the variability-aware analysis that we developed under this project.
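The sampling idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the option names, the toy detector, and the uniform sampling strategy are hypothetical and are not taken from the project's framework. It only shows the general shape of running a single-configuration bug detector over sampled configurations and recording which configurations expose each warning.

```python
import itertools
import random
from collections import defaultdict

# Hypothetical configuration options for a toy program (not from the project).
OPTIONS = ["CONFIG_A", "CONFIG_B", "CONFIG_C"]


def toy_bug_detector(config):
    """Stand-in for an off-the-shelf, single-configuration bug detector.

    It reports a warning only when CONFIG_A is on and CONFIG_B is off,
    mimicking a bug that exists in some configurations but not in others.
    """
    warnings = []
    if config["CONFIG_A"] and not config["CONFIG_B"]:
        warnings.append("null dereference in init()")
    return warnings


def sample_configurations(options, n_samples, seed=0):
    """Draw up to n_samples distinct configurations uniformly at random."""
    all_configs = [
        dict(zip(options, bits))
        for bits in itertools.product([False, True], repeat=len(options))
    ]
    rng = random.Random(seed)
    return rng.sample(all_configs, min(n_samples, len(all_configs)))


def simulate_variability_aware_analysis(options, n_samples, seed=0):
    """Run the detector on each sampled configuration.

    Returns a dict mapping each warning to the list of configurations
    (as sorted tuples of enabled options) in which it was reported.
    """
    found = defaultdict(list)
    for config in sample_configurations(options, n_samples, seed):
        enabled = tuple(sorted(name for name, on in config.items() if on))
        for warning in toy_bug_detector(config):
            found[warning].append(enabled)
    return dict(found)


if __name__ == "__main__":
    # Sampling all 8 configurations of the toy program.
    for warning, configs in simulate_variability_aware_analysis(OPTIONS, 8).items():
        print(warning, "->", sorted(configs))
```

Because every reported warning comes from a real run of the detector on a concrete configuration, each entry is by construction something an existing tool can find, which is the property the paragraph above highlights.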
For the front-end of our variability-aware analysis development, we designed and implemented a scalable desugaring transformation, SugarC, that translates unpreprocessed C to pure C. This closes the gap between existing variability-oblivious and variability-aware analyses by converting configurable C code into pure C. The variability remains encoded in C, which can be analyzed by existing analyses. To evaluate support for desugaring C constructs, we created a new benchmark called DesugarBench, showing that SugarC supports many more constructs than prior work, especially the kinds of challenging cases found in real-world C.
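To make the idea of desugaring concrete, here is a toy sketch that rewrites statement-level preprocessor conditionals into ordinary C `if` blocks. It is not SugarC and does not reflect how SugarC is implemented; the function name `desugar` and the example statements are hypothetical. It assumes each configuration macro can be modeled as an ordinary C variable of the same name, and it handles only `#ifdef`, `#ifndef`, `#else`, and `#endif` wrapped around whole statements.

```python
def desugar(lines):
    """Rewrite statement-level #ifdef/#ifndef/#else/#endif into C `if` blocks.

    Toy only: each macro NAME becomes a reference to a C variable NAME that
    is assumed to be defined elsewhere. Raises ValueError when the
    conditionals are unbalanced.
    """
    out = []
    depth = 0  # current nesting depth of open conditionals
    for raw in lines:
        line = raw.strip()
        indent = "    " * depth
        if line.startswith("#ifdef "):
            name = line.split()[1]
            out.append(f"{indent}if ({name}) {{")
            depth += 1
        elif line.startswith("#ifndef "):
            name = line.split()[1]
            out.append(f"{indent}if (!{name}) {{")
            depth += 1
        elif line == "#else":
            if depth == 0:
                raise ValueError("#else without matching #ifdef")
            out.append("    " * (depth - 1) + "} else {")
        elif line == "#endif":
            if depth == 0:
                raise ValueError("#endif without matching #ifdef")
            depth -= 1
            out.append("    " * depth + "}")
        else:
            out.append(indent + line)
    if depth != 0:
        raise ValueError("missing #endif")
    return out


if __name__ == "__main__":
    source = [
        "#ifdef CONFIG_DEBUG",
        "log_state(dev);",
        "#else",
        "dev->quiet = 1;",
        "#endif",
        "start(dev);",
    ]
    print("\n".join(desugar(source)))
```

After this rewrite both branches are present in a single C program, so a tool that knows nothing about the preprocessor can still see the code of every configuration. Real code is much harder than this sketch suggests: conditionals can surround declarations, types, or fragments of a statement, none of which the toy handles.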
We developed two parallel efforts for exploring variability-aware analysis on top of the front-end. First, VarAlyzer is an end-to-end variability-aware dataflow analysis. VarAlyzer was evaluated by conducting a typestate analysis that checks for correct API usage. Second, Sugarlyzer is an extensible framework that enables the integration of many existing variability-oblivious tools. To demonstrate the extensibility of Sugarlyzer, we integrated three popular static analyzers (Clang, Infer, and Phasar) into Sugarlyzer. Each integration requires only dozens of lines of code. We ran all three integrated tools on a variability bug dataset, VBDb, to assess Sugarlyzer's correctness and effectiveness. The results show that Sugarlyzer is able to detect the vast majority of variability bugs present in the dataset (78/105), compared to a baseline that exhaustively tests all configurations.
Our analysis of macro usage has yielded formal properties describing the transformability of macro usage. Specifically, it categorizes macros by their semantic equivalence to C functions. We used these properties to determine which macros are transformable without any change to the interface of the macro. Furthermore, we implemented these properties in a lightweight static analysis that informs our transformer, which rewrites equivalent macros to C functions, thereby removing the preprocessor usage.
We used our build system constraint extraction algorithms to generate valid build configurations for the Linux kernel. This process found build errors, which we patched and reported to the Linux developers. We have publicly released a new version of the constraint analysis tool and had one build system patch accepted into the Linux kernel source.
We used ConfigFuzz to transform six common fuzzing targets and carried out the evaluation using the AFL and AFL++ fuzzers. ConfigFuzz shows better performance than two baseline setups in four targets, while on the other two targets, ConfigFuzz does not always outperform the baselines. We analyze the target programs' source code and the options fuzzed by ConfigFuzz to reason about the fuzzing performance. We also show that parameterizing ConfigFuzz to fuzz configurations with up to 2 options often leads to higher code coverage than up to 1 option, while fuzzing many more options with ConfigFuzz may decrease the performance.
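As a rough illustration of what "up to k options" means for the size of the configuration space, the sketch below enumerates and counts the option sets a fuzzer could choose from. The option names are hypothetical, and the report does not say how ConfigFuzz actually selects or encodes options, so this only counts combinations and is not a model of the tool.

```python
from itertools import combinations
from math import comb


def option_subsets(options, max_options):
    """Yield every set of at most max_options options enabled together."""
    for k in range(max_options + 1):
        for subset in combinations(options, k):
            yield subset


def count_option_subsets(n_options, max_options):
    """Number of option sets of size 0..max_options out of n_options."""
    return sum(comb(n_options, k) for k in range(max_options + 1))


if __name__ == "__main__":
    # Hypothetical command-line flags of a fuzzing target.
    flags = ["-a", "-b", "-c", "-d"]
    print(list(option_subsets(flags, 1)))
    # For a hypothetical target with 20 options:
    for k in (1, 2, 5):
        print(k, count_option_subsets(20, k))
```

For a hypothetical target with 20 options, the count grows from 21 sets at up to 1 option, to 211 at up to 2, to 21,700 at up to 5. A larger space spreads a fixed fuzzing budget over more configurations, which is one plausible factor (an assumption here, not a finding of the report) behind the observed drop in performance when many more options are fuzzed.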
The research results from this grant were published at several competitive conferences and journals including, but not limited to, the International Conference on Software Engineering (ICSE), Transactions on Software Engineering and Methodology (TOSEM), and the Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). Nine peer-reviewed publications funded in part by this grant were produced. Artifacts produced during this grant have been disseminated publicly in repositories containing the software source code, experimental scripts, and resulting data.
Ten graduate and six undergraduate students were funded in part by this research, including five from groups underrepresented in computing. Two of the graduate students were doctoral students who graduated with dissertation work funded in part by the grant, and one master's student graduated during the grant; all three now work in the software industry. Five graduate courses received content based on grant research, including a graduate course on configurable software, an independent study on configurable software, and courses on operating systems and compilers that incorporate material related to the grant.
Last Modified: 01/15/2023
Modified by: Paul Gazzillo