Award Abstract # 0968667
CAREER: Architectural Support for Automated Software Debugging

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: NORTH CAROLINA STATE UNIVERSITY
Initial Amendment Date: December 23, 2009
Latest Amendment Date: February 1, 2012
Award Number: 0968667
Award Instrument: Continuing Grant
Program Manager: Almadena Chtchelkanova
achtchel@nsf.gov
(703)292-7498
CCF Division of Computing and Communication Foundations
CSE Directorate for Computer and Information Science and Engineering
Start Date: August 17, 2009
End Date: June 30, 2014 (Estimated)
Total Intended Award Amount: $388,229.00
Total Awarded Amount to Date: $388,229.00
Funds Obligated to Date: FY 2008 = $147,007.00
FY 2010 = $82,990.00
FY 2011 = $78,732.00
FY 2012 = $79,500.00
History of Investigator:
  • Huiyang Zhou (Principal Investigator)
    hzhou@ncsu.edu
Recipient Sponsored Research Office: North Carolina State University
2601 WOLF VILLAGE WAY
RALEIGH
NC  US  27695-0001
(919)515-2444
Sponsor Congressional District: 02
Primary Place of Performance: North Carolina State University
2601 WOLF VILLAGE WAY
RALEIGH
NC  US  27695-0001
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): U3NVH931QJJ3
Parent UEI: U3NVH931QJJ3
NSF Program(s): Information Technology Researc,
COMPUTING PROCESSES & ARTIFACT
Primary Program Source: 01000809DB NSF RESEARCH & RELATED ACTIVIT
01001011DB NSF RESEARCH & RELATED ACTIVIT
01001112DB NSF RESEARCH & RELATED ACTIVIT
01001213DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 9216, 9218, HPCC
Program Element Code(s): 164000, 735200
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Given their ever-increasing complexity, modern software systems are plagued with software defects, commonly known as bugs. Software developers usually spend a significant amount of effort locating the defects after a program failure is observed. Because on-chip resources were limited in the past, traditional architectural support for debugging was restricted to a basic set of primitive functions such as breakpoints and watchpoints. With advances in semiconductor technology, the resource constraint is less of a concern, and much more powerful architectural support can now be implemented to ease software debugging. This research develops novel software-hardware integrated approaches to automated debugging. The aim is a computer that can automatically pinpoint the faulty code in either sequential or parallel programs and potentially generate a fix for the defect.

Previous work on architectural support for debugging mainly focused on a single aspect of debugging, such as faithfully reproducing program failures or detecting potential bugs. In comparison, this research introduces novel architectural support for three stages: bug detection, which reports potential bugs; bug isolation, which finds the relevant bugs based on the cause-effect relationship between the potential bugs and the program failure; and bug validation, which generates quick fixes to the isolated bugs. Together these stages form a complete process of automated debugging. The research targets bugs in both sequential and parallel programs. For parallel programs, it investigates thread interaction under the transactional memory programming model and develops novel automated debugging schemes for concurrency bugs. The research also includes prototyping the novel architectural support to evaluate its effectiveness with real-world applications.
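
The detection, isolation, and validation stages can be pictured with a small software-only sketch in Python. This is a toy model under stated assumptions, not the architectural mechanism developed in the project: the four-step program, the seeded bug, the value-range check used as the detector, the dependence-based slice used for isolation, and the clamp-and-rerun check used for validation are all hypothetical choices made for illustration.

```python
# Toy program: each step is (name, input names, function); "x" is the external input.
# All names and the seeded bug are hypothetical.
PROGRAM = [
    ("scale",    ("x",),      lambda x: x * 2),
    ("offset",   ("x",),      lambda x: x - 3 if x <= 10 else 3 - x),  # seeded bug on the rare path
    ("checksum", ("scale",),  lambda s: s + 1),   # does not feed the failing step
    ("index",    ("offset",), lambda o: o + 1),
]

def run(x, overrides=None):
    """Execute the steps in order; an override forces a step's result."""
    env = {"x": x}
    for name, inputs, fn in PROGRAM:
        if overrides and name in overrides:
            env[name] = overrides[name]
        else:
            env[name] = fn(*(env[i] for i in inputs))
    return env

def failed(env):
    """Failure symptom assumed here: a negative index."""
    return env["index"] < 0

def learn_ranges(passing_inputs):
    """Record the (min, max) value seen for every variable in passing runs."""
    ranges = {}
    for x in passing_inputs:
        for name, value in run(x).items():
            lo, hi = ranges.get(name, (value, value))
            ranges[name] = (min(lo, value), max(hi, value))
    return ranges

def detect(env, ranges):
    """Detection: report every step whose result falls outside its learned range."""
    return [name for name, _, _ in PROGRAM
            if not ranges[name][0] <= env[name] <= ranges[name][1]]

def backward_slice(target):
    """Steps that the target transitively depends on (the target itself excluded)."""
    producers = {name: inputs for name, inputs, _ in PROGRAM}
    seen, stack = set(), list(producers[target])
    while stack:
        name = stack.pop()
        if name in producers and name not in seen:
            seen.add(name)
            stack.extend(producers[name])
    return seen

def isolate(anomalies, target):
    """Isolation: keep only the anomalies that can influence the failing step."""
    relevant = backward_slice(target)
    return [a for a in anomalies if a in relevant]

def validate(x, candidates, ranges):
    """Validation: clamp each candidate into its learned range and re-run."""
    confirmed = []
    for name in candidates:
        lo, hi = ranges[name]
        patched = min(max(run(x)[name], lo), hi)
        if not failed(run(x, {name: patched})):
            confirmed.append(name)
    return confirmed

ranges = learn_ranges(range(3, 11))          # passing runs: x = 3..10
failing_env = run(12)                        # failing run
anomalies = detect(failing_env, ranges)
isolated = isolate(anomalies, "index")
confirmed = validate(12, isolated, ranges)
print(anomalies, isolated, confirmed)
```

In this toy run, every step looks anomalous in the failing execution, but only `offset` lies on the dependence chain of the failing `index`, and forcing it back into its learned range makes the failure disappear.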

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

M. Dimitrov and H. Zhou "Combining Local and Global History for High Performance Data Prefetching" Journal of Instruction-Level Parallelism, v.13, 2011
J. Kong, O. Aciçmez, J.-P. Seifert, and H. Zhou "Architecting Against Software Cache-based Side Channel Attacks" IEEE Transactions on Computers, 2013, p.1-14
S. Gupta, P. Xiang, Y. Yang, and H. Zhou "Locality principle revisited: A probability-based quantitative approach" Journal of Parallel and Distributed Computing, 2013
Y. Yang and H. Zhou "The Implementation of a High Performance GPGPU Compiler" International Journal of Parallel Programming, 2013, p.1-13

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project focused on integrated hardware and software support for automated software debugging. The research covers software bugs in both sequential and parallel programs. Besides bugs affecting the correctness of program execution, performance bugs and security bugs have also been studied. A number of approaches have been designed and evaluated. Some software developed in the project has been released as open source code, including an automated debugging tool for sequential bugs, a tool to analyze concurrency bugs, and a compiler for optimizing GPGPU (general-purpose computation on graphics processing units) programs. A total of 11 graduate students participated in the project; 6 of them completed their Ph.D. degrees and 2 completed their M.S. theses. Numerous papers have been published in premier venues for computer architecture and compiler research. Two new courses have been developed based on the research findings of the project.

On correctness bugs in sequential programs, an anomaly-based automated debugging framework is developed. Run-time anomalies are monitored and used as predictions of potential bugs. The predicted bugs are then isolated based on their cause-effect relationship to the final incorrect execution result or to the events leading to a hang or crash. The isolated bug predictions are further validated by altering the anomalous results and then re-examining the program outputs. On concurrency bugs, non-determinism makes them hard to reason about. Architectural support is designed to report time-ordered event traces to programmers. In essence, it serves as a black box that records the time-ordered function call traces right before program failures. The project's study of concurrency bugs in large commercial software confirms the effectiveness of this time-ordered event trace.
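
The black-box idea can be approximated in software as a bounded buffer that keeps only the most recent function-call events in a single global order. The Python sketch below is an illustration under assumptions: the `FlightRecorder` class, its capacity, and the function names in the example are hypothetical, and a software lock with a counter stands in for whatever ordering the hardware support provides.

```python
from collections import deque
from itertools import count
import threading

class FlightRecorder:
    """Keeps only the most recent `capacity` call events, in global time order."""

    def __init__(self, capacity):
        self._events = deque(maxlen=capacity)   # oldest events fall off automatically
        self._clock = count()                   # logical timestamp shared by all threads
        self._lock = threading.Lock()           # serializes timestamping and appending

    def record(self, thread_id, function):
        with self._lock:
            self._events.append((next(self._clock), thread_id, function))

    def dump(self):
        """Return the surviving events, oldest first (what a programmer would inspect)."""
        with self._lock:
            return list(self._events)

# Hypothetical interleaving of two threads shortly before a failure.
recorder = FlightRecorder(capacity=4)
for thread_id, function in [
    (1, "open_log"), (2, "open_log"), (1, "write_entry"),
    (2, "close_log"), (1, "write_entry"), (1, "flush"),
]:
    recorder.record(thread_id, function)

trace = recorder.dump()
for timestamp, thread_id, function in trace:
    print(timestamp, thread_id, function)
```

Because the buffer is bounded, only the last few calls before the failure survive, which is the window in which an interleaving such as thread 2 closing the log between two writes by thread 1 would show up.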

On security bugs, this project focuses on mitigating architectural side-channel attacks, cache-based attacks in particular. The root causes of cache-based attacks are analyzed and identified. Three software-hardware integrated designs are developed and evaluated to show their effectiveness in enhancing security and their associated performance impacts.

On performance bugs, an empirical study is performed on a wide range of open-source GPGPU programs to identify six common program patterns that may lead to suboptimal performance. For each pattern, a diagnosis and a fix are developed and evaluated. To reduce programmer effort, the project also produced various automated compiler techniques and architectural enhancements, which are a key result. These techniques span multi-core processors, many-core GPU processors, and heterogeneous computing platforms.

The research findings from this project are published in conference papers, journal papers, and theses/dissertations. Open source code is also distributed to help reproduce the experimental results and reveal the implementation details.

Last Modified: 10/07/2014
Modified by: Huiyang Zhou

Please report errors in award information by writing to: awardsearch@nsf.gov.
