Award Abstract # 0963996
SHF: Medium: Programmable Monitoring Framework for Multicore Systems

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: REGENTS OF THE UNIVERSITY OF CALIFORNIA AT RIVERSIDE
Initial Amendment Date: July 22, 2010
Latest Amendment Date: June 20, 2012
Award Number: 0963996
Award Instrument: Continuing Grant
Program Manager: Sol Greenspan
sgreensp@nsf.gov
 (703)292-7841
CCF
 Division of Computing and Communication Foundations
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2010
End Date: August 31, 2014 (Estimated)
Total Intended Award Amount: $576,840.00
Total Awarded Amount to Date: $734,040.00
Funds Obligated to Date: FY 2010 = $363,000.00
FY 2011 = $346,840.00
FY 2012 = $24,200.00
History of Investigator:
  • Rajiv Gupta (Principal Investigator)
    gupta@cs.ucr.edu
  • Iulian Neamtiu (Co-Principal Investigator)
Recipient Sponsored Research Office: University of California-Riverside
200 UNIVERSITY OFC BUILDING
RIVERSIDE
CA  US  92521-0001
(951)827-5535
Sponsor Congressional District: 39
Primary Place of Performance: University of California-Riverside
200 UNIVERSITY OFC BUILDING
RIVERSIDE
CA  US  92521-0001
Primary Place of Performance Congressional District: 39
Unique Entity Identifier (UEI): MR5QC5FCAVH5
Parent UEI:
NSF Program(s): COMPILERS,
Software & Hardware Foundation,
COMPUTER ARCHITECTURE,
PROGRAMMING LANGUAGES
Primary Program Source: 01001011DB NSF RESEARCH & RELATED ACTIVIT
01001112DB NSF RESEARCH & RELATED ACTIVIT
01001213DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7944, 7941, 7924, 9218, 9251, HPCC
Program Element Code(s): 732900, 779800, 794100, 794300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The advent of multicore processors has introduced new opportunities for achieving
increased software performance, reliability, security, and availability. However,
powerful dynamic execution monitoring capabilities are required to realize these
opportunities. This project addresses the challenges of developing a Dynamic
Binary Translation based monitoring framework for parallel applications running on
multicore systems. The programmability of the framework will enable realization of
benefits in achieving enhanced performance, reliability, security, and availability.

Some of the instrumentation code required in the context of parallel applications
must be executed by a core in response to events that involve other cores. In particular,
events relevant to many performance, reliability, and security related tasks correspond
to the manifestation of interprocessor data dependences due to updates of shared
memory locations by multiple cores. Based upon this observation, programmable
architectural mechanisms will be provided that not only enable the detection of
interprocessor dependence events but also enable the triggering of the execution of
application-specific monitoring code. The project will then employ these mechanisms
to improve performance via speculative parallelism, enable debugging via a
novel strategy of execution suppression, improve reliability via an approach that
allows applications to automatically recover from failures, provide security via
dynamic detection of mutating viruses, and improve software availability via dynamic updates.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


C. Lin, V. Nagarajan, and R. Gupta, "Efficient Sequential Consistency Using Conditional Fences," International Journal of Parallel Programming, v.40, 2012, p.84. DOI: 10.1007/s10766-011-0176-3
M. Feng, C. Lin, and R. Gupta, "PLDS: Partitioning Linked Data Structures for Parallelism," ACM Transactions on Architecture and Code Optimization, v.8, 2012, p.1. DOI: 10.1145/2086696.2086717

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

With the widespread use of computing devices and software in critical tasks, high software reliability and performance are paramount. To ensure that software is running reliably, it is important to monitor its execution to identify faulty behavior and then correct it. Also, since deployed software must handle a wide range of situations, it is important to monitor a program's input and dependence characteristics during execution and adapt it to deliver high performance. This research project has delivered techniques that enable efficient runtime monitoring of software, tools for debugging parallel programs to improve their reliability, and runtime techniques for improving the performance of complex real-world applications.

1. Efficient Runtime Monitoring.
Continuous monitoring of running software can result in high runtime overhead, since a significant portion of the compute cycles can be taken up by monitoring activities. In this work we designed lightweight hardware support that can be programmed to perform a wide range of monitoring tasks with minimal runtime overhead. The effectiveness of programmable hardware was demonstrated by using it to perform the monitoring needed for software reliability (e.g., record-and-replay) and software performance (e.g., speculative parallel execution). An important contribution of this work is that it considers the relationship between memory models and runtime monitoring of memory accesses. We have solved the long-standing problem of efficiently supporting the sequential consistency memory model, which makes the task of producing reliable software much more manageable.
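The core idea of dependence-triggered monitoring can be illustrated in software. The sketch below, a purely illustrative analogue of the hardware mechanism described above (all class and callback names are hypothetical, not from the project), detects when one thread reads a shared location last written by a different thread and invokes a registered monitoring callback at that point:

```python
import threading

class MonitoredMemory:
    """Illustrative software analogue of the proposed hardware mechanism:
    detect inter-thread read-after-write dependences on shared locations
    and trigger a registered monitoring callback. All names here are
    hypothetical."""

    def __init__(self, on_dependence):
        self._data = {}
        self._last_writer = {}            # address -> id of last writing thread
        self._lock = threading.Lock()
        self._on_dependence = on_dependence

    def write(self, addr, value):
        with self._lock:
            self._data[addr] = value
            self._last_writer[addr] = threading.get_ident()

    def read(self, addr):
        with self._lock:
            writer = self._last_writer.get(addr)
            if writer is not None and writer != threading.get_ident():
                # interprocessor dependence manifested: run monitoring code
                self._on_dependence(addr, writer, threading.get_ident())
            return self._data.get(addr)

# One thread writes a shared location; a read from another thread
# then triggers the monitoring callback.
events = []
mem = MonitoredMemory(lambda addr, writer, reader: events.append(addr))

t = threading.Thread(target=mem.write, args=("x", 42))
t.start()
t.join()
value = mem.read("x")                     # main-thread read fires the event
```

In the actual project this detection happens in hardware with no per-access software cost; the sketch only shows the triggering discipline, not how low overhead is achieved.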

2. Debugging Tools for Parallel Software.
Existing debugging tools provide little guidance to programmers in locating the source of a bug. We have developed a new tool, DrDebug, for efficiently debugging multithreaded programs. By providing several new commands, we make the task of examining and analyzing the state of a running program much easier. The insights gained by using these commands help the user track down the root cause of faulty behavior and improve program understanding, allowing the programmer to modify the program to eliminate the faulty behavior. DrDebug works for parallel programs and can efficiently monitor long program runs. Additional support for replaying parts of a program's execution is provided so that the user can efficiently explore and understand program behavior. The effectiveness of DrDebug was demonstrated by monitoring and debugging real-world programs containing bugs.
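Replay support of the kind described above rests on re-imposing a recorded thread schedule. The following minimal sketch (not DrDebug's actual mechanism; the class and method names are hypothetical) forces threads to enter a critical section in a previously recorded order, so a buggy interleaving can be reproduced deterministically:

```python
import threading

class OrderReplayer:
    """Hypothetical sketch of replay support: given a recorded order of
    critical-section entries, force threads to re-enter in exactly that
    order so the original interleaving is reproduced."""

    def __init__(self, recorded_order):
        self._order = recorded_order
        self._next = 0
        self._cv = threading.Condition()

    def enter(self, name, action):
        with self._cv:
            # block until it is this thread's recorded turn
            self._cv.wait_for(lambda: self._order[self._next] == name)
            action()                      # run the critical section
            self._next += 1
            self._cv.notify_all()

# Replay a schedule recorded as: "A" entered before "B".
replayer = OrderReplayer(["A", "B"])
output = []
threads = [
    threading.Thread(target=replayer.enter, args=(n, lambda n=n: output.append(n)))
    for n in ("B", "A")                   # deliberately start in the "wrong" order
]
for t in threads:
    t.start()
for t in threads:
    t.join()
# output holds ["A", "B"] regardless of OS scheduling
```

A real record/replay system must also log inputs and other nondeterministic events, not just lock order; the sketch shows only the schedule-enforcement half.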

3. Exploiting Parallelism in Real-world Applications.
Modern applications in important domains such as genomics and data mining are characterized by their need for massive computing power, due both to their computational complexity and to their handling of massive amounts of data. We have observed that the massive amounts of input, intermediate, and output data that these applications must handle cause programmers to develop code that continuously carries out data transfers between files and memory to make the best use of limited available memory. Frequent I/O operations that perform these data transfers introduce dependences that greatly limit our ability to exploit parallelism in hybrid loops (i.e., loops containing a mix of computation and I/O). While much research has been carried out over the past several decades on exploiting parallelism, these techniques are mostly applicable to computations that are free of I/O operations. We have developed a novel technique that breaks I/O-caused dependences and greatly enhances our ability to exploit parallelism in real-world applications. As an example we studied Velvet, a popular de novo genomic assembler. Velvet must deal with large input sizes and large amounts of intermediate data. For example, for processing of the wheat genome, Velvet must handle an input file that is 15 Gb in size ...
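The contrast between a hybrid loop and its decoupled form can be sketched as follows. This is only an illustration of the idea of breaking I/O-caused dependences, under assumed helper names, and not the project's actual implementation:

```python
import concurrent.futures
import io

def compute(line):
    # stand-in for an expensive per-record computation
    return sum(ord(c) for c in line)

def hybrid_serial(f):
    """Hybrid loop: each iteration mixes I/O (reading a record) with
    computation, so the I/O-carried dependence serializes the loop."""
    return [compute(line.strip()) for line in f]

def hybrid_parallel(f, workers=4):
    """Decoupled version: the loop over the file performs only I/O,
    while computation on already-read records runs on a worker pool."""
    with concurrent.futures.ThreadPoolExecutor(workers) as pool:
        futures = [pool.submit(compute, line.strip()) for line in f]
        return [fut.result() for fut in futures]

data = "abc\ndef\n"
serial = hybrid_serial(io.StringIO(data))
parallel = hybrid_parallel(io.StringIO(data))
# both orderings produce identical results
```

Threads are used here only to keep the sketch self-contained; in CPython the global interpreter lock limits CPU-bound speedup, whereas the project's technique targets native applications such as Velvet.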
