
NSF Org: |
CNS Division Of Computer and Network Systems |
Recipient: |
|
Initial Amendment Date: | July 19, 2010 |
Latest Amendment Date: | July 22, 2011 |
Award Number: | 1016974 |
Award Instrument: | Continuing Grant |
Program Manager: | Marilyn McClure, mmcclure@nsf.gov, (703)292-5197, CNS Division Of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | August 1, 2010 |
End Date: | July 31, 2013 (Estimated) |
Total Intended Award Amount: | $185,379.00 |
Total Awarded Amount to Date: | $185,379.00 |
Funds Obligated to Date: |
FY 2011 = $125,219.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: |
1 UTSA CIR SAN ANTONIO TX US 78249-1644 (210)458-4340 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
1 UTSA CIR SAN ANTONIO TX US 78249-1644 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | CSR-Computer Systems Research |
Primary Program Source: |
01001112DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
For decades, computer system design and operation were driven largely by high-performance objectives. Yet, as the large-scale integration of semiconductor devices approaches its physical limits, energy efficiency and robustness have recently been promoted to first-class design constraints. Energy efficiency is mandated by the emergence of small-footprint, portable, battery-powered computers, as well as by ever-increasing power density, which puts stringent constraints even on computers connected to the power grid. Moreover, recent research has revealed that aggressive power management techniques can significantly increase the vulnerability of computer systems to transient faults (soft errors), which can cause incorrect operation at run-time. These problems are even more pronounced for real-time embedded systems, which must perform correctly at high reliability levels under strict timing and energy constraints.
In the recent past, a number of pioneering reliability-aware power management schemes were proposed to mitigate the negative effects on reliability of dynamic voltage and frequency scaling, a popular power management technique. This project addresses the conservatism of the existing solutions and develops a more general framework. Specifically, the project is devising novel solutions that achieve arbitrary reliability levels through the use of shared recovery tasks. In addition, the research extends the framework to multiprocessor and emerging multicore platforms. The project has two major broader-impact dimensions. First, energy awareness has a direct impact on the environment, the economy, and society at large. Second, by promoting reliability to a first-order objective, the project will help prevent malfunctions in safety-critical computer systems and protect property and human lives.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
As the large-scale integration of semiconductor devices approaches its physical limits, energy efficiency and robustness have recently been promoted to first-class design constraints. Moreover, recent research has revealed that aggressive power management techniques can significantly increase the vulnerability of computer systems to transient faults (soft errors), which can cause incorrect operation at run-time. Hence, it becomes necessary to manage system energy consumption and reliability simultaneously, especially for energy-constrained, safety-critical real-time embedded systems.
As the outcomes of this project, we designed and developed several energy-efficient fault-tolerance schemes that overcome the limitations and conservatism of the existing reliability-aware power management (RAPM) framework. Specifically, we studied the shared-recovery (SHR) technique, which allows tasks to share a single recovery task and thus leaves more slack time for energy savings. For multiprocessor real-time systems, we developed global-scheduling-based RAPM schemes that consider both individual and shared recovery, for independent tasks as well as for tasks with precedence constraints. Moreover, to achieve arbitrary reliability objectives, we developed reliability-oriented energy management schemes. To tolerate permanent faults, we designed standby-spare techniques for periodic tasks running on both uniprocessor and multiprocessor systems. Finally, we designed a novel preference-oriented scheduling framework, which schedules primary and backup tasks more effectively for better energy savings. The developed techniques can have a profound impact through their ability to provide a safer computing environment with better energy efficiency.
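The slack argument behind shared recovery can be illustrated with a small sketch. This is not the project's actual algorithm. It assumes a simplified model in which a set of tasks must finish within one common deadline, recovery blocks run at maximum frequency, at most one fault is tolerated in the shared case, and all tasks are slowed down uniformly. The names `Task`, `slack_individual`, `slack_shared`, and `scaled_frequency` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    wcet: float  # worst-case execution time at maximum frequency


def slack_individual(tasks, deadline):
    """Slack left when every task reserves its own recovery block (assumed model)."""
    work = sum(t.wcet for t in tasks)
    recovery = sum(t.wcet for t in tasks)  # one recovery per task
    return deadline - work - recovery


def slack_shared(tasks, deadline):
    """Slack left when all tasks share one recovery block (assumed model)."""
    work = sum(t.wcet for t in tasks)
    recovery = max(t.wcet for t in tasks)  # sized for the largest task
    return deadline - work - recovery


def scaled_frequency(tasks, slack, f_min=0.1):
    """Lowest normalized frequency (1.0 = maximum) that fits work into work + slack."""
    if slack < 0:
        raise ValueError("task set is not schedulable with the reserved recovery")
    work = sum(t.wcet for t in tasks)
    return max(f_min, work / (work + slack))


if __name__ == "__main__":
    tasks = [Task("a", 2.0), Task("b", 3.0), Task("c", 5.0)]
    deadline = 30.0
    for label, slack_fn in (("individual", slack_individual), ("shared", slack_shared)):
        slack = slack_fn(tasks, deadline)
        print(label, "slack:", slack, "frequency:", scaled_frequency(tasks, slack))
```

Under these assumptions, the shared scheme reserves only the largest worst-case execution time instead of the sum of all of them, so more slack remains and the tasks can run at a lower frequency. A real scheme would also have to handle what happens after the shared recovery block has been used.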
Last Modified: 09/26/2013
Modified by: Dakai Zhu