NSF Award Search: Award # 1016974 - CSR: Small: Collaborative Research: Generalized Reliability-Aware Power Management for Real-Time Embedded Systems

Award Abstract # 1016974

CSR: Small: Collaborative Research: Generalized Reliability-Aware Power Management for Real-Time Embedded Systems

NSF Org:	CNS Division Of Computer and Network Systems
Recipient:	THE UNIVERSITY OF TEXAS AT SAN ANTONIO
Initial Amendment Date:	July 19, 2010
Latest Amendment Date:	July 22, 2011
Award Number:	1016974
Award Instrument:	Continuing Grant
Program Manager:	Marilyn McClure mmcclure@nsf.gov (703)292-5197 CNS Division Of Computer and Network Systems CSE Directorate for Computer and Information Science and Engineering
Start Date:	August 1, 2010
End Date:	July 31, 2013 (Estimated)
Total Intended Award Amount:	$185,379.00
Total Awarded Amount to Date:	$185,379.00
Funds Obligated to Date:	FY 2010 = $60,160.00; FY 2011 = $125,219.00
History of Investigator:	Dakai Zhu (Principal Investigator) dakai.zhu@utsa.edu
Recipient Sponsored Research Office:	University of Texas at San Antonio 1 UTSA CIR SAN ANTONIO TX US 78249-1644 (210)458-4340
Sponsor Congressional District:	20
Primary Place of Performance:	University of Texas at San Antonio 1 UTSA CIR SAN ANTONIO TX US 78249-1644
Primary Place of Performance Congressional District:	20
Unique Entity Identifier (UEI):	U44ZMVYU52U6
Parent UEI:	U44ZMVYU52U6
NSF Program(s):	CSR-Computer Systems Research
Primary Program Source:	01001011DB NSF RESEARCH & RELATED ACTIVIT 01001112DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):	7923
Program Element Code(s):	735400
Award Agency Code:	4900
Fund Agency Code:	4900
Assistance Listing Number(s):	47.070

ABSTRACT

For decades, computer system design and operation were driven largely by high-performance objectives. Yet, as the large-scale integration of semiconductor devices approaches its physical limits, energy efficiency and robustness have recently been promoted to first-class design constraints. Energy efficiency is mandated by the emergence of small-footprint, portable, battery-powered computers, as well as by ever-increasing power density that puts stringent constraints even on computers connected to the power grid. Moreover, recent research has revealed that aggressive power management techniques can significantly increase the vulnerability of computer systems to transient faults (soft errors), which can cause incorrect operation at run-time. These problems are even more pronounced for real-time embedded systems, which must perform correctly at high reliability levels under strict timing and energy constraints.

In the recent past, a number of pioneering reliability-aware power management schemes were proposed that aim to mitigate the negative effects of the popular dynamic voltage and frequency scaling technique. This project is addressing the conservatism of the existing solutions and developing a more general framework. Specifically, the project is devising novel solutions to achieve arbitrary reliability levels through the use of shared recovery tasks. In addition, the research is extending the framework to multiprocessor and emerging multicore platforms. The project has two major broader impact dimensions. First, energy-awareness has a direct impact on the environment, the economy, and society at large. Second, by promoting reliability to a first-order objective, the project will help to prevent malfunctions in safety-critical computer systems and protect property and human lives.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Baoxian Zhao, Hakan Aydin and Dakai Zhu "Shared Recovery for Energy Efficiency and Reliability Enhancements in Real-Time Applications with Precedence Constraints" ACM Transactions on Design Automation of Electronic Systems (TODAES) , v.18 , 2013

Dakai Zhu, Xuan Qi, Daniel Mosse and Remi Melhem "An Optimal Boundary-Fair Scheduling Algorithm for Multiprocessor Real-Time Systems" Journal of Parallel and Distributed Computing , v.71 , 2011 , p.1411

Han, Jian-Jun; Wu, Xiaodong; Zhu, Dakai; Jin, Hai; Yang, Laurence T.; Gaudiot, Jean-Luc "Synchronization-Aware Energy Management for VFI-Based Multicore Real-Time Systems" IEEE TRANSACTIONS ON COMPUTERS , v.61 , 2012 , p.1682-1696

Xuan Qi and Dakai Zhu "Energy-Efficient Block-Partitioned Multicore Processors for Parallel Applications" Journal of Computer Science and Technology, Special Issue on High-Performance Computing for Embedded Multicore Systems , v.26 , 2011 , p.418

Xuan Qi, Dakai Zhu and Hakan Aydin "Cluster Scheduling for Real-Time Systems: Utilization Bounds and Run-Time Overhead" Real-Time Systems: The International Journal of Time-Critical Computing Systems, Special Issue on Embedded and Real-Time Computing Systems and Applications , v.47 , 2011 , p.253

Xuan Qi, Dakai Zhu and Hakan Aydin "Global Scheduling Based Reliability-Aware Power Management for Multiprocessor Real-Time Systems" Real-Time Systems: The International Journal of Time-Critical Computing Systems, Special Issue on Energy Aware Real-Time Systems , v.47 , 2011 , p.109

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

As the large-scale integration of semiconductor devices approaches its physical limits, energy efficiency and robustness have recently been promoted to first-class design constraints. Moreover, recent research has revealed that aggressive power management techniques can significantly increase the vulnerability of computer systems to transient faults (soft errors), which can cause incorrect operation at run-time. Hence, it becomes necessary to manage system energy consumption and reliability simultaneously, especially for energy-constrained, safety-critical real-time embedded systems.

As the outcomes of this project, we designed and developed several energy-efficient fault-tolerance schemes that overcome the limitations and conservatism of the existing reliability-aware power management (RAPM) framework. Specifically, we studied the shared-recovery (SHR) technique, which allows tasks to share a single recovery task and thereby leaves more slack time for energy savings. For multiprocessor real-time systems, we developed global-scheduling-based RAPM schemes that consider both individual and shared recovery, for independent tasks as well as tasks with precedence constraints. Moreover, to achieve arbitrary reliability objectives, reliability-oriented energy management schemes were developed. To incorporate tolerance of permanent faults, standby-spare techniques were designed for periodic tasks running on both uniprocessor and multiprocessor systems. Finally, a novel preference-oriented scheduling framework was designed, which can schedule primary and backup tasks more effectively for better energy savings. The developed techniques can have a profound impact through their ability to achieve a safer computing environment with better energy efficiency.

Last Modified: 09/26/2013
Modified by: Dakai Zhu
