
NSF Org: | CNS Division Of Computer and Network Systems |
Recipient: | |
Initial Amendment Date: | August 13, 2008 |
Latest Amendment Date: | July 7, 2009 |
Award Number: | 0834798 |
Award Instrument: | Standard Grant |
Program Manager: | Marilyn McClure, mmcclure@nsf.gov, (703)292-5197, CNS Division Of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 1, 2008 |
End Date: | August 31, 2013 (Estimated) |
Total Intended Award Amount: | $402,904.00 |
Total Awarded Amount to Date: | $418,904.00 |
Funds Obligated to Date: | FY 2009 = $16,000.00 |
History of Investigator: | |
Recipient Sponsored Research Office: | 3720 S FLOWER ST FL 3 LOS ANGELES CA US 90033 (213)740-7762 |
Sponsor Congressional District: | |
Primary Place of Performance: | 3720 S FLOWER ST FL 3 LOS ANGELES CA US 90033 |
Primary Place of Performance Congressional District: | |
Unique Entity Identifier (UEI): | |
Parent UEI: | |
NSF Program(s): | CSR-Computer Systems Research |
Primary Program Source: | 01000910DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): | |
Program Element Code(s): | |
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
The future of the information technology industry depends on designing computer systems that tolerate errors caused by variations in device characteristics. Traditionally, system reliability is achieved by replicating critical system components. Because variability-induced errors emerge slowly over time, replication for the sole purpose of providing reliability is prohibitively expensive for low-cost computing platforms. This research explores using 3D stacking to implement redundant components and variability-monitoring circuitry on a 3D stacked die. With 3D stacking, the redundant computation blocks can be built using a variation-resilient process technology that may be slower than the process technology used for the primary processor. This research takes a holistic approach to designing the 3D stacked monitoring layer, spanning from innovative microarchitecture solutions to exploiting applications' inherent error tolerance. On the microarchitecture front, this research explores the potential for seamlessly reconfiguring the monitoring layer to act in one of three modes: as performance assists when variability-induced errors are rare, as guard processors when variability-induced errors begin to appear, or as backup processors when device aging may result in irreparable errors on the primary processing substrate. On the architecture front, this research explores a new exception class called Reliability Aware Exceptions, which allows microarchitecture blocks to raise an exception in response to a variability-induced error. These software-visible exceptions can then be exploited by application classes that are inherently error tolerant and can customize their exception handling mechanisms.