
NSF Org: | OAC Office of Advanced Cyberinfrastructure (OAC) |
Recipient: |
|
Initial Amendment Date: | June 29, 2022 |
Latest Amendment Date: | May 19, 2024 |
Award Number: | 2212465 |
Award Instrument: | Standard Grant |
Program Manager: | Varun Chandola, OAC Office of Advanced Cyberinfrastructure (OAC), CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 1, 2022 |
End Date: | August 31, 2025 (Estimated) |
Total Intended Award Amount: | $574,640.00 |
Total Awarded Amount to Date: | $584,640.00 |
Funds Obligated to Date: | FY 2024 = $10,000.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: | 1500 HORNING RD KENT OH US 44242-0001 (330)672-2070 |
Sponsor Congressional District: |
|
Primary Place of Performance: | OFFICE OF THE COMPTROLLER KENT OH US 44242-0001 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | OAC-Advanced Cyberinfrast Core |
Primary Program Source: | 01002425DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
For years, scientists have continued to improve the performance of simulations while neglecting resilience. This approach was driven by a lack of understanding of "cause and effect" in resilience analysis. While current resilience analysis tools continue to lack transparency and interpretability, it is critical to promote the importance of resilience analysis and to educate scientists about it. This project's novelty lies in redefining resilience analysis in terms of interpretability and explainability. The approach differs significantly from existing efforts: it can explain or identify the logic behind resilience predictions and differentiate the functions and usages of existing tools built on different theories. The project's impacts include the design of a new resilience assessment system that uses visualization and DevOps to enable transparent resilience analysis, vulnerability positioning, and automated continuous integration for resilience. The project will work with NSF- and DoE-sponsored supercomputing centers to adopt the system with proven success. Graduate and undergraduate students, especially those from underrepresented groups, will be trained in multiple disciplines, preparing them for successful careers in computing and scientific research areas that are becoming increasingly interdisciplinary.
This project builds upon existing knowledge to create a new approach for assessing the resilience of scientific applications in the face of the surging soft errors that are inevitable in next-generation high-performance computing systems. The project will bring further clarity, insight, and understanding into how systems behave while running high-performance computing scientific workloads, which are composed of parallel simulations for data generation, big data analytics, and machine learning to extract insights from data in scientific research. The project proposes (1) the design and implementation of an error propagation analysis platform that creates interpretable visualizations of the critical paths and critical sections of the codes; (2) analytics that allow domain scientists to compare and contrast different resilience models on the simulation codes; (3) a continuous resilience assessment (Resilience CI) that can be integrated into a standard continuous integration pipeline to automate the procedure, so that the resilience property across committed versions is delivered to developers as a standard report in support of the DevOps of exascale scientific applications; and (4) an evaluation in which quantum chemistry workflows serve as the driver applications. The project's outcomes, such as tutorials, collected data, and the visualization software system, can encourage application developers to incorporate cost-effective fault tolerance strategies. In addition, the investigators will incorporate research outcomes into new courses and tutorials for workforce training. The project will engage and advance partnerships with industry for commercialization.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.