Award Abstract # 2211538
Collaborative Research: OAC Core: CEAPA: A Systematic Approach to Minimize Compression Error Propagation in HPC Applications

NSF Org: OAC
Office of Advanced Cyberinfrastructure (OAC)
Recipient: THE UNIVERSITY OF IOWA
Initial Amendment Date: July 7, 2022
Latest Amendment Date: July 7, 2022
Award Number: 2211538
Award Instrument: Standard Grant
Program Manager: Amy Apon
awapon@nsf.gov
 (703)292-5184
OAC
 Office of Advanced Cyberinfrastructure (OAC)
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: August 15, 2022
End Date: July 31, 2026 (Estimated)
Total Intended Award Amount: $350,000.00
Total Awarded Amount to Date: $350,000.00
Funds Obligated to Date: FY 2022 = $350,000.00
History of Investigator:
  • Guanpeng Li (Principal Investigator)
    guanpeng-li@uiowa.edu
Recipient Sponsored Research Office: University of Iowa
105 JESSUP HALL
IOWA CITY
IA  US  52242-1316
(319)335-2123
Sponsor Congressional District: 01
Primary Place of Performance: University of Iowa
2 GILMORE HALL
IOWA CITY
IA  US  52242-1320
Primary Place of Performance Congressional District: 01
Unique Entity Identifier (UEI): Z1H9VJS8NG16
Parent UEI:
NSF Program(s): OAC-Advanced Cyberinfrast Core
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 026Z, 7923, 9150
Program Element Code(s): 090Y00
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Today's high-performance computing (HPC) applications produce vast volumes of data for post-analysis, placing a major storage and I/O burden on HPC systems. To significantly reduce this burden, researchers have explored the use of lossy compression techniques. While lossy compression can effectively reduce the size of data, it also introduces errors into the data that often lead to incorrect computation results. As a result, scientists hesitate to use lossy compression in their research. Thus, there is a critical need for an effective method to identify compression strategies that minimize error impact across a diversity of programs. This project aims to develop a systematic approach that helps scientists automatically select the lossy compression algorithm with the lowest error impact based on their HPC programs and target compression ratios. It also integrates educational and outreach activities, including student training and the development of new curriculum on trustworthy data reduction and dependable HPC systems.

Modeling compression error propagation in HPC programs is challenging because existing lossy compressors are built on distinct principles and therefore generate substantially different compression errors on diverse HPC data. This project includes four key thrusts: (1) developing an accurate and efficient fault injection infrastructure that integrates with the fault models of commonly used lossy compression algorithms; (2) designing a fine-grained approach to characterize error propagation in HPC programs through program analysis and decomposition based on the data dependencies and life cycle of compressed data; (3) developing a predictive model using machine learning techniques to select a compression strategy that minimizes the error impact for a given program and compression ratio; and (4) integrating the technique with domain-specific error impact metrics in real-world HPC applications and demonstrating its effectiveness by selecting compression strategies that give low error impact at the same compression ratios. Not only does this project have an enormous positive impact on HPC cyberinfrastructure, but it also helps redefine the optimization of lossy compression techniques with an emphasis on both efficiency and error impact.
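
The idea behind the thrusts above can be illustrated with a toy sketch: inject the error pattern of a compression strategy into a program's input, run the program, and compare the output against an error-free run. Everything in this sketch is an assumption made for illustration and is not the project's infrastructure: `quantize` and `decimate` are hypothetical stand-ins for real lossy compressors, `kernel` is a hypothetical stand-in for an HPC computation, and the selection step is a brute-force comparison rather than the machine learning model the project proposes. The sketch also ignores compression ratio, which the project treats as a fixed target.

```python
import math

def quantize(data, error_bound):
    """Toy error-bounded model: snap each value to a grid of width 2 * error_bound."""
    step = 2.0 * error_bound
    return [round(x / step) * step for x in data]

def decimate(data, stride):
    """Toy sampling model: keep every stride-th point (and the last), interpolate the rest."""
    n = len(data)
    out = []
    for i in range(n):
        lo = (i // stride) * stride
        hi = min(lo + stride, n - 1)
        if hi == lo:
            out.append(data[lo])
        else:
            t = (i - lo) / (hi - lo)
            out.append((1 - t) * data[lo] + t * data[hi])
    return out

def kernel(data, steps=10):
    """Hypothetical stand-in for an HPC program: repeated 3-point averaging, then total energy."""
    u = list(data)
    for _ in range(steps):
        interior = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
        u = [u[0]] + interior + [u[-1]]
    return sum(v * v for v in u)

def error_impact(data, reconstruct):
    """Relative deviation of the program output caused by the injected compression error."""
    reference = kernel(data)
    perturbed = kernel(reconstruct(data))
    return abs(perturbed - reference) / abs(reference)

def select_strategy(data, strategies):
    """Brute-force selection: evaluate every candidate and return the lowest-impact one."""
    impacts = {name: error_impact(data, fn) for name, fn in strategies.items()}
    best = min(impacts, key=impacts.get)
    return best, impacts

# Illustrative input and candidate strategies (names and parameters are arbitrary).
data = [math.sin(0.05 * i) for i in range(200)]
strategies = {
    "quantize_eb_1e-3": lambda d: quantize(d, 1e-3),
    "decimate_stride_4": lambda d: decimate(d, 4),
}
best, impacts = select_strategy(data, strategies)
```

Running every candidate through the full program, as this sketch does, is only feasible for small examples, which is one reason a predictive model is attractive for real HPC applications.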

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Md Hasanur Rahman, Sheng Di, Kai Zhao, Robert Underwood, Guanpeng Li, and Franck Cappello. "A Feature-Driven Fixed-Ratio Lossy Compression Framework for Real-World Scientific Datasets." IEEE International Conference on Data Engineering (ICDE), 2023.
