Award Abstract # 2312673
CAREER: A Highly Effective, Usable, Performant, Scalable Data Reduction Framework for HPC Systems and Applications

NSF Org: OAC
Office of Advanced Cyberinfrastructure (OAC)
Recipient: TRUSTEES OF INDIANA UNIVERSITY
Initial Amendment Date: January 10, 2023
Latest Amendment Date: January 10, 2023
Award Number: 2312673
Award Instrument: Standard Grant
Program Manager: Juan Li
jjli@nsf.gov
 (703)292-2625
OAC
 Office of Advanced Cyberinfrastructure (OAC)
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: January 1, 2023
End Date: December 31, 2027 (Estimated)
Total Intended Award Amount: $467,770.00
Total Awarded Amount to Date: $467,770.00
Funds Obligated to Date: FY 2023 = $467,770.00
History of Investigator:
  • Dingwen Tao (Principal Investigator)
    ditao@iu.edu
Recipient Sponsored Research Office: Indiana University
107 S INDIANA AVE
BLOOMINGTON
IN  US  47405-7000
(317)278-3473
Sponsor Congressional District: 09
Primary Place of Performance: Indiana University
107 S INDIANA AVE
BLOOMINGTON
IN  US  47405-7000
Primary Place of Performance Congressional District: 09
Unique Entity Identifier (UEI): YH86RTW2YVJ4
Parent UEI:
NSF Program(s): CAREER: FACULTY EARLY CAR DEV
Primary Program Source: 01002324DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045
Program Element Code(s): 104500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

This CAREER project researches and develops novel algorithms and software to improve the efficacy, usability, performance, and scalability of data reduction for high-performance computing (HPC) systems and applications. It contributes to the cyberinfrastructure (CI) of big data management for HPC applications in many domains such as cosmology, climatology, seismology, and machine learning. The research findings will be widely disseminated through open-source software packages and publications in premier conferences and journals. An integrated educational and outreach program is designed to foster CI workforce development, including integration of concepts and use of data reduction in curricula, research training for undergraduate and graduate students, and a specially designed training program for scientists and engineers from universities and national labs.

This CAREER project simultaneously addresses four critical issues in scientific data reduction (efficacy, usability, performance, and scalability) through comprehensive analytical modeling and architectural performance optimization. Specific scientific contributions include: (1) it builds lightweight models to accurately estimate the compression ratio and quality of different techniques in the prediction and encoding stages of prediction-based compression, and optimizes the compression configurations to maximize the compression ratio under compression quality constraints; (2) it develops new efficient predictors and lossless encoding methods for lossy compression of scientific data on GPUs, with deep architectural optimizations to achieve both high throughput and high compression ratio; and (3) it deeply integrates the optimized compression with parallel I/O and MPI libraries, with a series of optimizations to improve the performance of data movement and the scalability of HPC applications. The success of this research agenda enables scientists and engineers to better address the increasingly severe challenge of scientific data explosion.
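To make the terms "prediction stage", "encoding stage", and "compression quality constraint" concrete, the following is a minimal illustrative sketch of error-bounded, prediction-based lossy compression on a 1D array. It is not the project's algorithm or software: the previous-value predictor, the linear quantizer, the entropy-based ratio estimate, and all function names (`quantize`, `dequantize`, `estimated_ratio`) are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def quantize(data, error_bound):
    """Prediction stage (hypothetical sketch): predict each value from the
    previously reconstructed value, then quantize the residual in steps of
    2 * error_bound so the pointwise error stays within error_bound."""
    codes, recon = [], []
    prev = 0.0
    for x in data:
        code = round((x - prev) / (2 * error_bound))
        prev = prev + code * 2 * error_bound  # value the decompressor will see
        codes.append(code)
        recon.append(prev)
    return codes, recon


def dequantize(codes, error_bound):
    """Rebuild the data from the integer codes alone."""
    recon = []
    prev = 0.0
    for code in codes:
        prev = prev + code * 2 * error_bound
        recon.append(prev)
    return recon


def estimated_ratio(codes, bits_per_value=64):
    """Rough ratio estimate: original bits per value divided by the Shannon
    entropy of the codes, a stand-in for an ideal lossless encoding stage."""
    n = len(codes)
    entropy = -sum(c / n * math.log2(c / n) for c in Counter(codes).values())
    return float("inf") if entropy == 0 else bits_per_value / entropy


# Usage: a smooth signal compressed under an absolute error bound of 1e-3.
signal = [math.sin(0.01 * i) for i in range(1000)]
codes, recon = quantize(signal, 1e-3)
max_error = max(abs(a - b) for a, b in zip(signal, recon))
ratio = estimated_ratio(codes)
```

In this sketch, loosening the error bound shrinks the set of distinct codes and raises the estimated ratio, which is the kind of ratio-versus-quality trade-off that contribution (1) describes modeling and optimizing.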

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Guo, Anqi and Hao, Yuchen and Wu, Chunshu and Haghi, Pouya and Pan, Zhenyu and Si, Min and Tao, Dingwen and Li, Ang and Herbordt, Martin and Geng, Tong "Software-Hardware Co-design of Heterogeneous SmartNIC System for Recommendation Models Inference and Training" The 37th ACM International Conference on Supercomputing (ICS 2023), 2023 https://doi.org/10.1145/3577193.3593724
Wang, Daoce and Pulido, Jesus and Grosset, Pascal and Jin, Sian and Tian, Jiannan and Zhao, Kai and Ahrens, James and Tao, Dingwen "TAC+: Optimizing Error-Bounded Lossy Compression for 3D AMR Simulations" IEEE Transactions on Parallel and Distributed Systems, 2023 https://doi.org/10.1109/TPDS.2023.3339474
Wang, Daoce and Pulido, Jesus and Grosset, Pascal and Tian, Jiannan and Jin, Sian and Tang, Houjun and Sexton, Jean and Di, Sheng and Zhao, Kai and Fang, Bo and Lukić, Zarija and Cappello, Franck and Ahrens, James and Tao, Dingwen "AMRIC: A Novel In Situ Lossy Compression Framework for Efficient I/O in Adaptive Mesh Refinement Applications", 2023 https://doi.org/10.1145/3581784.3613212
Xiang, Lizhi and Yin, Miao and Zhang, Chengming and Sukumaran-Rajam, Aravind and Sadayappan, P. and Yuan, Bo and Tao, Dingwen "TDC: Towards Extremely Efficient CNNs on GPUs via Hardware-Aware Tucker Decomposition" The 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP 2023), 2023 https://doi.org/10.1145/3572848.3577478
Zhang, Boyuan and Tian, Jiannan and Di, Sheng and Yu, Xiaodong and Feng, Yunhe and Liang, Xin and Tao, Dingwen and Cappello, Franck "FZ-GPU: A Fast and High-Ratio Lossy Compressor for Scientific Computing Applications on GPUs" The 32nd ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2023), 2023 https://doi.org/10.1145/3588195.3592994
Zhang, Boyuan and Tian, Jiannan and Di, Sheng and Yu, Xiaodong and Swany, Martin and Tao, Dingwen and Cappello, Franck "GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs" The 37th ACM International Conference on Supercomputing (ICS 2023), 2023
Zhang, Chengming and Smith, Shaden and Sun, Baixi and Tian, Jiannan and Soifer, Jonathan and Yu, Xiaodong and Song, Shuaiwen Leon and He, Yuxiong and Tao, Dingwen "HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs", 2023 https://doi.org/10.1145/3577193.3593717

