Award Abstract # 1657175
CRII: SHF: ACI: Performance-in-Depth Sparse Solvers for Heterogeneous Parallel Platforms.

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: RUTGERS, THE STATE UNIVERSITY
Initial Amendment Date: February 9, 2017
Latest Amendment Date: July 7, 2018
Award Number: 1657175
Award Instrument: Standard Grant
Program Manager: Almadena Chtchelkanova
achtchel@nsf.gov
 (703)292-7498
CCF
 Division of Computing and Communication Foundations
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: February 15, 2017
End Date: January 31, 2020 (Estimated)
Total Intended Award Amount: $175,000.00
Total Awarded Amount to Date: $175,000.00
Funds Obligated to Date: FY 2017 = $175,000.00
History of Investigator:
  • Narayan Mandayam (Principal Investigator)
    narayan@winlab.rutgers.edu
  • Maryam Mehri Dehnavi (Co-Principal Investigator)
  • Maryam Mehri Dehnavi (Former Principal Investigator)
Recipient Sponsored Research Office: Rutgers University New Brunswick
3 RUTGERS PLZ
NEW BRUNSWICK
NJ  US  08901-8559
(848)932-0150
Sponsor Congressional District: 12
Primary Place of Performance: Rutgers University New Brunswick
715, 96 Frelinghuysen Road
Piscataway
NJ  US  08854-8018
Primary Place of Performance Congressional District: 06
Unique Entity Identifier (UEI): M1LVPE5GLSD9
Parent UEI:
NSF Program(s): CRII CISE Research Initiation
Primary Program Source: 01001718DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 7942, 8228
Program Element Code(s): 026Y00
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Sparse numerical computations are at the heart of many science and engineering simulations. However, the complex irregularities in sparse methods limit the performance of much scientific software. This project integrates mathematical reformulation, algorithm redesign, and performance engineering to develop high-performance sparse solvers for heterogeneous parallel platforms. The outcomes of this research are innovative tools and methodologies that advance the field of large-scale scientific simulation. In addition, the project has a broader impact in training graduate students to perform interdisciplinary research.

The project conducts an in-depth investigation of performance bottlenecks in sparse solvers and reformulates their standard variants to deliver end-to-end performance. Cross-layer solutions are developed to improve data locality, reduce communication, and increase inherent parallelism in sparse linear solvers. The solutions involve multi-level algorithm restructuring and performance tuning to significantly improve the scalability and performance of sparse computations while preserving their numerical accuracy, convergence, and stability. The proposed methods and algorithms are implemented as domain-specific high-performance software and a benchmark suite to promote iterative improvements of the developed algorithms and codes.
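
To make the kind of restructuring described above concrete, the sketch below shows level-set (wavefront) scheduling for a sparse lower-triangular solve, a standard technique for exposing parallelism in sparse solvers without changing the numerical result. It is an illustrative example written for this summary, not code from the project; the helper names build_levels and lsolve, the compressed sparse column (CSC) layout, and the small 3x3 matrix are assumptions made for the example.

    # Illustrative sketch (not project code): level-set scheduling for a
    # sparse lower-triangular solve L x = b with L stored in CSC form.
    # Unknowns in the same level do not depend on each other, so each
    # level can be solved in parallel once the schedule is built.

    def build_levels(n, col_ptr, row_idx):
        """Assign each unknown to a wavefront level (symbolic phase)."""
        level = [0] * n
        for j in range(n):
            # unknown i depends on unknown j whenever L[i, j] != 0 and i > j
            for p in range(col_ptr[j], col_ptr[j + 1]):
                i = row_idx[p]
                if i != j:
                    level[i] = max(level[i], level[j] + 1)
        nlev = max(level) + 1
        return [[j for j in range(n) if level[j] == k] for k in range(nlev)]

    def lsolve(n, col_ptr, row_idx, val, b, levels):
        """Forward substitution; columns within one level are independent."""
        x = list(b)
        for lev in levels:
            for j in lev:                  # parallel loop in a real solver
                x[j] /= val[col_ptr[j]]    # diagonal stored first per column
                for p in range(col_ptr[j] + 1, col_ptr[j + 1]):
                    x[row_idx[p]] -= val[p] * x[j]
        return x

    # 3x3 example: L = [[2,0,0],[1,3,0],[0,1,4]], b = [2, 5, 9]
    col_ptr, row_idx = [0, 2, 4, 5], [0, 1, 1, 2, 2]
    val = [2.0, 1.0, 3.0, 1.0, 4.0]
    levels = build_levels(3, col_ptr, row_idx)
    print(lsolve(3, col_ptr, row_idx, val, [2.0, 5.0, 9.0], levels))

The schedule is computed once per sparsity pattern and reused across solves, which is how inspector-style approaches amortize the cost of analyzing irregular structure.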

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Blanco, Zachary and Liu, Bangtian and Dehnavi, Maryam Mehri. "CSTF: Large-Scale Sparse Tensor Factorizations on Distributed Platforms." ICPP 2018: Proceedings of the 47th International Conference on Parallel Processing, 2018. DOI: 10.1145/3225058.3225133
Cheshmi, Kazem and Kamil, Shoaib and Strout, Michelle Mills and Dehnavi, Maryam Mehri. "ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism." SC '18: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 2018. DOI: 10.1109/SC.2018.00065
Cheshmi, Kazem and Kamil, Shoaib and Strout, Michelle Mills and Dehnavi, Maryam Mehri. "Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis." SC '17: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2017. DOI: 10.1145/3126908.3126936
Liu, Bangtian and Wen, Chengyao and Sarwate, Anand D. and Dehnavi, Maryam Mehri. "A Unified Optimization Approach for Sparse Tensor Operations on GPUs." 2017 IEEE International Conference on Cluster Computing (CLUSTER), 2017. DOI: 10.1109/CLUSTER.2017.75
Liu, Bangtian and Cheshmi, Kazem and Soori, Saeed and Strout, Michelle Mills and Dehnavi, Maryam Mehri. "MatRox: Modular Approach for Improving Data Locality in Hierarchical (Mat)rix App(Rox)imation." PPoPP '20: Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020. DOI: 10.1145/3332466.3374548
Soori, Saeed and Devarakonda, Aditya and Blanco, Zachary and Demmel, James and Gurbuzbalaban, Mert and Dehnavi, Maryam Mehri. "Reducing Communication in Proximal Newton Methods for Sparse Least Squares Problems." ICPP 2018: Proceedings of the 47th International Conference on Parallel Processing, 2018. DOI: 10.1145/3225058.3225131

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

 

Sparse numerical computations are at the heart of many science and engineering simulations. However, the complex irregularities in sparse methods limit the performance of much scientific software. In this project we integrated mathematical reformulation, algorithm redesign, and performance engineering to develop high-performance sparse solvers and software for heterogeneous parallel platforms. The project produced novel algorithms and inspection strategies that analyze the irregularity in sparse matrix computations arising in scientific simulations. This analysis was used to build domain-specific code generators and cloud engines, specifically the MatRox, Sympiler, and ASYNC frameworks. These frameworks use multi-level algorithm restructuring and performance tuning to significantly improve the scalability and performance of sparse computations while preserving their numerical accuracy, convergence, and stability. As a result, practitioners and domain experts who use MatRox, Sympiler, and ASYNC can automatically generate high-performance code for scientific and machine learning applications. The frameworks are developed with a domain-specific language to improve programmer productivity.
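
As a rough illustration of the decoupling idea named in the Sympiler publication above (separating symbolic analysis of a sparsity pattern from the numeric computation), the sketch below inspects a lower-triangular pattern once and generates a solver specialized to that pattern, which can then be reused for many different numeric values. This is a simplified stand-in written for this summary, not Sympiler itself; the generate_lsolve helper, the CSC layout, and the example matrix are assumptions made for the example.

    # Illustrative sketch (not Sympiler): the symbolic phase emits a kernel
    # specialized to one sparsity pattern; the numeric phase reuses it.

    def generate_lsolve(n, col_ptr, row_idx):
        """Emit loop-free Python source solving L x = b for a fixed pattern."""
        lines = ["def lsolve(val, b):", "    x = list(b)"]
        for j in range(n):
            d = col_ptr[j]                           # diagonal stored first
            lines.append(f"    x[{j}] /= val[{d}]")
            for p in range(d + 1, col_ptr[j + 1]):
                lines.append(f"    x[{row_idx[p]}] -= val[{p}] * x[{j}]")
        lines.append("    return x")
        return "\n".join(lines)

    # Pattern of L = [[2,0,0],[1,3,0],[0,1,4]] in CSC form (symbolic input).
    col_ptr, row_idx = [0, 2, 4, 5], [0, 1, 1, 2, 2]
    namespace = {}
    exec(generate_lsolve(3, col_ptr, row_idx), namespace)
    lsolve = namespace["lsolve"]

    # Numeric phase: the specialized kernel is reused for new values.
    print(lsolve([2.0, 1.0, 3.0, 1.0, 4.0], [2.0, 5.0, 9.0]))
    print(lsolve([1.0, 2.0, 1.0, 3.0, 2.0], [1.0, 3.0, 5.0]))

In a production framework the specialized code would be generated and compiled ahead of time (and vectorized or parallelized); the point here is only that the pattern-dependent analysis is paid once while the numeric solve is run many times.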


The project has produced nine peer-reviewed publications. The code generation frameworks Sympiler and MatRox, as well as the cloud computing engine ASYNC, are the software artifacts produced by the project, and all of them are publicly available. Several graduate and undergraduate students were trained in numerical analysis, compiler development, cloud computing, and optimization methods. Trainees working on the project have won numerous awards, including an ACM Student Research Competition Grand Finals award and a 2018 Adobe Research Fellowship.

 

 


Last Modified: 03/09/2020
Modified by: Narayan Mandayam

