Award Abstract # 1464244

CRII: CI: Scalable Multigrid Algorithms for Solving Elliptic PDEs on Power-Efficient Clusters

NSF Org:	OAC Office of Advanced Cyberinfrastructure (OAC)
Recipient:	UNIVERSITY OF UTAH
Initial Amendment Date:	May 7, 2015
Latest Amendment Date:	May 7, 2015
Award Number:	1464244
Award Instrument:	Standard Grant
Program Manager:	Sushil K Prasad OAC Office of Advanced Cyberinfrastructure (OAC) CSE Directorate for Computer and Information Science and Engineering
Start Date:	July 1, 2015
End Date:	June 30, 2018 (Estimated)
Total Intended Award Amount:	$175,000.00
Total Awarded Amount to Date:	$175,000.00
Funds Obligated to Date:	FY 2015 = $175,000.00
History of Investigator:	Hari Sundar (Principal Investigator) hari.sundar@tufts.edu
Recipient Sponsored Research Office:	University of Utah 201 PRESIDENTS CIR SALT LAKE CITY UT US 84112-9049 (801)581-6903
Sponsor Congressional District:	01
Primary Place of Performance:	University of Utah UT US 84112-8930
Primary Place of Performance Congressional District:	01
Unique Entity Identifier (UEI):	LL8GLEVH6MG3
Parent UEI:
NSF Program(s):	CYBERINFRASTRUCTURE, EDUCATION AND WORKFORCE, EPSCoR Co-Funding
Primary Program Source:	01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):	8228, 9150
Program Element Code(s):	723100, 736100, 915000
Award Agency Code:	4900
Fund Agency Code:	4900
Assistance Listing Number(s):	47.070

ABSTRACT

While emerging extreme-scale computing systems could provide unprecedented resources for scientific discovery, two major challenges are the cost and the energy required to run and cool these systems. The system-on-a-chip (SoC) components widely used in the mobile device market are substantially cheaper and more energy efficient compared to desktop or server processors, and represent a promising option for future systems. This project addresses three challenges for emerging extreme-scale computing systems: the potential move to mobile processors, the increasing levels of concurrency, and the need for energy efficiency. Shared infrastructure is developed to accelerate interdisciplinary and collaborative research. This is a first step in the development of mathematical and computational methods for solving scientific computing problems on low-energy systems that can reduce the overall cost of scientific discoveries and promote the progress of science.

The project develops a scalable and power-efficient parallel multigrid solver for elliptic partial differential equations (PDEs) that targets emerging extreme-scale computing systems. Elliptic PDEs are ubiquitous in natural, engineered, and societal systems, and the efficient multigrid solvers being developed as part of this project are beneficial to research across several disciplines. The project also develops a power-performance model to aid in application-controlled power-performance management at the per-node level. Motivated by the slower interconnections common on low-power clusters, the project develops a new class of parallel algorithms that lower the power utilization of computations to overlap with the communication. This is the reverse of what has conventionally been done, where communication costs are hidden by overlapping with computation. Additionally, the algorithms utilize compute nodes that are not computationally active at all times -- a radically different approach that creates a new class of energy-efficient scalable parallel algorithms. The research will be evaluated initially using a 16-node Tegra/ARM-based cluster and ultimately on the CloudLab cluster at the University of Utah. The developed software will be disseminated using an open-source license. The scalability experiments will be run on the NSF-supported CloudLab cluster, hosted at the University of Utah, allowing other users to re-create both the hardware and software stack used for the experiments. The resulting system will be among the first large-scale low-energy clusters available anywhere.
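For readers unfamiliar with multigrid, the following is a minimal serial sketch of a geometric multigrid V-cycle for the simplest elliptic model problem, -u'' = f on (0, 1) with zero boundary values. It is not the project's solver: the 1D model problem, the weighted-Jacobi smoother, and all function names (`residual`, `smooth`, `v_cycle`, `solve_poisson`) are assumptions chosen for illustration, and it contains none of the parallel or power-aware aspects described above.

```python
import math

def residual(u, f, h):
    """r = f - A u, with A = tridiag(-1, 2, -1) / h^2 and zero boundary values."""
    n = len(u)
    r = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r[i] = f[i] - (2.0 * u[i] - left - right) / (h * h)
    return r

def smooth(u, f, h, sweeps, omega=2.0 / 3.0):
    """Weighted Jacobi: u <- u + omega * D^{-1} r, where D = 2 / h^2."""
    for _ in range(sweeps):
        r = residual(u, f, h)
        u = [ui + omega * (h * h / 2.0) * ri for ui, ri in zip(u, r)]
    return u

def v_cycle(u, f, h):
    """One V-cycle on a grid with n = 2^k - 1 interior points."""
    n = len(u)
    if n == 1:
        # Coarsest grid: solve (2 / h^2) u = f exactly.
        return [f[0] * h * h / 2.0]
    u = smooth(u, f, h, 2)                      # pre-smoothing
    r = residual(u, f, h)
    nc = (n - 1) // 2                           # number of coarse points
    # Full-weighting restriction of the residual to the coarse grid.
    rc = [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
          for j in range(nc)]
    ec = v_cycle([0.0] * nc, rc, 2.0 * h)       # coarse-grid error equation
    # Linear interpolation of the coarse correction back to the fine grid.
    for j in range(nc + 1):
        lo = ec[j - 1] if j > 0 else 0.0
        hi = ec[j] if j < nc else 0.0
        u[2 * j] += 0.5 * (lo + hi)
    for j in range(nc):
        u[2 * j + 1] += ec[j]
    return smooth(u, f, h, 2)                   # post-smoothing

def solve_poisson(k=6, cycles=10):
    """Solve -u'' = pi^2 sin(pi x); the exact solution is sin(pi x)."""
    n = 2 ** k - 1
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    u = [0.0] * n
    for _ in range(cycles):
        u = v_cycle(u, f, h)
    err = max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))
    return u, err

if __name__ == "__main__":
    _, err = solve_poisson()
    print("max error vs. exact solution:", err)
```

Each V-cycle does work proportional to the number of unknowns, which is the property that makes multigrid attractive at scale. In a distributed setting the smoothing and grid-transfer steps require neighbor communication, which is where the communication and power considerations discussed in the abstract would enter.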

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Fernando, Isuru Dilanka and Jayasena, Sanath and Fernando, Milinda and Sundar, Hari "A Scalable Hierarchical Semi-Separable Library for Heterogeneous Clusters" 46th International Conference on Parallel Processing (ICPP), 2017 10.1109/ICPP.2017.60

Fernando, Milinda and Duplyakin, Dmitry and Sundar, Hari "Machine and Application Aware Partitioning for Adaptive Mesh Refinement Applications" Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, v.2017, 2017 10.1145/3078597.3078610

Isuru Fernando, Sanath Jayasena, Milinda Fernando, Hari Sundar "A Heterogeneous Hierarchical Semi-Separable Library for Heterogeneous Clusters" 46th International Conference on Parallel Processing (ICPP-2017), 2017

Majid Rasouli, Vidhi Zala "Improving Performance and Scalability of Algebraic Multigrid through a Specialized MATVEC" IEEE High Performance Extreme Computing Conference, 2018

Majid Rasouli, Vidhi Zala, Robert M. Kirby, Hari Sundar "Improving Performance and Scalability of Algebraic Multigrid through a Specialized MATVEC" High Performance Extreme Computing Conference (HPEC), 2018

Milinda Fernando, Dmitry Duplyakin, Hari Sundar "Machine and Application Aware Partitioning for Adaptive Mesh Refinement Applications" ACM High-Performance Parallel and Distributed Computing conference, June 2017, Washington D.C., USA, 2017 http://dx.doi.org/10.1145/3078597.3078610

Milinda Fernando, Dmitry Duplyakin, Hari Sundar "Machine and Application Aware Partitioning for Adaptive Mesh Refinement Applications" Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, 2018 10.1145/3078597.3078610

Nishith Tirpankar, Hari Sundar "Towards Triangle Counting on GPU using Stable Radix binning" High Performance Extreme Computing Conference (HPEC), 2018

Tirpankar, Nishith and Sundar, Hari "Towards Triangle Counting on GPU using Stable Radix Binning" IEEE High Performance Extreme Computing Conference, 2018

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The central goal of this project was to develop scalable multigrid algorithms on power-efficient clusters. Research was carried out in two major thrust areas related to developing strategies and algorithms for energy and power-efficient (1) single-node and (2) parallel scalability in this context.

These objectives were broadly achieved, and a few additional results were obtained. The specific contributions directly related to the project goals were:

1. Developing machine-aware partitioning algorithms that reduced the overall runtime as well as the energy and power requirements of scientific computations (HPDC 2017). This work formed the basis of subsequent research and is a fundamental achievement of this project.
2. Developing an efficient multigrid algorithm that reduced overall runtime, improved scalability, and maintained portability across several leadership methods (IEEE HPEC 2018).

The broader contributions resulting from this research were:

1. Using the new partitioning algorithm in conjunction with communication-reducing approaches, we developed efficient solvers for elliptic systems using a compressed hierarchical matrix representation on heterogeneous clusters. The use of GPUs greatly reduced the energy footprint of the overall computation (ICPP 2017).
2. We applied the ideas developed for multigrid methods as part of this project to graph analytics. In particular, we developed a novel energy-efficient triangle counting algorithm on the GPU (IEEE HPEC 2018). This work has an important impact on community detection and other problems in computational social sciences.
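As a point of reference for the graph analytics result, the sketch below shows what triangle counting computes, using a simple serial set-intersection approach. It is not the GPU algorithm from the HPEC 2018 paper and does not use radix binning; the function name `count_triangles` and the edge-list input format are assumptions made for illustration.

```python
def count_triangles(edges):
    """Count triangles in an undirected graph given as (u, v) pairs."""
    # Orient every edge from the lower to the higher vertex id so that
    # each triangle a < b < c is found exactly once, from the edge (a, b).
    higher = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        a, b = (u, v) if u < v else (v, u)
        higher.setdefault(a, set()).add(b)
        higher.setdefault(b, set())
    total = 0
    for a, nbrs in higher.items():
        for b in nbrs:
            # Any c adjacent to both a and b (with c > b) closes a triangle.
            total += len(nbrs & higher[b])
    return total
```

The inner step is a neighbor-list intersection, so its cost is dominated by reading adjacency data, consistent with the report's observation that data movement is the main bottleneck.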

The specific objectives were to develop new algorithms and strategies for computation that are both fast (in runtime) and energy efficient. A secondary objective was to control the power requirements of the method so that it remains portable across multiple architectures. Both of these objectives were met. The key finding is that data movement is a significant bottleneck for both runtime and energy; minimizing data movement and limiting it to high-bandwidth channels therefore results in a faster and more energy-efficient algorithm. While this work was initially done in the context of multigrid algorithms, we were able to extend our methods to a broader class of problems that share the core data structure of a distributed sparse matrix. These included compressed matrix representations and graph analytics.

Last Modified: 09/02/2018
Modified by: Hari Sundar
