NSF Award Search: Award # 1319448

Award Abstract # 1319448

SHF: Small: Embedded Graph Software-Hardware Models and Maps for Scalable Sparse Computations

NSF Org:	CCF Division of Computing and Communication Foundations
Recipient:	THE PENNSYLVANIA STATE UNIVERSITY
Initial Amendment Date:	August 6, 2013
Latest Amendment Date:	August 6, 2013
Award Number:	1319448
Award Instrument:	Standard Grant
Program Manager:	Almadena Chtchelkanova, achtchel@nsf.gov, (703) 292-7498, CCF Division of Computing and Communication Foundations, CSE Directorate for Computer and Information Science and Engineering
Start Date:	August 1, 2013
End Date:	January 31, 2017 (Estimated)
Total Intended Award Amount:	$424,999.00
Total Awarded Amount to Date:	$424,999.00
Funds Obligated to Date:	FY 2013 = $218,549.00
History of Investigator:	Padma Raghavan (Principal Investigator)
Recipient Sponsored Research Office:	Pennsylvania State Univ University Park, 201 OLD MAIN, UNIVERSITY PARK, PA, US 16802-1503, (814) 865-1372
Sponsor Congressional District:	15
Primary Place of Performance:	Pennsylvania State Univ University Park, 343K IST Bldg, University Park, PA, US 16802-7000
Primary Place of Performance Congressional District:
Unique Entity Identifier (UEI):	NPM2J7MSCF61
Parent UEI:
NSF Program(s):	HIGH-PERFORMANCE COMPUTING
Primary Program Source:	01001314DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):	7923, 7942
Program Element Code(s):	794200
Award Agency Code:	4900
Fund Agency Code:	4900
Assistance Listing Number(s):	47.070

ABSTRACT

A large number of "big data" and "big simulation" applications, such as those for determining network models or simulating partial differential equation models, concern high-dimensional data that are sparse. Sparse data structures and algorithms present significant advantages in terms of storage and computational costs. However, with only a few operations per data element, efficient and scalable implementations are difficult to achieve on current and emerging high performance computing systems with very high degrees of core-level parallelism, complex node interconnect topologies, and multicore/manycore nodes with non-uniform memory architectures (NUMA). This proposal develops and evaluates α-embedded graph hardware-software models and the attendant data locality-preserving and NUMA-aware application-to-core/thread mappings to enhance performance and parallel scalability.
Consider an application task graph A, weighted with measures of work and data sharing, that is approximately embedded in two or three dimensions to obtain an α-embedded graph A. Additionally, consider a weighted graph of an HPC system that is naturally assigned coordinates to obtain an α-embedded host graph model H. This proposal develops parallel algorithms to compute interconnect topology-aware mappings of A to H in order to optimize performance measures such as congestion and dilation while preserving load balance. Additionally, at a multicore node in H that is assigned a subgraph of A, (i) sparse data are reordered to enhance parallelism and locality, and (ii) dynamic fine-grain NUMA-aware task scheduling is applied to respond, through work-stealing, to performance variations across cores caused by resource conflicts, throttling, etc. Finally, through insights gained from α-embedded graph models, sparse matrix algorithms are reformulated to enhance communication avoidance, soft error resilience, and data preconditioning. Outcomes include enabling weak scaling to a very large number of cores by extracting parallelism at fine, medium, and large grains, and significantly enhanced fixed and scaled problem efficiencies through locality preservation.
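The mapping objectives named above (dilation, congestion, and load balance) can be illustrated with a minimal sketch. It is not the project's algorithm: it only evaluates a given placement of a task graph onto a host, and it assumes a 2D mesh host with dimension-ordered (X then Y) routing, neither of which is specified in the abstract. The names `xy_route` and `mapping_metrics` and the example weights are hypothetical.

```python
from collections import Counter


def xy_route(src, dst):
    """Links crossed going from src to dst on a 2D mesh, X direction first.

    Assumption: dimension-ordered routing; real interconnects may differ.
    """
    (x, y), (tx, ty) = src, dst
    links = []
    while x != tx:
        nx = x + (1 if tx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != ty:
        ny = y + (1 if ty > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def mapping_metrics(task_work, task_edges, placement):
    """Return (max dilation, max congestion, max node load) for a placement.

    task_work:  {task: work weight}
    task_edges: {(u, v): data-sharing weight}
    placement:  {task: (x, y) host coordinate}
    """
    link_load = Counter()
    max_dilation = 0
    for (u, v), weight in task_edges.items():
        path = xy_route(placement[u], placement[v])
        # Dilation of an edge: number of host links its endpoints are apart.
        max_dilation = max(max_dilation, len(path))
        for a, b in path:
            # Links are treated as undirected, so (a, b) and (b, a) coincide.
            link_load[frozenset((a, b))] += weight
    # Congestion: heaviest total traffic routed over any single host link.
    congestion = max(link_load.values(), default=0)
    node_load = Counter()
    for task, work in task_work.items():
        node_load[placement[task]] += work
    return max_dilation, congestion, max(node_load.values(), default=0)


# Hypothetical 4-task graph placed on a 2x2 mesh.
work = {"a": 2, "b": 1, "c": 3, "d": 2}
edges = {("a", "b"): 3, ("b", "c"): 1, ("a", "d"): 2, ("a", "c"): 4}
placement = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
print(mapping_metrics(work, edges, placement))  # (2, 7, 3)
```

In this example the edge between `a` and `c` spans two links, giving a dilation of 2, and it shares the link between (0, 0) and (1, 0) with the edge between `a` and `b`, giving a congestion of 7. A mapping algorithm would search over placements to lower such values while keeping node loads even.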
The interconnect topology-aware models and maps could benefit very large-scale HPC workloads through potential incorporation into the Message Passing Interface for enhanced sparse communications. Additionally, the proposed locality-aware mappings and NUMA-aware scheduling can potentially benefit the very large base of modeling and simulation applications that run on small multicore clusters. Graduate student training is enhanced through a "scale-up" challenge component in an interdisciplinary course on computational science and engineering. High school students are introduced to parallel computing through summer in-residence programs that seek to broaden participation in science and engineering by underrepresented communities.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Guillaume Aupy, JeongHyung Park, Padma Raghavan "Locality-Aware Laplacian Mesh Smoothing" 45th International Conference on Parallel Processing (ICPP 2016), 2016, p. 588 http://dx.doi.org/10.1109/ICPP.2016.74

Humayun Kabir, Joshua Dennis Booth, Guillaume Aupy, Anne Benoit, Yves Robert, Padma Raghavan "STS-k: a multilevel sparse triangular solution scheme for NUMA multicores" Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '15), ACM, New York, NY, USA, Article 55, 11 pages, 2015 http://dx.doi.org/10.1145/2807591.2807667

Joshua Dennis Booth, Jagadish Kotra, Hui Zhao, Mahmut T. Kandemir, Padma Raghavan "Phase Detection with Hidden Markov Models for DVFS on Many-Core Processors" 35th IEEE International Conference on Distributed Computing Systems (ICDCS 2015), Columbus, OH, USA, June 29 - July 2, 2015, p. 185 http://dx.doi.org/10.1109/ICDCS.2015.27

Shad Kirmani, Jeonghyung Park, Padma Raghavan "An embedded sectioning scheme for multiprocessor topology-aware mapping of irregular applications" International Journal of High Performance Computing Applications, 2015 http://dx.doi.org/10.1177/1094342015597082
