Award Abstract # 1824303
SPX: Collaborative Research: Harnessing the Power of High-Bandwidth Memory via Provably Efficient Parallel Algorithms

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: CARNEGIE MELLON UNIVERSITY
Initial Amendment Date: February 27, 2018
Latest Amendment Date: February 27, 2018
Award Number: 1824303
Award Instrument: Standard Grant
Program Manager: Damian Dechev
CCF
 Division of Computing and Communication Foundations
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: January 1, 2018
End Date: August 31, 2022 (Estimated)
Total Intended Award Amount: $249,999.00
Total Awarded Amount to Date: $249,999.00
Funds Obligated to Date: FY 2017 = $249,999.00
History of Investigator:
  • Benjamin Moseley (Principal Investigator)
    moseleyb85@gmail.com
Recipient Sponsored Research Office: Carnegie-Mellon University
5000 FORBES AVE
PITTSBURGH
PA  US  15213-3815
(412)268-8746
Sponsor Congressional District: 12
Primary Place of Performance: Carnegie-Mellon University
5000 Forbes Avenue
Pittsburgh
PA  US  15213-3815
Primary Place of Performance
Congressional District:
12
Unique Entity Identifier (UEI): U3NKNFLNQ613
Parent UEI: U3NKNFLNQ613
NSF Program(s): PPoSS-PP of Scalable Systems
Primary Program Source: 01001718DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 026Z, 9150
Program Element Code(s): 042Y00
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

An important bottleneck for many parallel scientific applications is memory performance. Recently, vendors have introduced a new memory called high-bandwidth memory (HBM) as an approach to alleviate this bottleneck. This project will develop an algorithmic foundation for using HBM. The project has the potential for a broad economic, technological, and scientific impact, since industry has an investment in this technology and many of the nation's strategic codes are being run on HBM-capable machines. The PIs will integrate research with education at the graduate and undergraduate levels by training PhD, MS, and honors program BS students in cross-cutting issues encompassing algorithm design, high-performance software, and processor architecture.

The new approach offered by vendors is to bond memory (HBM) directly to the processor chip, which allows for more connections, enabling higher bandwidth. Although the size of the new memory is larger than modern on-chip caches, physical constraints limit the capacity of the memory to be significantly smaller than DRAM. HBM does not cleanly fit in the standard memory hierarchy. This project will develop a foundational understanding of how to algorithmically design codes for HBM enhanced architectures. Overcoming these intellectual challenges to achieve multi-core scalability using HBM requires new algorithms, models, and abstractions, spearheaded by this collaboration between researchers who study hardware issues, high performance computing challenges, and theoretical modeling and analysis.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

(Showing: 1 - 10 of 41)
Lu, K. and Lavastida, T. and Moseley, B. and Wang, Y. "Scaling Average-Linkage via Sparse Cluster Embeddings" Asian Conference on Machine Learning , 2021 Citation Details
Moseley, B. "Scheduling to Approximate Minimization Objectives on Identical Machines" ICALP , 2019 https://doi.org/10.4230/LIPIcs.ICALP.2019.86 Citation Details
Abo Khamis, Mahmoud and Curtin, Ryan and Im, Sungjin and Moseley, Ben and Ngo, Hung and Pruhs, Kirk and Samadian, Alireza "An Approximation Algorithm for the Matrix Tree Multiplication Problem" Symposium on Mathematical Foundations of Computer Science , v.202 , 2021 https://doi.org/10.4230/LIPIcs.MFCS.2021.6 Citation Details
Abo Khamis, Mahmoud and Curtin, Ryan R. and Moseley, Benjamin and Ngo, Hung Q. and Nguyen, XuanLong and Olteanu, Dan and Schleich, Maximilian "On Functional Aggregate Queries with Additive Inequalities" Symposium on Principles of Database Systems , 2019 https://doi.org/10.1145/3294052.3319694 Citation Details
Leichter, Marilena and Moseley, Benjamin and Pruhs, Kirk "On the impossibility of decomposing binary matroids" Operations Research Letters , v.50 , 2022 https://doi.org/10.1016/j.orl.2022.09.003 Citation Details
Abo-Khamis, M. and Im, S. and Moseley, B. and Pruhs, K. and Samadian, A. "Approximate Aggregate Queries Under Additive Inequalities" Symposium on Algorithmic Principles of Computer Systems (APOCS) , 2021 https://doi.org/10.1137/1.9781611976489.7 Citation Details
Abo-Khamis, M. and Im, S. and Moseley, B. and Pruhs, K. and Samadian, A. "A Relational Gradient Descent Algorithm For Support Vector Machine Training" Symposium on Algorithmic Principles of Computer Systems (APOCS) , 2021 https://doi.org/10.1137/1.9781611976489.8 Citation Details
Agrawal, Kunal and Lee, I-Ting Angelina and Li, Jing and Lu, Kefu and Moseley, Benjamin "Practically Efficient Scheduler for Minimizing Average Flow Time of Parallel Jobs" 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) , 2019 10.1109/IPDPS.2019.00024 Citation Details
Ahmadian, S and Epasto, A and Knittel, M and Kumar, R and Mahdian, M and Moseley, B and Pham, P and Vassilvitskii, S and Wang, Y "Fair Hierarchical Clustering" Advances in neural information processing systems , 2020 https://doi.org/ Citation Details
Benjamin Moseley, Maxim Sviridenko "Submodular Optimization with Contention Resolution Extensions" APPROX , 2019 https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2019.3 Citation Details
Berg, Benjamin and Harchol-Balter, Mor and Moseley, Benjamin and Wang, Weina and Whitehouse, Justin "Optimal Resource Allocation for Elastic and Inelastic Jobs" Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2020) , 2020 10.1145/3350755.3400265 Citation Details
(Showing: 1 - 10 of 41)

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project developed an algorithmic foundation for managing high-bandwidth memory (HBM).  HBM is a new type of memory technology that has higher bandwidth than traditional DRAM, but similar latency.  HBM's bandwidth is so high because it is placed directly onto the processor package (unlike DRAM).  HBM thus augments the existing memory hierarchy by providing a memory that can be accessed with up to 5x higher bandwidth than DDR4 (today's DRAM technology) when feeding a CPU, and up to 20x higher bandwidth when feeding a GPU.


The challenge with modeling HBM is that it does not cleanly fit in the standard memory hierarchy.  This project developed a foundational understanding of how to algorithmically design codes for HBM enhanced architectures.  Specifically, the project developed new algorithms, models, and abstractions, as a result of a tightly coupled collaboration between researchers who study hardware issues, HPC challenges, and theoretical modeling and analysis.  Among our key findings is the fact that automatic HBM management is indeed possible in multicore systems. Among the successes from this project were a series of papers that show the theoretical underpinning of how to manage high-bandwidth memory automatically, establish tight bounds for parallel paging, and desmonstrate algorithms for online parallel paging with optimal makespan.  Our works have shown how to manage HBM automatically, as efficiently as a standard cache, and with provably good performance guarantees.  

 


Last Modified: 02/16/2023
Modified by: Benjamin J Moseley

Please report errors in award information by writing to: awardsearch@nsf.gov.

Print this page

Back to Top of page