
NSF Org: CNS Division Of Computer and Network Systems
Initial Amendment Date: August 2, 2016
Latest Amendment Date: August 2, 2016
Award Number: 1617967
Award Instrument: Standard Grant
Program Manager: Matt Mutka, CNS Division Of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2016
End Date: February 28, 2021 (Estimated)
Total Intended Award Amount: $300,000.00
Total Awarded Amount to Date: $300,000.00
Recipient Sponsored Research Office: 5200 N LAKE RD, MERCED, CA, US 95343-5001, (209) 201-2039
Primary Place of Performance: CA, US 95343-5001
NSF Program(s): CSR-Computer Systems Research
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Heterogeneous computing is becoming crucial for many computational fields, including galaxy simulations, social network analysis, modeling of stock transactions, and more. Programming heterogeneous memory systems is a grand challenge and creates a major obstacle between heterogeneous hardware and applications because of the programming complexity and fast hardware evolution. This project aims to address this obstacle and is expected to significantly relieve programmers of handling the underlying memory system heterogeneity. The outcome of this research will also enable continuous enhancement of the computing efficiency of a number of applications on future heterogeneous systems, which is a critical condition for sustained advancement of science, health, security, and other aspects of humanity.
To address the programming challenges on heterogeneous memory systems, the project investigates a software framework consisting of a hardware specification language, a set of novel compiler and runtime techniques, and advanced memory performance modeling. The goal is to develop a systematic solution that automatically places data on a complex heterogeneous memory system, especially on massively parallel platforms. With the proposed framework, programmers are relieved of tailoring their programs to different memory systems, and at the same time, the capabilities of sophisticated memory systems can be fully translated into high computing efficiency. The framework transforms programs so that they are customized - in terms of where data are placed in memory, when and how data migrate, etc. - to the underlying heterogeneous memory system at runtime and attain near-optimal memory usage.
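The data-placement problem the abstract describes can be illustrated with a minimal sketch. The following Python code is a hypothetical, simplified stand-in for the project's framework (the class and function names are invented for illustration): it models memory tiers by bandwidth and capacity, and greedily places the most frequently accessed data objects, per unit of size, into the fastest tier that still has room.

```python
from dataclasses import dataclass

@dataclass
class MemoryTier:
    name: str
    bandwidth_gbs: float   # sustained bandwidth; higher is faster
    capacity_gb: float

@dataclass
class DataObject:
    name: str
    size_gb: float
    accesses: int          # profiled access count

def place_greedy(objects, tiers):
    """Greedily place the hottest objects (accesses per GB) in the fastest tiers."""
    tiers = sorted(tiers, key=lambda t: -t.bandwidth_gbs)
    remaining = {t.name: t.capacity_gb for t in tiers}
    placement = {}
    # Consider the hottest data per unit of capacity first.
    for obj in sorted(objects, key=lambda o: -o.accesses / o.size_gb):
        for t in tiers:
            if remaining[t.name] >= obj.size_gb:
                placement[obj.name] = t.name
                remaining[t.name] -= obj.size_gb
                break
    return placement
```

A real placement engine would also account for migration cost, phase behavior, and input sensitivity, which is precisely what makes the problem hard enough to need compiler and runtime support.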
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be available without a charge during the embargo (administrative interval). Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Memory heterogeneity means that multiple memory components with different properties (such as memory bandwidth, latency, capacity, and computing ability) form a memory system. Memory heterogeneity is becoming common because of the need to increase memory capacity or provide higher performance in a cost-effective way. It raises challenges in deciding the optimal placement of data objects on heterogeneous memory (HM). Recent studies indicate the substantial difficulty of matching applications with HM because of the complex and fast-changing nature of HM as well as application input sensitivity and phase behaviors.
Intellectual merit. We study system-level solutions to make the best use of HM for high performance. Our solutions are based on the idea of introducing limited application semantics information to direct data migration and allocation. Using application semantics, we are able to break the fundamental tradeoff between memory profiling overhead and accuracy, and to decide when to trigger data migration so as to maximize the overlap between data migration and computation, minimizing migration overhead.
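The overlap between data migration and computation mentioned above is a pipelining idea. The following is a minimal Python sketch, not the project's actual runtime: while the application computes on the current chunk of data, a background thread migrates the next chunk (the function names and the thread-based "migration" are illustrative stand-ins for asynchronous copies between slow and fast memory).

```python
import threading

def process_overlapped(chunks, compute, migrate):
    """Compute on chunk i while chunk i+1 migrates in a background thread."""
    results = []
    current = migrate(chunks[0])                 # the first migration cannot be hidden
    for i in range(len(chunks)):
        box, worker = {}, None
        if i + 1 < len(chunks):
            # Start migrating the next chunk before computing on the current one.
            worker = threading.Thread(
                target=lambda nc=chunks[i + 1]: box.setdefault("data", migrate(nc)))
            worker.start()
        results.append(compute(current))         # overlapped with the migration
        if worker is not None:
            worker.join()
            current = box["data"]
    return results
```

If each migration takes no longer than the computation it overlaps with, only the first migration adds latency; the rest are hidden.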
Furthermore, we study application-level solutions to make the best use of HM for high performance. We propose new data structures and algorithms to reduce expensive accesses to slow memory as much as possible. Those solutions are application-specific but bring much higher performance than system-level solutions; they focus on critical and common applications, which justifies heavily customizing the solutions for those applications. Both the system-level and application-level solutions investigate principles for how a large number of memory pages should be profiled to capture spatial and temporal localities without paying large overhead, and how page migration should happen to fully utilize fast memory.
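One common way to profile many pages without paying large overhead is to sample accesses rather than count all of them. The sketch below is a hypothetical illustration in Python (not the project's profiler): it samples a configurable fraction of an address trace, aggregates counts by page, and reports the estimated hottest pages as migration candidates for fast memory.

```python
import random
from collections import Counter

def sample_hot_pages(access_trace, page_size, sample_rate, top_k, seed=0):
    """Estimate the hottest pages by counting only a sampled fraction of accesses."""
    rng = random.Random(seed)
    counts = Counter()
    for addr in access_trace:
        if rng.random() < sample_rate:   # sampling bounds the profiling overhead
            counts[addr // page_size] += 1
    # Pages with the highest sampled counts are candidates for fast memory.
    return [page for page, _ in counts.most_common(top_k)]
```

Lowering `sample_rate` trades accuracy for overhead, which is exactly the tradeoff that application semantics can help break, as described above.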
Broader impact. This project enables applications to fully tap the large memory capacity provided by HM. Some of those applications are critical to national interests (such as the DOE application WarpX, a large-scale plasma simulation code); some are critical to business (such as deep learning training and fast information retrieval). With our solutions, those applications are able to run at unprecedented scales on a single machine, even performing better than on multiple machines. This project has been highlighted by several media outlets and companies (e.g., towardsdatascience.com, Microsoft, and linkreseacher.com). It lays the foundation for many HPC applications (including compute-intensive applications with small memory footprints) to leverage the large memory capacity of HM. This project is among the first efforts to reveal that using limited application semantics can significantly improve application performance on HM.
Furthermore, this project provides research opportunities for undergraduate students to gain hands-on experience with software-hardware co-design. This project is also based on collaboration with Lawrence Berkeley National Lab and Lawrence Livermore National Lab, and it has impacts on how future supercomputer infrastructure should be built. Collaborating with the national labs, we provide training opportunities to graduate students and prepare them for future careers in the HPC field. Since the HPC field, which is critical to national interests, lacks workforce, our project helps address this pressing problem.
Last Modified: 04/13/2021
Modified by: Dong Li
Please report errors in award information by writing to: awardsearch@nsf.gov.