
NSF Org: | CCF Division of Computing and Communication Foundations |
Recipient: | |
Initial Amendment Date: | July 22, 2024 |
Latest Amendment Date: | July 22, 2024 |
Award Number: | 2403012 |
Award Instrument: | Standard Grant |
Program Manager: | Almadena Chtchelkanova, achtchel@nsf.gov, (703)292-7498, CCF Division of Computing and Communication Foundations, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | August 1, 2024 |
End Date: | July 31, 2028 (Estimated) |
Total Intended Award Amount: | $727,999.00 |
Total Awarded Amount to Date: | $727,999.00 |
Funds Obligated to Date: | |
History of Investigator: | |
Recipient Sponsored Research Office: | 1 SILBER WAY BOSTON MA US 02215-1703 (617)353-4365 |
Sponsor Congressional District: | |
Primary Place of Performance: | 881 Commonwealth Avenue BOSTON MA US 02215-1703 |
Primary Place of Performance Congressional District: | |
Unique Entity Identifier (UEI): | |
Parent UEI: | |
NSF Program(s): | Software & Hardware Foundation |
Primary Program Source: | |
Program Reference Code(s): | |
Program Element Code(s): | |
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
The ability of computing systems to process large amounts of data efficiently and in a timely manner is a key enabler in the modern landscape of data-driven applications. To bridge the widening gap between memory technology and processors, computing systems continue to rely heavily on complex multi-level cache hierarchies. Caches can prevent costly accesses to downstream memory if the processed data items exhibit good spatiotemporal locality. Unfortunately, locality does not always emerge naturally in complex data processing pipelines. Platform-specific algorithmic optimizations are often necessary to rearrange the algorithm's memory access pattern for better locality while striving to maintain the original semantics. When operating on high-dimensional objects (e.g., tensors), data locality unlocks crucial performance gains, but it becomes harder to achieve. This project proposes a novel class of architectural data transformation units to be interposed between memory and compute units such as Central Processing Units (CPUs) and Graphics Processing Units (GPUs). By relying on knowledge of the data access pattern dictated by the algorithm's semantics, these units decouple the in-memory geometry of data items from the access sequence required by the computational logic. As such, through on-the-fly transformations and without data duplication, they make data items requested sequentially appear to the processing unit and the cache hierarchy as if they were stored sequentially. This enables spatiotemporal locality to be achieved effortlessly, i.e., without heavy algorithmic re-engineering. The findings will be integrated into undergraduate and graduate courses at Boston University and the University of Kansas, enhancing topics such as data systems, system performance evaluation, embedded real-time systems, and operating systems. The project will support underrepresented populations across educational levels and foster strong industry connections.
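The locality problem described above can be made concrete with a minimal C sketch (illustrative only; the matrix size, code, and workload are assumptions, not material from the award). It contrasts a strided, column-wise traversal of a row-major matrix with the conventional software remedy of materializing a transposed copy.

#include <stdio.h>
#include <stdlib.h>

#define N 1024  /* illustrative matrix dimension */

/* Row-major layout: element (i, j) lives at a[i * N + j]. */

/* Column-wise sum with poor spatial locality: consecutive iterations touch
   addresses N * sizeof(double) bytes apart, so nearly every access lands in
   a different cache line. */
static double sum_column_strided(const double *a, size_t col) {
    double s = 0.0;
    for (size_t i = 0; i < N; i++)
        s += a[i * N + col];
    return s;
}

/* Conventional software remedy: materialize a transposed copy so a logical
   column becomes physically contiguous.  Locality is restored, but at the
   price of duplicated data and an extra pass over memory. */
static void transpose(const double *a, double *at) {
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            at[j * N + i] = a[i * N + j];
}

static double sum_column_contiguous(const double *at, size_t col) {
    double s = 0.0;
    for (size_t i = 0; i < N; i++)
        s += at[col * N + i];   /* unit stride, cache friendly */
    return s;
}

int main(void) {
    double *a  = malloc(sizeof(double) * N * N);
    double *at = malloc(sizeof(double) * N * N);
    if (!a || !at) return 1;
    for (size_t k = 0; k < (size_t)N * N; k++)
        a[k] = (double)k;

    transpose(a, at);
    printf("strided sum:    %.0f\n", sum_column_strided(a, 7));
    printf("contiguous sum: %.0f\n", sum_column_contiguous(at, 7));

    free(a);
    free(at);
    return 0;
}

A data transformation unit interposed between memory and the processor would, in principle, present the strided column as a sequential stream directly, removing both the transpose pass and the duplicated buffer.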
This project explores the theory and practice concerning the formulation, design, and implementation of architectural on-the-fly Data Transformation Units (DTUs). It does so along four interconnected research thrusts. First, the investigators focus on developing a foundational science of on-the-fly data transformation; a key stepping stone is formulating an access pattern specification language that is both expressive and efficiently interpretable in hardware. In the second thrust, two alternative architectural paradigms are explored: (1) integrating DTUs as a component logically placed on the memory bus and (2) integrating a DTU directly into the memory controller. Doing so places data transformation as close as possible to the memory cells to exploit their inherent parallelism while supporting unmodified commercial memory modules. The third thrust explores which programming models can best empower application designers to use DTUs via a combination of instruction-set architecture extensions, operating system-level support, and user-space libraries. Finally, the fourth thrust aims at identifying widely adopted data processing pipelines that can greatly benefit from DTUs, focusing specifically on relational databases and machine learning; these will be used to concretely showcase the potential of the proposed on-the-fly data transformation approach.
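The programming-model thrust can be pictured with a hypothetical user-space interface. The sketch below is purely illustrative: the identifiers (dtu_pattern, dtu_map_view, dtu_unmap_view), the descriptor fields, and the software gather standing in for the hardware path are all assumptions, not the project's actual API or design.

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space view of a Data Transformation Unit (DTU).  Every
   identifier here is an illustrative placeholder.  The software gather in
   dtu_map_view stands in for what real hardware would do on the fly,
   without materializing a duplicate copy in memory. */

/* A simple strided access-pattern descriptor: the kind of compact,
   hardware-interpretable specification an access pattern language might
   encode. */
typedef struct {
    const char *base;   /* address of the first element in the pattern */
    size_t elem_size;   /* bytes per element */
    size_t count;       /* number of elements in the view */
    ptrdiff_t stride;   /* byte distance between consecutive elements */
} dtu_pattern;

/* Software stand-in: gather the pattern into a contiguous buffer.  A DTU
   would instead serve these elements to the processor sequentially,
   straight from memory. */
static void *dtu_map_view(const dtu_pattern *p) {
    char *view = malloc(p->count * p->elem_size);
    if (!view)
        return NULL;
    for (size_t i = 0; i < p->count; i++)
        memcpy(view + i * p->elem_size,
               p->base + (ptrdiff_t)i * p->stride,
               p->elem_size);
    return view;
}

static void dtu_unmap_view(void *view) { free(view); }

int main(void) {
    enum { N = 8 };
    double m[N][N];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            m[i][j] = i * N + j;

    /* Expose column 3 of the row-major matrix as a contiguous vector. */
    dtu_pattern col = {
        .base      = (const char *)&m[0][3],
        .elem_size = sizeof(double),
        .count     = N,
        .stride    = (ptrdiff_t)(N * sizeof(double)),
    };
    double *v = dtu_map_view(&col);
    if (!v) return 1;
    for (int i = 0; i < N; i++)
        printf("%.0f ", v[i]);
    printf("\n");
    dtu_unmap_view(v);
    return 0;
}

In a full design, such a descriptor might instead be conveyed through an instruction-set extension or an operating-system interface rather than a library call, which is the design space the programming-model thrust investigates.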
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.