
NSF Org: CCF Division of Computing and Communication Foundations
Recipient:
Initial Amendment Date: August 10, 2020
Latest Amendment Date: August 10, 2020
Award Number: 2028818
Award Instrument: Standard Grant
Program Manager: Danella Zhao, CCF Division of Computing and Communication Foundations, CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2020
End Date: September 30, 2022 (Estimated)
Total Intended Award Amount: $89,832.00
Total Awarded Amount to Date: $89,832.00
Funds Obligated to Date:
History of Investigator:
Recipient Sponsored Research Office: 21 N PARK ST STE 6301, MADISON, WI, US 53715-1218, (608) 262-3822
Sponsor Congressional District:
Primary Place of Performance: 1210 West Dayton Street, Madison, WI, US 53706-1613
Primary Place of Performance Congressional District:
Unique Entity Identifier (UEI):
Parent UEI:
NSF Program(s): PPoSS-PP of Scalable Systems
Primary Program Source:
Program Reference Code(s):
Program Element Code(s):
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Cloud computing has become a dominant computing paradigm that enables many important capabilities, including large-scale (big) data processing, artificial intelligence, and scientific discovery. A recent evolution of cloud computing is the move to serverless computing, which simplifies the deployment of computation while enabling better scaling and higher resource utilization. Meanwhile, datacenters, the backbone of cloud computing, increasingly include heterogeneous compute and memory resources. Together, the move toward serverless computing and the heterogeneous architecture of datacenters produce a gap that, unless addressed, results in inefficient use of resources. The project seeks to close this gap in order to enable new applications and new functionalities in the cloud, at lower cost and with higher security, providing platforms for the advancement of science, engineering, and commerce.
Future datacenters will consist of heterogeneous compute and memory resources. Applications in the cloud are increasingly varied in their requirements, such as the degree and granularity of parallelism; memory latency, capacity, and bandwidth; and security and privacy. This project investigates serverless computing as a promising programming model for heterogeneous platforms. Serverless platforms decouple system management from application execution: applications provide functions that manipulate data and leave it to the platform to determine when a function should run, with what input data, and on what physical machine. Current platforms, such as AWS Lambda, Google Cloud Functions, or Azure Functions, do not fully implement this vision, as they neither expose heterogeneous resources nor manage all resources automatically. This project explores novel abstractions for compute that extend serverless functions to better leverage unique hardware characteristics, and novel abstractions for memory that allow the platform to more automatically exploit workload characteristics such as locality and compute intensity.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Cloud computing is a growing computing paradigm that enables important capabilities including large-scale data processing. The cloud allows users to leverage heterogeneous hardware for high-performance compute and memory without the capital expenditure of acquiring the hardware devices. In particular, Graphics Processing Units (GPUs) are computational devices that achieve much higher computational power through massive parallelism and have been widely used in applications including graphics, machine learning, and scientific computing. In this project, we use GPUs to accelerate data analytics applications to achieve much higher processing speed. Specifically, we conducted the following two tasks:
1. Data compression on GPUs.
A key constraint of existing GPU-based data analytics systems is the limited memory capacity of GPU devices, so only small workloads can leverage GPU acceleration. Data compression is a powerful technique that can mitigate this capacity limitation in two ways: (1) fitting more data into GPU memory and (2) speeding up data transfer between CPU and GPU. However, today's compression schemes for GPUs are still limited in compression ratio and/or decompression speed. We identify two limiting factors of existing approaches. First, existing decompression solutions require multiple passes over global memory to decode layered compression schemes, incurring significant memory traffic and hurting performance. We present a tile-based decompression model that decompresses encoded data in a single pass over global memory, inline with query execution. Second, we develop an efficient GPU implementation of bit-packing-based compression schemes and their optimization techniques (sketched below). Our evaluation shows that our schemes achieve compression ratios similar to the best state-of-the-art GPU compression schemes (i.e., nvCOMP) while being 2.2x faster in decompression speed and 2.6x faster in query running time, respectively.
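To make the bit-packing idea concrete, the following is a minimal CUDA sketch, not the project's actual code: integers packed at a fixed bit width are decoded in a single pass, with each thread computing the bit offset of its value and reading at most two adjacent 32-bit words from global memory. The kernel name, the chosen width, and the one-word padding convention are illustrative assumptions.

```
// Sketch (assumed names/layout): single-pass GPU decode of fixed-width
// bit-packed 32-bit integers.
#include <cstdint>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void unpack_kernel(const uint32_t* packed, uint32_t* out,
                              int n, int bits) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    uint64_t bit_pos  = (uint64_t)i * bits;        // absolute bit offset
    uint64_t word_idx = bit_pos / 32;
    uint32_t shift    = (uint32_t)(bit_pos % 32);
    // A value may straddle two 32-bit words; the buffer carries one extra
    // padding word so this two-word read never goes out of bounds.
    uint64_t window = ((uint64_t)packed[word_idx + 1] << 32) | packed[word_idx];
    out[i] = (uint32_t)((window >> shift) & ((1ull << bits) - 1));
}

int main() {
    const int n = 1024, bits = 7;                  // toy column, 7-bit values
    std::vector<uint32_t> values(n), packed((size_t)n * bits / 32 + 2, 0);
    for (int i = 0; i < n; i++) values[i] = i % (1u << bits);
    // CPU-side packing: OR each value into its bit position.
    for (int i = 0; i < n; i++) {
        uint64_t p = (uint64_t)i * bits;
        packed[p / 32] |= values[i] << (p % 32);
        if (p % 32 + bits > 32)                    // spills into next word
            packed[p / 32 + 1] |= values[i] >> (32 - p % 32);
    }
    uint32_t *d_packed, *d_out;
    cudaMalloc(&d_packed, packed.size() * sizeof(uint32_t));
    cudaMalloc(&d_out, n * sizeof(uint32_t));
    cudaMemcpy(d_packed, packed.data(), packed.size() * sizeof(uint32_t),
               cudaMemcpyHostToDevice);
    unpack_kernel<<<(n + 255) / 256, 256>>>(d_packed, d_out, n, bits);
    std::vector<uint32_t> out(n);
    cudaMemcpy(out.data(), d_out, n * sizeof(uint32_t), cudaMemcpyDeviceToHost);
    printf("out[42] = %u (expected %u)\n", out[42], values[42]);
    cudaFree(d_packed); cudaFree(d_out);
    return 0;
}
```

Because a value straddles at most two 32-bit words, one padding word at the end of the packed buffer keeps the two-word read in bounds; an engine in the spirit of the tile-based model would fuse this decode with downstream query operators rather than materializing `out`.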
2. Hybrid CPU/GPU processing.
Heterogeneous CPU-GPU query execution is another compelling approach to mitigating the limited GPU memory capacity and the limited PCIe bandwidth between CPU and GPU. However, the design space of heterogeneous CPU-GPU query execution has not been fully explored. We improve on state-of-the-art CPU-GPU data analytics engines by optimizing data placement and heterogeneous query execution. First, we introduce a semantic-aware, fine-grained caching policy that takes into account various aspects of the workload, such as query semantics, data correlation, and query frequency, when determining data placement between CPU and GPU. Second, we introduce a heterogeneous query executor that can fully exploit data in both CPU and GPU memory and coordinate query execution at a fine granularity. We integrate both solutions into a novel hybrid CPU-GPU data analytics engine that we developed. Evaluation on the Star Schema Benchmark shows that the semantic-aware caching policy outperforms the best traditional caching policy by up to 3x. A simplified sketch of the execution idea follows.
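The CUDA sketch below illustrates the heterogeneous-execution idea under toy assumptions (the fixed partition split, the predicate, and all names are hypothetical, not the engine's actual interface): the portion of a column that a caching policy has placed in GPU memory is scanned by an asynchronously launched kernel while the CPU scans the uncached remainder, and the partial counts are merged at the end.

```
// Sketch: a filter-count query split across GPU-resident and CPU-resident data.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Count values >= threshold in the GPU-resident partition.
__global__ void count_ge_kernel(const int* data, int n, int threshold,
                                unsigned long long* count) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && data[i] >= threshold)
        atomicAdd(count, 1ull);
}

int main() {
    const int n = 1 << 20, threshold = 900;
    std::vector<int> col(n);
    for (int i = 0; i < n; i++) col[i] = i % 1000;

    // Toy placement decision: pretend the caching policy pinned the first
    // half of the column in GPU memory because queries touch it most often.
    int gpu_n = n / 2;
    int* d_col; unsigned long long* d_cnt;
    cudaMalloc(&d_col, gpu_n * sizeof(int));
    cudaMalloc(&d_cnt, sizeof(unsigned long long));
    cudaMemcpy(d_col, col.data(), gpu_n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_cnt, 0, sizeof(unsigned long long));

    // The kernel launch is asynchronous, so the GPU scans its cached
    // partition while the CPU scans the uncached remainder concurrently.
    count_ge_kernel<<<(gpu_n + 255) / 256, 256>>>(d_col, gpu_n, threshold, d_cnt);
    unsigned long long cpu_cnt = 0;
    for (int i = gpu_n; i < n; i++)
        if (col[i] >= threshold) cpu_cnt++;

    // cudaMemcpy synchronizes with the kernel before merging partial counts.
    unsigned long long gpu_cnt = 0;
    cudaMemcpy(&gpu_cnt, d_cnt, sizeof(gpu_cnt), cudaMemcpyDeviceToHost);
    printf("matching rows: %llu\n", gpu_cnt + cpu_cnt);
    cudaFree(d_col); cudaFree(d_cnt);
    return 0;
}
```

The actual executor described above coordinates at a much finer granularity and chooses the split using query semantics rather than a fixed fraction, but the overlap pattern, asynchronous GPU work plus a concurrent CPU scan, is the core of hybrid execution.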
Last Modified: 01/23/2023
Modified by: Xiangyao Yu