Award Abstract # 2028818
Collaborative Research: PPoSS: Planning: Scaling Secure Serverless Computing on Heterogeneous Datacenters

NSF Org: CCF, Division of Computing and Communication Foundations
Recipient: UNIVERSITY OF WISCONSIN SYSTEM
Initial Amendment Date: August 10, 2020
Latest Amendment Date: August 10, 2020
Award Number: 2028818
Award Instrument: Standard Grant
Program Manager: Danella Zhao
CCF, Division of Computing and Communication Foundations
CSE, Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2020
End Date: September 30, 2022 (Estimated)
Total Intended Award Amount: $89,832.00
Total Awarded Amount to Date: $89,832.00
Funds Obligated to Date: FY 2020 = $89,832.00
History of Investigator:
  • Michael Swift (Principal Investigator)
    swift@cs.wisc.edu
  • Xiangyao Yu (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Wisconsin-Madison
21 N PARK ST STE 6301
MADISON
WI  US  53715-1218
(608)262-3822
Sponsor Congressional District: 02
Primary Place of Performance: University of Wisconsin-Madison
1210 West Dayton Street
Madison
WI  US  53706-1613
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): LCLSJAGTNZQ7
Parent UEI:
NSF Program(s): PPoSS-PP of Scalable Systems
Primary Program Source: 01002021DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 026Z
Program Element Code(s): 042Y00
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Cloud computing is a dominant computing paradigm that enables many important capabilities, including large-scale (big) data processing, artificial intelligence, and scientific discovery. A recent evolution of cloud computing is the move to serverless computing, which simplifies the deployment of computation while enabling better scaling and higher resource utilization. Meanwhile, datacenters, the backbone of cloud computing, increasingly include heterogeneous compute and memory resources. Together, the move toward serverless computing and the heterogeneous architecture of datacenters produce a gap that, unless addressed, results in inefficient use of resources. This project seeks to close that gap so that new applications and new functionality can be provided in the cloud at lower cost and with higher security, providing platforms for the advancement of science, engineering, and commerce.

Future datacenters will consist of heterogeneous compute and memory. Applications in the cloud are increasingly varied in their requirements, such as the degree and granularity of parallelism; memory latency, capacity, and bandwidth; and security and privacy. This project investigates serverless computing as a promising programming model for heterogeneous platforms. Serverless platforms decouple system management from application execution: applications provide functions that manipulate data and leave it to the platform to determine when a function should run, with what input data, and on what physical machine. Current platforms, such as AWS Lambda, Google Cloud Functions, or Azure Functions, do not fully implement this vision, as they neither expose heterogeneous resources nor manage all resources automatically. This project explores novel abstractions for compute, extending serverless functions to better leverage unique hardware characteristics, and for memory, allowing the platform to automatically exploit workload characteristics such as locality and compute intensity.
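To make the decoupling concrete, the sketch below shows the general shape of a serverless function as commercial platforms expose it. The handler signature follows the AWS Lambda Python convention; the event fields and the word-count logic are illustrative assumptions, not code from this project.

    # Minimal sketch of the serverless programming model: the application
    # supplies only the per-event logic; the platform decides when this
    # runs, on which machine, and with what input 'event'.
    import json

    def handler(event, context):
        # 'event' carries whatever input the platform chose to deliver;
        # the "body" field here is a hypothetical payload for illustration.
        text = event.get("body", "")
        counts = {}
        for word in text.split():
            counts[word] = counts.get(word, 0) + 1
        return {"statusCode": 200, "body": json.dumps(counts)}

Because the application never names a machine, the platform is free to schedule the function wherever capacity exists, which is exactly the flexibility this project proposes to extend to heterogeneous hardware.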

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


Shanbhag, Anil and Yogatama, Bobbi W. and Yu, Xiangyao and Madden, Samuel. "Tile-based Lightweight Integer Compression in GPU." Proceedings of the 2022 International Conference on Management of Data (SIGMOD '22), 2022. https://doi.org/10.1145/3514221.3526132
Yang, Yifei and Youill, Matt and Woicik, Matthew and Liu, Yizhou and Yu, Xiangyao and Serafini, Marco and Aboulnaga, Ashraf and Stonebraker, Michael. "FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS." Proceedings of the VLDB Endowment, v.14, 2021. https://doi.org/10.14778/3476249.3476265
Yogatama, Bobbi W. and Gong, Weiwei and Yu, Xiangyao. "Orchestrating Data Placement and Query Execution in Heterogeneous CPU-GPU DBMS." Proceedings of the VLDB Endowment, v.15, 2022. https://doi.org/10.14778/3551793.3551809

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Cloud computing is a growing computing paradigm that enables important capabilities, including large-scale data processing. The cloud lets users leverage heterogeneous hardware for high-performance computation without the capital expenditure of acquiring the hardware devices. In particular, Graphics Processing Units (GPUs) are computational devices that achieve much higher computational power through massive parallelism; they are widely used in applications including graphics, machine learning, and scientific computing. In this project, we used GPUs to accelerate data analytics applications to achieve much higher processing speed. Specifically, we conducted the following two tasks:

1. Data compression on the GPU.

A key constraint of existing GPU-based data analytics systems is the limited memory capacity of GPU devices, which means only small workloads can benefit from GPU acceleration. Data compression is a powerful technique that mitigates this capacity limitation in two ways: (1) fitting more data into GPU memory and (2) speeding up data transfer between CPU and GPU. However, today's GPU compression schemes are still limited in compression ratio and/or decompression speed. We identified two limiting factors in existing approaches. First, existing decompression solutions require multiple passes over global memory to decode layered compression schemes, incurring significant memory traffic and hurting performance. We present a tile-based decompression model that decompresses encoded data in a single pass over global memory, inline with query execution. Second, we developed an efficient GPU implementation of bit-packing-based compression schemes and their optimization techniques. Our evaluation shows that our schemes achieve compression ratios similar to the best state-of-the-art GPU compression schemes (i.e., nvCOMP) while being 2.2x faster in decompression speed and 2.6x faster in query running time.
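To illustrate the bit-packing idea, here is a minimal NumPy sketch of the general technique, written in plain Python for clarity rather than as the project's actual CUDA kernels: each value in a tile is stored with just enough bits to represent the tile's maximum, and decoding recovers values with shifts and masks.

    # Minimal sketch of bit-packing compression (the general technique;
    # the project's tile-based, single-pass GPU implementation is far
    # more optimized than this illustration).
    import numpy as np

    def pack(values, bit_width):
        """Pack unsigned integers into a dense byte array."""
        packed = np.zeros((len(values) * bit_width + 7) // 8, dtype=np.uint8)
        for i, v in enumerate(values):
            for b in range(bit_width):
                if (int(v) >> b) & 1:
                    pos = i * bit_width + b
                    packed[pos // 8] |= np.uint8(1 << (pos % 8))
        return packed

    def unpack(packed, bit_width, count):
        """Decode 'count' values from the packed byte array."""
        out = np.zeros(count, dtype=np.uint32)
        for i in range(count):
            for b in range(bit_width):
                pos = i * bit_width + b
                if packed[pos // 8] & (1 << (pos % 8)):
                    out[i] |= np.uint32(1 << b)
        return out

    tile = np.array([3, 7, 1, 6, 2, 5], dtype=np.uint32)
    width = int(tile.max()).bit_length()  # 3 bits suffice for values up to 7
    assert (unpack(pack(tile, width), width, len(tile)) == tile).all()

In the tile-based GPU design described above, each tile is decoded in a single pass over global memory and the decoded values feed directly into query operators, avoiding the extra memory traffic of materializing intermediate results.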

2. Hybrid CPU/GPU processing. 

Heterogeneous CPU-GPU query execution is another compelling approach to mitigating the limited GPU memory capacity and the limited PCIe bandwidth between CPU and GPU. However, the design space of heterogeneous CPU-GPU query execution has not been fully explored. We improved a state-of-the-art CPU-GPU data analytics engine by optimizing data placement and heterogeneous query execution. First, we introduced a semantic-aware, fine-grained caching policy that takes into account aspects of the workload such as query semantics, data correlation, and query frequency when determining data placement between CPU and GPU. Second, we introduced a heterogeneous query executor that fully exploits data in both CPU and GPU memory and coordinates query execution at a fine granularity. We integrated both solutions into a novel hybrid CPU-GPU data analytics engine that we developed. Evaluation on the Star Schema Benchmark shows that the semantic-aware caching policy outperforms the best traditional caching policy by up to 3x.
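The report does not spell out the policy's scoring function, so the following sketch is only a plausible illustration of column-granularity, frequency-aware placement: columns touched more often by queries stay resident in GPU memory, and colder columns are evicted first. The class and field names are hypothetical.

    # Illustrative column-granularity GPU cache keyed on access frequency.
    # This toy models only query frequency; the project's semantic-aware
    # policy also exploits query semantics and data correlation.
    class GpuColumnCache:
        def __init__(self, capacity_bytes):
            self.capacity = capacity_bytes
            self.resident = {}    # column name -> size in bytes
            self.frequency = {}   # column name -> times referenced

        def reference(self, column, size):
            """Record a query touching 'column'; cache it if it pays off."""
            self.frequency[column] = self.frequency.get(column, 0) + 1
            if column in self.resident:
                return
            # Evict the coldest resident columns until the new one fits.
            while sum(self.resident.values()) + size > self.capacity:
                if not self.resident:
                    return        # column exceeds total GPU capacity
                victim = min(self.resident, key=lambda c: self.frequency[c])
                if self.frequency[victim] >= self.frequency[column]:
                    return        # nothing resident is colder; do not cache
                del self.resident[victim]
            self.resident[column] = size

    cache = GpuColumnCache(capacity_bytes=100)
    for col in ["lo_revenue", "lo_discount", "lo_revenue", "d_year"]:
        cache.reference(col, size=40)
    print(sorted(cache.resident))   # hot columns kept in GPU memory

Unlike a recency-based policy such as LRU, a frequency-weighted policy of this shape keeps repeatedly queried columns on the GPU even when a burst of one-off queries touches cold data, which is one way workload awareness can beat traditional caching.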


Last Modified: 01/23/2023
Modified by: Xiangyao Yu
