Award Abstract # 2202859
SPX: Collaborative Research: NG4S: A Next-generation Geo-distributed Scalable Stateful Stream Processing System

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: VIRGINIA POLYTECHNIC INSTITUTE & STATE UNIVERSITY
Initial Amendment Date: December 13, 2021
Latest Amendment Date: December 13, 2021
Award Number: 2202859
Award Instrument: Standard Grant
Program Manager: Marilyn McClure
mmcclure@nsf.gov
 (703)292-5197
CCF
 Division of Computing and Communication Foundations
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: February 1, 2022
End Date: May 31, 2023 (Estimated)
Total Intended Award Amount: $299,266.00
Total Awarded Amount to Date: $195,938.00
Funds Obligated to Date: FY 2019 = $179,938.00
FY 2020 = $16,000.00
History of Investigator:
  • Liting Hu (Principal Investigator)
    liting@ucsc.edu
Recipient Sponsored Research Office: Virginia Polytechnic Institute and State University
300 TURNER ST NW
BLACKSBURG
VA  US  24060-3359
(540)231-5281
Sponsor Congressional District: 09
Primary Place of Performance: Virginia Polytechnic Institute and State University
300 Turner Street NW - Suite 4200
Blacksburg
VA  US  24061-0001
Primary Place of Performance
Congressional District:
09
Unique Entity Identifier (UEI): QDE5UHE5XD16
Parent UEI: X6KEFGLHSJX7
NSF Program(s): PPoSS-PP of Scalable Systems,
Special Projects - CNS
Primary Program Source: 01001920DB NSF RESEARCH & RELATED ACTIVIT
01002021DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 026Z, 9102, 9251
Program Element Code(s): 042Y00, 171400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Our society increasingly relies on applications that process streaming data across geo-distributed sites, such as making business decisions from marketing data, identifying spam campaigns in social network streams, and analyzing genome datasets in different labs and countries to track the sources of potential epidemics. State-of-art solutions for these needs are centered around stateless stream processing. This project advances stream processing to enable next-generation streaming applications to store and update state along with computation, therefore processing live data streams in a timely fashion from massive and geo-distributed datasets. Existing systems are mainly designed for stateless stream processing in intra-datacenter settings and do not scale well for running stream applications that contain large distributed states. This project breaks the traditional abstractions of a centralized architecture and hashtable-based stateless operators, redefining them with a new decentralized architecture and new memory-efficient stateful operators, which enables novel approaches to improve overall system performance and scalability.

This project builds a next-generation geo-distributed scalable stateful stream processing system that will significantly improve the scalability of stream processing systems. This work includes three primary research directions. (1) At the architecture level, a new decentralized 'many masters/many workers' architecture will be proposed, which provides each master with maximum independence. (2) At the operator level, a new in-memory data structure will be designed and implemented to store application state and minimize the memory overhead so as to handle 'big data' requirements. (3) A new shard-based parallel recovery mechanism will be proposed to handle failures and stragglers in a scalable way. All three parts of the project will be prototyped and implemented on a widely adopted stream processing system (Apache Storm).

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Ching, Cheng-Wei and Gupta, Chirag and Huang, Zi and Hu, Liting "OrcoDCS: An IoT-Edge Orchestrated Online Deep Compressed Sensing Framework" 2023 IEEE 43nd International Conference on Distributed Computing Systems Workshop (ICDCSW) on Edge-to-Cloud AI Orchestration (ECAI) , 2023 Citation Details
Liu, Mingzhe and Liu, Haikun and Ye, Chencheng and Liao, Xiaofei and Jin, Hai and Zhang, Yu and Zheng, Ran and Hu, Liting "Towards low-latency I/O services for mixed workloads using ultra-low latency SSDs" Proceedings of the 36th ACM International Conference on Supercomputing (ICS'22) , 2022 https://doi.org/10.1145/3524059.3532378 Citation Details
Liu, Pinchao and Silva, Dilma Da and Hu, Liting. "DART: A Scalable and Adaptive Edge Stream Processing Engine" 2021 USENIX Annual Technical Conference (USENIX ATC 21) , 2021 Citation Details
Liu, Pinchao and Xu, Hailu and Da Silva, Dilma and Wang, Qingyang and Ahmed, Sarker Tanzir and Hu, Liting "FP4S: Fragment-based Parallel State Recovery for Stateful Stream Applications" 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS) , 2020 https://doi.org/10.1109/IPDPS47924.2020.00116 Citation Details
Xu, Hailu and Lin, Pei-Hung and Emani, Murali and Hu, Liting and Liao, Chunhua "XUnified: A Framework for Guiding Optimal Use of GPU Unified Memory" IEEE Access , v.10 , 2022 https://doi.org/10.1109/ACCESS.2022.3196008 Citation Details
Xu, Hailu and Liu, Pinchao and Ahmed, Sarker Tanzir and Da Silva, Dilma and Hu, Liting "Adaptive Fragment-based Parallel State Recovery for Stream Processing Systems" IEEE Transactions on Parallel and Distributed Systems , 2023 https://doi.org/10.1109/TPDS.2023.3251997 Citation Details
Xu, Hailu and Liu, Pinchao and Cruz-Diaz, Susana and Silva, Dilma Da and Hu, Liting "SR3: Customizable Recovery for Stateful Stream Processing Systems" Proceedings of the 21st International Middleware Conference , 2020 https://doi.org/10.1145/3423211.3425681 Citation Details
Xu, Hailu and Liu, Pinchao and Guan, Boyuan and Wang, Qingyang and Da Silva, Dilma and Hu, Liting "Achieving Online and Scalable Information Integrity by Harnessing Social Spam Correlations" IEEE Access , v.11 , 2023 https://doi.org/10.1109/ACCESS.2023.3236604 Citation Details

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Today, large-scale cloud organizations are deploying data centers and edge clusters globally to provide their users low-latency access to their services. Running stream applications across these geo-distributed sites is emerging as a daily requirement, such as making business decisions from marketing streams, identifying spam campaigns from social network streams, and analyzing existing genomes in different labs and countries to track the sources of a potential epidemic. The goal of this project is to develop a next-generation scalable and stateful stream processing system that processes live data streams in a timely fashion from massive and geo-distributed data sets.

This project spans three complementary thrusts. First, a new decentralized "many masters/many slaves" architecture is designed and implemented, which revolutionarily improves the scalability of stream processing systems. Second, a new in-memory data structure for storing large distributed states is designed and implemented, which significantly amortizes the memory overhead. Third, a new fragmented-based parallel recovery mechanism is designed and implemented, which handles failures and stragglers in a scalable way to ensure system reliability. The system is evaluated and validated on both lab-based prototypes and practical, real-world deployments (cloud testbed).

This project brings an effective and practical solution that benefits broad applications across everyday life, including stock trading, social networks, system monitoring, atmospheric sensing, market feed processing, fraud detection, and many others. The outcomes of this project can also be extended to many other systems software that require memorizing application states, including graph databases, task queues, application data caching systems, event tracking systems, NoSQL stores, and distributed databases. In addition to its technical contributions, this project involves various educational and outreach activities as well. The results of the research are integrated into the undergraduate and graduate systems courses. The toolkit, source code, datasets, and course materials developed in this project are documented and open-sourced.

 


Last Modified: 09/13/2023
Modified by: Liting Hu

Please report errors in award information by writing to: awardsearch@nsf.gov.

Print this page

Back to Top of page