Award Abstract # 1652698
CAREER: Stateless Network Functions: Building a Better Network Through Disaggregation

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: THE REGENTS OF THE UNIVERSITY OF COLORADO
Initial Amendment Date: April 14, 2017
Latest Amendment Date: July 19, 2021
Award Number: 1652698
Award Instrument: Continuing Grant
Program Manager: Deepankar Medhi
dmedhi@nsf.gov
(703)292-2935
CNS Division Of Computer and Network Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: July 1, 2017
End Date: June 30, 2023 (Estimated)
Total Intended Award Amount: $627,999.00
Total Awarded Amount to Date: $627,999.00
Funds Obligated to Date: FY 2017 = $143,911.00
FY 2018 = $129,014.00
FY 2019 = $133,713.00
FY 2020 = $114,716.00
FY 2021 = $106,645.00
History of Investigator:
  • Eric Keller (Principal Investigator)
    eric.keller@colorado.edu
Recipient Sponsored Research Office: University of Colorado at Boulder
3100 MARINE ST
Boulder
CO  US  80309-0001
(303)492-6221
Sponsor Congressional District: 02
Primary Place of Performance: University of Colorado at Boulder
425 UCB
Boulder
CO  US  80309-0425
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): SPVKK1RC2MZ3
Parent UEI:
NSF Program(s): Networking Technology and Syst
Primary Program Source: 01001718DB NSF RESEARCH & RELATED ACTIVIT
01001819DB NSF RESEARCH & RELATED ACTIVIT
01001920DB NSF RESEARCH & RELATED ACTIVIT
01002021DB NSF RESEARCH & RELATED ACTIVIT
01002122DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045
Program Element Code(s): 736300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

To improve performance, security, and reliability, network practitioners have moved away from the principle of a stateless network and added stateful processing to devices such as internet firewalls, load balancers, and intrusion detection systems. As a result, networks have become increasingly complex and brittle. The research objective of this proposal is to provide the foundation for a transformative network architecture based on disaggregated virtual network functions. Developing this capability will improve the performance and operation of virtualized computing systems, including compute clouds, and ultimately make US information technology capabilities more competitive.

This project will introduce the systems and algorithms needed to make a disaggregated network function architecture possible. It leverages recent advances in low-latency distributed data stores, together with the unique properties of network processing that can be used to optimize the interface between processing and state. Specifically, this proposal will: 1) develop the algorithmic and system underpinnings that overcome the challenges in achieving the needed performance in the face of added latency, overhead in accessing state, and concurrent execution; and 2) create novel network management capabilities that leverage disaggregated network functions to realize a network function infrastructure that is efficient and robust to load changes, component failures, and software or configuration updates.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 21)
Abranches, M and Goodarzy, S and Nazari, M and Mishra, S and Keller, E "Shimmy: Shared Memory Channels for High Performance Inter-Container Communication" USENIX Workshop on Hot Topics in Edge Computing (HotEdge), 2019
Abranches, Marcelo and Keller, Eric "A Userspace Transport Stack Doesn't Have to Mean Losing Linux Processing" IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN), 2020
Abranches, Marcelo and Michel, Oliver and Keller, Eric "Getting back what was lost in the era of high-speed software packet processing", 2022 https://doi.org/10.1145/3563766.3564114
Abranches, Marcelo and Olson, Karl and Keller, Eric "Infinity: A Scalable Infrastructure for In-Network Applications" IFIP/IEEE International workshop on Fully-Flexible Internet Architectures and Protocols for the Next-Generation Tactile Internet (FlexNGIA), 2021
Alsudais, Azzam and Huang, Zhe and Balasubramanian, Bharath and Narayanan, Shankaranarayanan Puzhavakath and Keller, Eric and Joshi, Kaustubh "NodeFinder: Scalable Search over Highly Dynamic Geo-distributed State" USENIX Workshop on Hot Topics in Cloud Computing (HotCloud), 2018
Alsudais, Azzam and Narayanan, Shankaranarayanan Puzhavakath and Balasubramanian, Bharath and Huang, Zhe and Keller, Eric "StepNet: A Compositional Framework with Reduced Querying for Homing Complex Network Services" IFIP/IEEE International Symposium on Integrated Network Management (IM), 2021
Azzam Alsudais, Mohammad Hashemi "FOCUS: Scalable Search Over Highly Dynamic Geo-distributed State" IEEE International Conference on Distributed Computing Systems (ICDCS), 2019 https://doi.org/10.1109/ICDCS.2019.00210
Caldwell, Blake and Im, Youngbin and Goodarzy, Sepideh and Ha, Sangtae and Han, Richard and Keller, Eric and Rozner, Eric "FluidMem: Full Flexible and Fast Memory Disaggregation for the Cloud" IEEE International Conference on Distributed Computing Systems (ICDCS), 2020
Cusack, Greg and Michel, Oliver and Keller, Eric "Machine Learning-Based Detection of Ransomware Using SDN" ACM Workshop on Security in Software Defined Networks & Network Function Virtualization (SDN-NFV Security), 2018
Goodarzy, Sepideh and Nazari, Maziyar and Han, Richard and Keller, Eric and Rozner, Eric "Resource Management in Cloud Computing Using Machine Learning: A Survey" IEEE International Conference On Machine Learning And Applications (ICMLA), 2020
Greg Cusack, Maziyar Nazari "Escra: Event-driven, Sub-second Container Resource Allocation" IEEE International Conference on Distributed Computing Systems, 2022

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The high-level goal of this project is to provide the foundation for a transformative network architecture based on disaggregated network functions. This new architecture breaks the underlying assumption that state needs to be tightly coupled to a specific device. Instead, state is maintained separately, and network functions can access it from anywhere and at any time through a well-defined interface, creating a highly flexible network. Throughout this project, we built out the systems around this fundamentally new architecture. We highlight two specific areas here, homing and container auto-scaling, but note that the project also produced advances in network analytics, accelerated network packet processing, and memory disaggregation.
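As a rough illustration of what decoupling state from the device can look like, the sketch below shows a toy firewall that keeps no per-flow state of its own and instead reads and writes connection state through a small store interface. The names `StateStore`, `StatelessFirewall`, and the `"ESTABLISHED"` value are hypothetical and chosen for this example only; the in-memory dictionary stands in for a remote low-latency data store and is not the project's actual implementation.

```python
from typing import Dict, Optional, Tuple

# (src_ip, src_port, dst_ip, dst_port); a simplified flow identifier.
FlowKey = Tuple[str, int, str, int]


class StateStore:
    """Hypothetical stand-in for a remote low-latency data store."""

    def __init__(self) -> None:
        self._data: Dict[FlowKey, str] = {}

    def get(self, key: FlowKey) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: FlowKey, value: str) -> None:
        self._data[key] = value


class StatelessFirewall:
    """Holds no per-flow state locally; every lookup goes to the store."""

    def __init__(self, store: StateStore) -> None:
        self.store = store

    def process(self, key: FlowKey, outbound: bool) -> bool:
        """Return True if the packet is allowed."""
        if outbound:
            # Record the connection so any instance can admit the reply.
            self.store.put(key, "ESTABLISHED")
            return True
        # Inbound packets are allowed only if the reverse flow is known.
        reverse = (key[2], key[3], key[0], key[1])
        return self.store.get(reverse) == "ESTABLISHED"


# Two firewall instances share one store, so either can handle any packet.
store = StateStore()
fw_a = StatelessFirewall(store)
fw_b = StatelessFirewall(store)

request = ("10.0.0.5", 40000, "93.184.216.34", 443)
reply = ("93.184.216.34", 443, "10.0.0.5", 40000)
unsolicited = ("198.51.100.7", 1234, "10.0.0.5", 22)

allowed_out = fw_a.process(request, outbound=True)
allowed_reply = fw_b.process(reply, outbound=False)      # different instance
allowed_unsolicited = fw_b.process(unsolicited, outbound=False)
```

Because the reply is admitted by `fw_b` even though `fw_a` saw the original request, an instance can fail or be added without losing connection state, which is the flexibility the paragraph above describes.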

Homing: Homing, or placement, of virtual network functions on cloud infrastructures is a crucial step in the orchestration of network services, involving complex interactions with the cloud, SDN, and service controllers. Traditionally, homing involves a laborious off-line process in which Network Service Providers (NSPs) hand-craft service-specific homing heuristics and pre-provision resources based on expected service load. What is becoming apparent after years of experience in production, however, is that this approach is rigid and relies on excessively querying controllers for their state. We introduced StepNet and StepNet+, a novel compositional framework in which homing instances of a service can be described through a declarative template, enabling service designers to create new homing requests with considerable ease, together with an optimization framework that significantly reduces the load placed on resource controllers.
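To make the idea of a declarative homing template concrete, here is a minimal sketch. It is not StepNet's algorithm or template syntax: the `TEMPLATE` fields, the `Site` type, and the `home` function are all hypothetical. It only illustrates one plausible way to reduce controller load, by filtering on locally known attributes first and querying a controller only for the sites that survive.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Site:
    name: str
    region: str
    has_sriov: bool


# Hypothetical declarative template: states what the service needs,
# not how to search for it.
TEMPLATE = {"region": "us-west", "needs_sriov": True, "min_free_vcpus": 8}


def home(
    template: Dict[str, object],
    sites: List[Site],
    query_free_vcpus: Callable[[str], int],
) -> Tuple[List[str], int]:
    """Return (matching site names, number of controller queries issued)."""
    queries = 0
    matches: List[str] = []
    for site in sites:
        # Cheap checks on locally known attributes come first.
        if site.region != template["region"]:
            continue
        if template["needs_sriov"] and not site.has_sriov:
            continue
        # Only surviving candidates cost a (simulated) controller query.
        queries += 1
        if query_free_vcpus(site.name) >= template["min_free_vcpus"]:
            matches.append(site.name)
    return matches, queries


SITES = [
    Site("A", "us-west", True),
    Site("B", "us-west", False),
    Site("C", "us-east", True),
    Site("D", "us-west", True),
]
FREE_VCPUS = {"A": 16, "B": 32, "C": 64, "D": 4}  # simulated controller state

matches, queries = home(TEMPLATE, SITES, lambda name: FREE_VCPUS[name])
```

In this toy run only two of the four sites are queried, and site `A` is the single match. A new service would supply a different template rather than a new hand-crafted heuristic.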

Container Scaling: With the emergence of containerization of virtual network functions, the automated deployment, scaling, management, and failure handling of network function clusters become important issues. Recent works set container CPU and memory limits by automatically scaling containers based on resource usage. However, these systems are heavyweight and run on coarse-grained time scales, resulting in poor performance when predictions are incorrect. We introduced Escra, a container orchestrator that enables fine-grained, event-based resource allocation for a single container and distributed resource allocation to manage a collection of containers. We found that resource allocation can easily adapt at sub-second intervals within and across hosts, meaning operators can cost-effectively scale resources without a performance penalty. We showed Escra to be effective by comparing its slack, out-of-memory (OOM), and throttle performance with recently proposed systems. The central controller adds minimal overhead, while Escra reduces the throttle rate by anywhere from 1.8x to 7.4x and the OOM rate by 100% over the state-of-the-art container orchestrator. We achieve these low throttle and OOM rates while also reducing the 70% CPU and memory slack by over 2.5x and 2.3x, respectively.
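The relationship between slack, throttling, and event-driven allocation can be sketched as follows. This is a simplified illustration under stated assumptions, not Escra's implementation: the `Container` and `Controller` classes, the fixed `step` grant, and the `margin` left after reclaiming are all hypothetical. Slack is taken here to mean the gap between a container's limit and its actual usage.

```python
from dataclasses import dataclass


@dataclass
class Container:
    name: str
    cpu_limit: float        # cores the container may use
    cpu_usage: float = 0.0  # cores it is actually using

    @property
    def slack(self) -> float:
        """Allocated but unused CPU."""
        return max(self.cpu_limit - self.cpu_usage, 0.0)


class Controller:
    """Hypothetical central controller managing a shared pool of cores."""

    def __init__(self, pool: float, step: float = 0.5) -> None:
        self.pool = pool  # unallocated cores
        self.step = step  # cores granted per throttle event

    def on_throttle(self, c: Container) -> float:
        """React to a throttle event by raising the limit from the pool."""
        grant = min(self.step, self.pool)
        self.pool -= grant
        c.cpu_limit += grant
        return grant

    def reclaim(self, c: Container, margin: float = 0.25) -> float:
        """Shrink an over-provisioned limit to usage plus a small margin."""
        freed = max(c.cpu_limit - (c.cpu_usage + margin), 0.0)
        c.cpu_limit -= freed
        self.pool += freed
        return freed


ctrl = Controller(pool=1.0)
busy = Container("busy", cpu_limit=1.0, cpu_usage=1.0)  # at its limit
idle = Container("idle", cpu_limit=2.0, cpu_usage=0.5)  # large slack

freed = ctrl.reclaim(idle)        # slack returns to the shared pool
granted = ctrl.on_throttle(busy)  # throttle event triggers an immediate grant
```

Reacting to individual throttle events, rather than waiting for a periodic usage prediction, is what lets limits stay close to usage (low slack) without starving a container that suddenly needs more.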

In summary, this project successfully developed foundational techniques to enable a disaggregated architecture for network functions, and developed novel management capabilities that leverage disaggregation to create a more robust and easier-to-manage network infrastructure.


Last Modified: 10/30/2023
Modified by: Eric Keller

Please report errors in award information by writing to: awardsearch@nsf.gov.
