
NSF Org: CNS Division of Computer and Network Systems
Recipient:
Initial Amendment Date: April 14, 2017
Latest Amendment Date: July 19, 2021
Award Number: 1652698
Award Instrument: Continuing Grant
Program Manager: Deepankar Medhi, dmedhi@nsf.gov, (703) 292-2935, CNS Division of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering
Start Date: July 1, 2017
End Date: June 30, 2023 (Estimated)
Total Intended Award Amount: $627,999.00
Total Awarded Amount to Date: $627,999.00
Funds Obligated to Date: FY 2018 = $129,014.00; FY 2019 = $133,713.00; FY 2020 = $114,716.00; FY 2021 = $106,645.00
History of Investigator:
Recipient Sponsored Research Office: 3100 MARINE ST, Boulder, CO, US 80309-0001, (303) 492-6221
Sponsor Congressional District:
Primary Place of Performance: 425 UCB, Boulder, CO, US 80309-0425
Primary Place of Performance Congressional District:
Unique Entity Identifier (UEI):
Parent UEI:
NSF Program(s): Networking Technology and Syst
Primary Program Source: 01001819DB NSF RESEARCH & RELATED ACTIVIT; 01001920DB NSF RESEARCH & RELATED ACTIVIT; 01002021DB NSF RESEARCH & RELATED ACTIVIT; 01002122DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):
Program Element Code(s):
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
To improve performance, security, and reliability, network practitioners have moved away from the principle of a stateless network and added stateful processing to devices such as internet firewalls, load balancers, and intrusion detection systems. In doing so, networks have become increasingly complex and brittle. The research objective of this proposal is to provide the foundation for a transformative network architecture based on disaggregated virtual network functions. Developing this capability will improve the performance and operation of virtualized computing systems, including compute clouds, and ultimately make US information technology capabilities more competitive.
This project will introduce new systems and algorithms to make a disaggregated network function architecture possible, leveraging recent advances in distributed systems, such as low-latency data stores, and the unique properties of network processing that can be used to optimize the interface between processing and state. Specifically, this proposal will: 1) develop the algorithmic and system underpinnings needed to overcome the performance challenges posed by added latency, overhead in accessing state, and concurrent execution; and 2) create novel network management capabilities that leverage disaggregated network functions to realize a network function infrastructure that is efficient and robust to load changes, component failures, and software or configuration updates.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The high-level goal of this project is to provide the foundation for a transformative network architecture based on disaggregated network functions. This new architecture breaks the underlying assumption that state needs to be tightly coupled to a specific device, and instead proposes that state is maintained separately and that network functions can access it from anywhere, at any time, through a well-defined interface, creating a highly flexible network. Throughout this project, we built out the systems around this fundamentally new architecture. We highlight two specific areas here, homing and container auto-scaling, but also note advances in network analytics, accelerated network packet processing, and memory disaggregation.
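To make the decoupling concrete, the following is a minimal, illustrative sketch (not the project's actual code or API) of a NAT-style network function whose per-flow state lives behind a narrow get/put interface to an external store; every name in it (StateStore, FlowKey, handle_packet) is hypothetical.

```python
# Illustrative sketch only: per-flow state is kept in an external store behind a
# narrow get/put interface, so any worker instance can process any packet.

from dataclasses import dataclass


@dataclass(frozen=True)
class FlowKey:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int


class StateStore:
    """Stand-in for a shared low-latency key-value store; a dict is used here purely for illustration."""

    def __init__(self):
        self._kv = {}

    def get(self, key):
        return self._kv.get(key)

    def put(self, key, value):
        self._kv[key] = value


def handle_packet(store: StateStore, key: FlowKey, next_public_port):
    """Stateless worker: all per-flow state is fetched from / written to the store."""
    mapping = store.get(key)
    if mapping is None:                       # first packet of the flow
        mapping = {"public_port": next_public_port()}
        store.put(key, mapping)
    return mapping["public_port"]             # rewrite decision derived from shared state


store = StateStore()
ports = iter(range(10000, 20000))
key = FlowKey("10.0.0.1", "93.184.216.34", 5555, 80, 6)
print(handle_packet(store, key, lambda: next(ports)))   # mapping created
print(handle_packet(store, key, lambda: next(ports)))   # same mapping, read from shared state
```

Because the flow mapping lives in the shared store rather than inside one appliance, a second worker could handle the next packet of the same flow, which is what makes scaling, failover, and updates easier to manage.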
Homing: Homing, or placement, of virtual network functions on cloud infrastructures is a crucial step in the orchestration of network services, involving complex interactions with the cloud, SDN, and service controllers. Traditionally, homing involves a laborious off-line process in which Network Service Providers (NSPs) hand-craft service-specific homing heuristics and pre-provision resources based on expected service load. What has become apparent after years of experience in production, however, is that this leads to a rigid process that relies on excessively querying controllers for their state. We introduced StepNet and StepNet+, a novel compositional framework in which homing instances of a service are described through a declarative template, enabling service designers to create new homing requests with considerable ease, together with an optimization framework that significantly reduces the load placed on resource controllers.
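As a rough illustration of a declarative homing template driving a placement pass, here is a hypothetical sketch; the fields, site snapshot, and greedy solver below are invented for illustration and do not reflect StepNet's actual template language or optimization framework.

```python
# Hypothetical homing template: what to place and under which constraints.
service_template = {
    "components": ["vFW", "vLB"],                 # virtual network functions to place
    "constraints": {
        "vFW": {"min_cpu": 8, "max_latency_ms": 10},
        "vLB": {"min_cpu": 4, "max_latency_ms": 20},
    },
}

# Candidate sites, as a cached snapshot of controller state.
sites = [
    {"name": "dc-east", "free_cpu": 32, "latency_ms": 8},
    {"name": "dc-west", "free_cpu": 16, "latency_ms": 18},
]


def home(template, sites):
    """Greedy first-fit placement honoring per-component constraints."""
    placement = {}
    for comp in template["components"]:
        req = template["constraints"][comp]
        for site in sites:
            if site["free_cpu"] >= req["min_cpu"] and site["latency_ms"] <= req["max_latency_ms"]:
                placement[comp] = site["name"]
                site["free_cpu"] -= req["min_cpu"]   # reserve locally; reconcile with controllers later
                break
    return placement


print(home(service_template, sites))   # e.g. {'vFW': 'dc-east', 'vLB': 'dc-east'}
```

In this toy version, placement runs against a cached snapshot of site state rather than querying the cloud and SDN controllers for every request, gesturing at how an optimization framework can reduce the load placed on resource controllers.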
Container Scaling: With the emergence of containerized virtual network functions, the automated deployment, scaling, management, and failure handling of network function clusters become important issues. Recent works set container CPU and memory limits by automatically scaling containers based on resource usage. However, these systems are heavyweight and run on coarse-grained time scales, resulting in poor performance when their predictions are incorrect. We introduced Escra, a container orchestrator that enables fine-grained, event-based resource allocation for a single container and distributed resource allocation to manage a collection of containers. We found that resource allocation can adapt at sub-second intervals within and across hosts, meaning operators can cost-effectively scale resources without a performance penalty. We showed Escra to be effective by comparing its slack, out-of-memory (OOM), and throttle performance with recently proposed systems. The overhead of our central controller is minimal, while Escra reduces the throttle rate by 1.8x to 7.4x and the OOM rate by 100% relative to a state-of-the-art container orchestrator. We achieve these low throttle and OOM rates while also reducing 70th-percentile CPU and memory slack by over 2.5x and 2.3x, respectively.
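The event-driven allocation idea can be sketched roughly as follows. The real system reacts to kernel/cgroup signals, whereas everything here (class names, event kinds, resize increments) is hypothetical and greatly simplified.

```python
# Toy sketch of event-driven container resource allocation: a central allocator
# resizes individual containers out of a shared budget as soon as an event arrives,
# instead of waiting for a coarse-grained periodic poll.

from dataclasses import dataclass, field


@dataclass
class Container:
    name: str
    cpu_limit: float    # cores
    mem_limit: int      # MiB


@dataclass
class Controller:
    """Central allocator managing a collection of containers against a shared budget."""
    cpu_budget: float
    mem_budget: int
    containers: dict = field(default_factory=dict)

    def register(self, c: Container):
        self.containers[c.name] = c
        self.cpu_budget -= c.cpu_limit
        self.mem_budget -= c.mem_limit

    def on_event(self, name: str, kind: str):
        """React to a per-container event the moment it occurs."""
        c = self.containers[name]
        if kind == "cpu_throttle" and self.cpu_budget >= 0.5:
            c.cpu_limit += 0.5          # grow on the spot, sub-second
            self.cpu_budget -= 0.5
        elif kind == "mem_pressure" and self.mem_budget >= 128:
            c.mem_limit += 128          # grow before the OOM killer fires
            self.mem_budget -= 128
        elif kind == "cpu_idle" and c.cpu_limit >= 0.75:
            c.cpu_limit -= 0.25         # reclaim slack into the shared pool
            self.cpu_budget += 0.25


ctl = Controller(cpu_budget=16.0, mem_budget=32768)
ctl.register(Container("web", cpu_limit=1.0, mem_limit=512))
ctl.on_event("web", "cpu_throttle")     # CPU limit grows to 1.5 cores immediately
ctl.on_event("web", "mem_pressure")     # memory limit grows to 640 MiB
print(ctl.containers["web"])
```

Reacting to events per container, rather than predicting usage over coarse intervals, is what lets limits track demand closely enough to cut both throttling and slack at the same time.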
In summary, this project successfully developed foundational techniques to enable a disaggregated architecture for network functions, along with novel management capabilities that leverage disaggregation to create a more robust and easier-to-manage network infrastructure.
Last Modified: 10/30/2023
Modified by: Eric Keller