Award Abstract # 1117161
NeTS: Small: New Directions in Routing and Traffic Engineering

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: INTERNATIONAL COMPUTER SCIENCE INSTITUTE
Initial Amendment Date: August 19, 2011
Latest Amendment Date: August 19, 2011
Award Number: 1117161
Award Instrument: Standard Grant
Program Manager: Darleen Fisher
CNS
 Division Of Computer and Network Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2011
End Date: August 31, 2014 (Estimated)
Total Intended Award Amount: $300,000.00
Total Awarded Amount to Date: $300,000.00
Funds Obligated to Date: FY 2011 = $300,000.00
History of Investigator:
  • Scott Shenker (Principal Investigator)
    shenker@icsi.berkeley.edu
Recipient Sponsored Research Office: International Computer Science Institute
2150 SHATTUCK AVE
BERKELEY
CA  US  94704-1345
(510)666-2900
Sponsor Congressional District: 12
Primary Place of Performance: International Computer Science Institute
2150 SHATTUCK AVE
BERKELEY
CA  US  94704-1345
Primary Place of Performance
Congressional District:
12
Unique Entity Identifier (UEI): GSRMP1QCXU74
Parent UEI:
NSF Program(s): Networking Technology and Syst
Primary Program Source: 01001112DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7923
Program Element Code(s): 736300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Routing is arguably the most fundamental aspect of networking, since it answers the basic question: how do you place the appropriate state in routers or switches so that packets can travel from source to destination? There is a huge literature on routing, covering many topics (intradomain and interdomain, wireline and wireless, convergence and policy oscillations, etc.), and the router vendors have been honing their routing implementations for many years. After all this academic and commercial work, one might expect that there would be little new of fundamental importance to say about routing, and that all current work would involve small, incremental improvements to algorithms and implementations.

However, there are two conflicting trends that are changing the context in which routing is being used, and which necessitate a new round of routing research:

Reliability requirements: Because networks are being increasingly used for critical services (hospitals, financial institutions, etc.), the reliability expectations for networks are becoming more stringent i.e., 'five nines' of reliability). Routing is responsible for directing traffic around failures (i.e., failure recovery) and avoiding hotspots (i.e., load distribution), so the required increases in reliability must come from improving the failure recovery and load distribution mechanisms embedded in routing protocols.

Network size: Networks are growing at a rapid pace, and a new class of networks - datacenters, which can have hundreds of thousands of hosts and millions of VMs - are pushing the scaling limits as never before. In all routing algorithms, because they are essentially distributed consistency algorithms, the convergence times (for responding to failures and hotspots) and/or the routing overhead (in terms of the number and size of routing messages) increase with size.

The upshot of these two developments is that routing algorithms are being asked to do a better job (in terms of reliability) on a harder task (because of the increases in network size and complexity). As a result, both the commercial world and the academic community have embarked on a new round of routing research. These efforts first produced several ad hoc rerouting methods (such as MPLS Fast Reroute and ECMP) and then concentrated on developing multipath routing methods (such as Path Splicing and a variety of other approaches). However, all of these developments are retrofitted on top of the traditional approach to routing, which builds a single path from the source to the destination. These mechanisms significantly improve the reliability of networking, but they do not tell us how to incorporate more effective failure recovery and load distribution into the core foundation of routing algorithms.

More recently, we (along with others) have proposed a new routing paradigm, one that changes the basic output of routing from a path to a directed acyclic graph (DAG). This new approach, which here will be called Routing Along DAGs (RAD), automatically provides multiple paths for local failure recovery and load distribution. This allows RAD to, without any global route recomputations in response to failures or hotspots, guarantee connectivity (as long as the graph is connected) and provide optimal load distribution (in a simple single-destination traffic model). This project is investigating the RAD approach from many angles: design, simulation, implementation, and theory. The goal is to have a put this new routing paradigm on a firm scientific footing.

Broader Impacts: There is a pressing commercial and governmental need for increased reliability and easy-to-manage routing algorithms. This proposed project will produce prototypes of new routing and traffic engineering approaches built on commercial routing hardware (using the OpenFlow interface) that could be used to test these ideas in commercial settings. This could have a significant impact on how datacenter networking is done, and more generally improve network reliability. Promising preliminary discussions have already been held with router vendors in this regard.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Koponen, T; Shenker, S; Balakrishnan, H; Feamster, N; Ganichev, I; Ghodsi, A; Godfrey, PB; McKeown, N; Parulkar, G; Raghavan, B; Rexford, J; Arianfar, S; Kuptsov, D "Architecting for Innovation" COMPUTER COMMUNICATION REVIEW , v.41 , 2011 , p.24 View record at Web of Science 10.1145/2002250.200225

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project investigated new directions for routing and traffic engineering in large-scale computer networks. One might think that routing, which is the most basic of networking tasks, would be fully understood by now. However, as in most engineered systems, the current routing solutions were designed to meet the then-current needs. In recent years, as our requirements have changed — to include greater scales (up to hundreds of thousands of nodes), more stringent reliability (requiring faster responses to failures), and complete programmability (to support SDN) — current routing solutions have been found wanting.

Our investigations explored several radically new approaches to routing, including:

  • An entirely new modularity for routing, where connectivity is the responsibility of the data plane and optimal routing is the responsibility of the control plane.
  • A recursive style of routing where a single code-base can be used to compute routes at every level of the hierarchy.
  • The first rigorous analysis of the power of failover routing, which has led to an intriguing conjecture about whether failover routing can withstand up to the min-cut number of failures.
  • Opportunities for using SDN at Interconnection points to provide more flexible interdomain routing.

These developments are not just of academic interest. They have the potential to make our networks much more reliable, scalable, and flexible, which would lead to better Internet service for all of us. Our investigations have opened up these and other avenues for future exploration, and suggest that there is much more to be done in the routing arena.


Last Modified: 12/29/2014
Modified by: Scott Shenker

Please report errors in award information by writing to: awardsearch@nsf.gov.

Print this page

Back to Top of page