Award Abstract # 1314921
NeTS: Large: Collaborative Research: HCPN: Hybrid Circuit/Packet Networking

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: UNIVERSITY OF CALIFORNIA, SAN DIEGO
Initial Amendment Date: August 8, 2013
Latest Amendment Date: August 12, 2015
Award Number: 1314921
Award Instrument: Continuing Grant
Program Manager: John Brassil
CNS
 Division Of Computer and Network Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: August 1, 2013
End Date: December 31, 2016 (Estimated)
Total Intended Award Amount: $1,800,002.00
Total Awarded Amount to Date: $1,816,002.00
Funds Obligated to Date: FY 2013 = $599,999.00
FY 2014 = $616,001.00
FY 2015 = $600,002.00
History of Investigator:
  • George Porter (Principal Investigator)
    gmporter@cs.ucsd.edu
  • George Papen (Co-Principal Investigator)
  • Joseph Ford (Co-Principal Investigator)
  • Alex Snoeren (Co-Principal Investigator)
Recipient Sponsored Research Office: University of California-San Diego
9500 GILMAN DR
LA JOLLA
CA  US  92093-0021
(858)534-4896
Sponsor Congressional District: 50
Primary Place of Performance: University of California-San Diego
9500 Gilman Drive, 0934
La Jolla
CA  US  92093-0934
Primary Place of Performance Congressional District: 50
Unique Entity Identifier (UEI): UYTTZT6G9DT1
Parent UEI:
NSF Program(s): Special Projects - CNS,
Networking Technology and Systems
Primary Program Source: 01001314DB NSF RESEARCH & RELATED ACTIVIT
01001415DB NSF RESEARCH & RELATED ACTIVIT
01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7363, 7925, 9178, 9251
Program Element Code(s): 171400, 736300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Ever-larger data centers are powering the cloud computing revolution, but the scale of these installations is currently limited by the ability to provide sufficient internal network connectivity. Delivering scalable packet-switched interconnects that can support the continually increasing data rates required between hundreds of thousands of servers is an extremely challenging problem that is only getting harder. This project leverages microsecond optical circuit-switch technology to develop a hybrid switching paradigm that spans the gap between traditional circuit switching and full-fledged packet switching, achieving a level of performance and scale not previously attainable. The result will be a hybrid switch whose optical switching capacity is orders of magnitude larger than that of an electrical packet switch, yet whose end-to-end performance is largely indistinguishable from that of a giant (electrical) packet switch.

The research provides a quantitative baseline for hybrid network design across a wide range of present and future technologies. The project will consist of five parts: i) traffic characterization to identify the class of network traffic that a circuit switch can support as well as the partitioning of the traffic between the circuit and packet portions of the network; ii) circuit scheduling to enable the circuit switch to rapidly multiplex a set of circuits across a large set of data center traffic flows; iii) traffic conditioning to reduce the variability of traffic at the end hosts, easing the demands placed on switch scheduling; iv) a prototype hybrid network that can use an optical circuit switch that operates three orders of magnitude faster than existing solutions; and v) a trend analysis to understand the tradeoffs resulting from potential future technology advances.

The work stands to dramatically improve data center networks, significantly reducing operating costs and increasing energy efficiency. The research material will be incorporated into courses, helping to train the next generation of computer networking scientists and engineers. The PIs will also continue ongoing outreach to high school students, both through the UCSD COSMOS summer program and through talks delivered at local high schools.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


He Liu, Matthew K. Mukerjee, Conglong Li, Nicolas Feltman, George Papen, Stefan Savage, Srinivasan Seshan, Geoffrey M. Voelker, David G. Andersen, Michael Kaminsky, George Porter, and Alex C. Snoeren, "Scheduling Techniques for Hybrid Circuit/Packet Networks," Proceedings of ACM CoNEXT, 2015
Michael Conley, Amin Vahdat, and George Porter, "Achieving Cost-efficient, Data-intensive Computing in the Cloud," Proceedings of the ACM Symposium on Cloud Computing (SoCC), 2015, p. 1
Pramod Subba Rao and George Porter, "Is memory disaggregation feasible? A case study with Spark SQL," Proceedings of the ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS), 2016
William M. Mellette, Alex C. Snoeren, and George Porter, "P-FatTree: A Multi-channel Datacenter Network Topology," Proceedings of the 15th ACM Workshop on Hot Topics in Networks (HotNets-XV), 2016, doi:10.1145/3005745.3005746
W. M. Mellette and J. E. Ford, "Scaling Limits of MEMS Beam-Steering Switches for Data Center Networks," Journal of Lightwave Technology, v.33, 2015, p. 3308, doi:10.1109/JLT.2015.2431231
W. M. Mellette, G. M. Schuster, G. Porter, G. Papen, and J. E. Ford, "A Scalable, Partially Reconfigurable Optical Switch for Data Center Networks," IEEE Journal of Lightwave Technology, v.35, 2017, p. 136
Yeshaiahu Fainman, Andrew Grieco, George Porter, and Jordan Davis, "Nanophotonic Devices and Circuits for Communications," Proceedings of the Third Annual International Conference on Nanoscale Computing and Communication (ACM NanoCom'16), 2016, doi:10.1145/2967446.2967467

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Today's large-scale "cloud" services are incredibly power hungry, using enough energy to power every household in New York City twice over, and that consumption is expected to double in the next ten years [1].  According to the NRDC, "In 2013, U.S. data centers consumed an estimated 91 billion kilowatt-hours of electricity, equivalent to the annual output of 34 large (500-megawatt) coal-fired power plants. Data center electricity consumption is projected to increase to roughly 140 billion kilowatt-hours annually by 2020, the equivalent annual output of 50 power plants, costing American businesses $13 billion annually in electricity bills and emitting nearly 100 million metric tons of carbon pollution per year.[2]"  The enormous energy demands of these datacenters limit their growth potential and result in unnecessary operational expense.  The good news is that there are many improvements that can be made in datacenters to raise their efficiency.  In some cases, commercial providers run their systems at only 10-20% utilization, due in large part to the inability of the underlying network to scale to meet bandwidth requirements.

Fundamentally, the packet-switching technology underlying current data-center interconnects limits their ability to scale: implementing control- and data-planes necessary to forward packets individually is costly at present, and will rapidly cease to be feasible as link data rates continue to increase.

The overall goal of this project has been to leverage microsecond optical circuit-switch technology to develop a hybrid switching paradigm that spans the gap between traditional circuit switching and full-fledged packet switching, achieving a level of performance and scale not previously attainable.  We have designed and built a hybrid switch whose optical switching capacity is orders of magnitude larger than that of an electrical packet switch, yet whose end-to-end performance is largely indistinguishable from that of a giant (electrical) packet switch.  A key aspect of this project has been demonstrating a system-level control plane for hybrid networks capable of leveraging both circuit and packet switching.

Our work has resulted in a characterization of commercial datacenter workloads, in particular multiple clusters at Facebook.  Our findings have been used by this project as well as others to design an interconnect supporting commercial workloads.  We have explored circuit switching in the context of a novel, non-crossbar "Selector Switch" architecture.  A Selector Switch switches entire groups of input ports between disjoint network matchings, rather than switching traffic from one input port to a single output port.  We have studied the fundamental scaling limitations of MEMS-based optical switches, and found that they scale as a function of resolvable states, rather than ports.  This means that by selecting among whole matchings, rather than individual input/output port pairings, we can build a Selector Switch that scales to thousands of ports with existing technology, without requiring expensive optical amplification or telecom-grade transceivers.  We have designed and built a 61-port Selector Switch prototype.
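The selector-switch idea can be illustrated with a short Python sketch (our illustrative model, not the optical hardware; the names are ours, and a simple rotation schedule stands in for the switch's actual matchings):

```python
def rotation_matchings(n):
    """The n-1 cyclic-shift matchings of n ports: matching k sends input i
    to output (i + k) mod n.  Taken together they connect every input to
    every other output, so a schedule that cycles through them can serve
    all-to-all traffic."""
    return [{i: (i + k) % n for i in range(n)} for k in range(1, n)]

class SelectorSwitch:
    """Toy selector switch: the only control input chooses which whole
    matching is active, so the number of switch states equals the number
    of matchings rather than the number of per-port positions a crossbar
    must resolve."""

    def __init__(self, matchings):
        self.matchings = matchings
        self.state = 0

    def select(self, state):
        # Reconfigure the entire switch by picking one matching.
        self.state = state

    def output_for(self, in_port):
        return self.matchings[self.state][in_port]
```

The point of the sketch is the control interface: a resolvable-state-limited MEMS device only needs as many distinguishable positions as there are matchings, which is why selecting among matchings scales to far higher port counts than resolving each input/output pair independently.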

Paired with a Selector Switch-based architecture, we have demonstrated how to build a Top-of-Rack switch that supports hybrid packet/circuit networks.  The resulting prototype, REACToR, relies on a trio of mechanisms to support hybrid networks.  First, it explicitly "pulls" packets from attached servers based on foreknowledge of which circuits will be established at a given time.  This eliminates the need to maintain large packet buffers in the ToR.  Second, REACToR classifies packets based on a centralized circuit schedule, directing low-latency packets to a separate packet-switched network to reduce overall latency.  Finally, REACToR plays a key role in ensuring datacenter-wide synchronization.  We have designed and built a 16-port FPGA-based REACToR prototype.
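A toy model of this dispatch logic, in Python with hypothetical names and an assumed size threshold for "low-latency" traffic (the real prototype runs in an FPGA and is driven by the centralized circuit schedule):

```python
from collections import deque

class HybridToR:
    """Toy model of hybrid circuit/packet dispatch at a ToR.  Packets for
    a rack whose circuit is currently up are "pulled" from per-destination
    host queues onto the circuit; small, latency-sensitive packets go to
    the packet-switched network rather than waiting for a circuit."""

    LATENCY_THRESHOLD = 128  # bytes; assumed cutoff for "low-latency" traffic

    def __init__(self, num_racks):
        self.host_queues = [deque() for _ in range(num_racks)]

    def enqueue(self, dst_rack, size):
        self.host_queues[dst_rack].append(size)

    def dispatch(self, circuit_up_to):
        """circuit_up_to: rack the circuit currently reaches (or None)."""
        to_circuit, to_packet = [], []
        for rack, q in enumerate(self.host_queues):
            while q:
                size = q.popleft()
                if size <= self.LATENCY_THRESHOLD:
                    to_packet.append((rack, size))   # low-latency path
                elif rack == circuit_up_to:
                    to_circuit.append((rack, size))  # pulled onto the circuit
                else:
                    q.appendleft(size)               # head-of-line: wait for a circuit
                    break
        return to_circuit, to_packet
```

Because bulk traffic is only released when its circuit is up, the model needs no large buffers of its own: packets stay in the per-destination host queues until pulled.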

We have designed two centralized circuit-switch scheduling algorithms, Solstice and Albedo.  Solstice operates by processing an input demand matrix, representing a snapshot of the overall datacenter workload demand.  The Solstice algorithm is a computationally tractable heuristic for maximizing the bandwidth through the network, performing close to intractable optimal solutions.  We next designed Albedo, which relaxes Solstice's assumptions by allowing traffic to be indirected, that is, sent through intermediate nodes.  By allowing some traffic to travel over what would otherwise be unused paths, Albedo is able to increase the total amount of traffic sent through the network as compared to Solstice.  We have implemented both Solstice and Albedo in software.
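The flavor of demand-matrix scheduling can be conveyed with a much-simplified greedy decomposition (our sketch, not the published Solstice algorithm): repeatedly form a matching from the heaviest remaining entries, hold it long enough to drain its lightest circuit, and leave any residue to the packet switch.

```python
def greedy_circuit_schedule(demand, min_duration=1, max_configs=10):
    """Greedily decompose an n-by-n demand matrix into circuit
    configurations.  Each configuration is a matching (one circuit per
    input and output port) held for some duration; residual demand is
    left for the packet switch.  Illustrative only."""
    demand = [row[:] for row in demand]  # work on a copy
    n = len(demand)
    schedule = []
    for _ in range(max_configs):
        # Greedy max-weight matching: take the largest entries first.
        cells = sorted(((demand[i][j], i, j)
                        for i in range(n) for j in range(n)), reverse=True)
        matching, used_in, used_out = {}, set(), set()
        for d, i, j in cells:
            if d > 0 and i not in used_in and j not in used_out:
                matching[i] = j
                used_in.add(i)
                used_out.add(j)
        if not matching:
            break
        # Hold the configuration long enough to drain its lightest circuit,
        # but never shorter than the switch's minimum reconfiguration time.
        duration = max(min_duration,
                       min(demand[i][j] for i, j in matching.items()))
        for i, j in matching.items():
            demand[i][j] = max(0, demand[i][j] - duration)
        schedule.append((matching, duration))
    return schedule, demand  # residual demand goes to the packet switch
```

The `min_duration` floor captures the key tension in hybrid scheduling: circuits that are held too briefly waste reconfiguration time, while circuits held too long starve other flows, which is what the real algorithms are designed to balance.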

The net result of this project is a new strategy for supporting high bandwidth networks in the future, sidestepping the impending limitations of all-packet-switched networks.  By switching some datacenter traffic optically, future datacenters will be able to support higher bandwidths, which will result in faster computations, fewer servers waiting for the data they need (and thus less server energy usage), and higher overall efficiency.  The result of higher datacenter efficiency is a dramatic reduction in overall US energy usage (since datacenters account for 2.5% of US energy usage), and the removal of a barrier that companies encounter today, namely growing their networks to meet their customer demands.

[1] https://energy.gov/eere/buildings/data-centers-and-servers

[2] https://www.nrdc.org/resources/americas-data-centers-consuming-and-wasting-growing-amounts-energy


Last Modified: 04/23/2017
Modified by: George Porter
