Award Abstract # 1528197
NeTS: Small: Video-Aware Network Transport + Network-Aware Video Coding

NSF Org: CNS (Division of Computer and Network Systems)
Recipient: THE LELAND STANFORD JUNIOR UNIVERSITY
Initial Amendment Date: August 17, 2015
Latest Amendment Date: August 17, 2015
Award Number: 1528197
Award Instrument: Standard Grant
Program Manager: Darleen Fisher
  CNS - Division of Computer and Network Systems
  CSE - Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2015
End Date: August 31, 2018 (Estimated)
Total Intended Award Amount: $499,916.00
Total Awarded Amount to Date: $499,916.00
Funds Obligated to Date: FY 2015 = $499,916.00
History of Investigator:
  • Keith Winstein (Principal Investigator)
    keithw@cs.stanford.edu
Recipient Sponsored Research Office: Stanford University
450 JANE STANFORD WAY
STANFORD
CA  US  94305-2004
(650)723-2300
Sponsor Congressional District: 16
Primary Place of Performance: Stanford University
353 Serra Mall
Stanford
CA  US  94305-9025
Primary Place of Performance Congressional District: 16
Unique Entity Identifier (UEI): HJD6G4D6TJY5
Parent UEI:
NSF Program(s): Networking Technology and Systems
Primary Program Source: 01001516DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 7923
Program Element Code(s): 736300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

This project will build new foundational technology for Internet video streaming, by spanning the research areas of video-coding and computer-networking systems. The goal is to get rid of video glitches and stalls, and to build the abstractions to enable a "World Wide Web of video," where anybody can use hyperlinks to reference, quote, excerpt, and edit educational and other videos online.

The project will develop an open-source video-streaming application, to be called Alfalfa. Alfalfa will be similar to traditional systems (e.g. Netflix, YouTube) in that it will fetch and play encoded video from a Web server via HTTP. But unlike these "adaptive-bitrate streaming" systems, Alfalfa will not have a concept of a coded video "bitrate" or "stream" at all. Instead, each frame will be a possible switching point between quality levels, and the player's job will be to plan, at runtime, the best frame-by-frame path through the video that maximizes a quality-of-experience metric. The intention is to make the video encoder as "dumb" as possible and to make most rate-control decisions at playback time.
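The frame-by-frame planning idea can be made concrete with a small sketch. This is a hypothetical model, not the Alfalfa code: it assumes a per-frame quality score for each quality level and a penalty for switching levels, and uses dynamic programming to pick the path through the video that maximizes the total score.

```python
def plan_path(quality, switch_cost=1.0):
    """quality[f][q]: quality-of-experience score of frame f at level q.
    Returns the level sequence maximizing total quality minus a penalty
    for switching levels between consecutive frames (a toy QoE metric)."""
    n_frames, n_levels = len(quality), len(quality[0])
    # best[q] = (score, path) for the best plan ending at level q
    best = [(quality[0][q], [q]) for q in range(n_levels)]
    for f in range(1, n_frames):
        nxt = []
        for q in range(n_levels):
            score, path = max(
                (best[prev][0] - switch_cost * abs(prev - q), best[prev][1])
                for prev in range(n_levels)
            )
            nxt.append((score + quality[f][q], path + [q]))
        best = nxt
    return max(best)[1]
```

With a large switching penalty the planner stays at one level; with a small one it hops levels frame by frame to chase quality, which is the flexibility that per-frame switching points are meant to provide.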

Broader Impacts: Robust video streaming and the "World Wide Web of video" will serve as a direct multiplier for students taking and creating online video courses, especially in regions of the world with poor Internet connectivity. The project plans to collaborate with providers of online courses to test the Alfalfa technology and use it to allow student-driven online video editing in appropriate classes. In addition, the project will include a demonstration component that will stream U.S. broadcast television stations to members of the public, educating the public about video-streaming technology and enlisting participants in an effort to better understand the factors that influence the "quality of experience" of online video.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


Alpernas, Kalev and Flanagan, Cormac and Fouladi, Sadjad and Ryzhyk, Leonid and Sagiv, Mooly and Schmitz, Thomas and Winstein, Keith "Secure serverless computing using dynamic information flow control" Proceedings of the ACM on Programming Languages, v.2, 2018. 10.1145/3276488
Fouladi, Sadjad and Romero, Francisco and Iter, Dan and Li, Qian and Chatterjee, Shuvo and Kozyrakis, Christos and Zaharia, Matei and Winstein, Keith "From Laptop to Lambda: Outsourcing Everyday Jobs to Thousands of Transient Functional Containers" 2019 USENIX Annual Technical Conference (USENIX ATC 19), 2019.
Fouladi, Sadjad and Wahby, Riad S. and Shacklett, Brennan and Balasubramaniam, Karthikeyan Vasuki and Zeng, William and Bhalerao, Rahul and Sivaraman, Anirudh and Porter, George and Winstein, Keith "Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads" 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2017), 2017, p.363. ISBN 978-1-931971-37-9

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project created new foundational building blocks for Internet video, addressing video-coding and networking questions together to build new kinds of video systems. Overall, this project found new and better ways to marry video applications with computer networks, and kicked off a movement of "burst-parallel" serverless computing where massive tasks can be accomplished quickly, by renting tens of thousands of computer processors for a brief instant.

During the three years of this project, we worked in three major areas:

  1. Video encoding, processing, and understanding. Currently, video encoding operations on high-quality videos (e.g. 4K or VR content) are generally slower than real-time, even on a multicore computer. This means that for an hour-long video, it can take hours to experiment with a single change. We designed and built "ExCamera," an open-source system that can process videos (compressing them, transforming them, and scanning them with neural networks) many times faster than real time. The goal is to achieve the kind of interactivity and sharability for videos that systems like Google Docs have brought to word-processing documents, spreadsheets, and presentations.

    To achieve this, we had to build a new kind of video encoder software, using a technique called functional programming. We were able to parallelize (break up into little pieces) the task of video encoding roughly 15x more finely than previous work, which lets ExCamera take advantage of thousands of processors at the same time to reduce the latency required to process each video.

    At a research conference (USENIX NSDI 2017), we demonstrated ExCamera's ability to scan through and edit a six-hour high-definition video with facial recognition software live on stage.

    (S. Fouladi, R. Wahby, B. Shacklett, K. Balasubramaniam, W. Zeng, R. Bhalerao, A. Sivaraman, G. Porter and K. Winstein, “Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads,” USENIX Symposium on Networked Systems Design and Implementation (NSDI ’17), Boston, Mass., 2017.)

  2. General-purpose computations using "burst-parallel" execution on thousands of tiny "serverless" functions. In 2015, providers of cloud computing, such as Amazon Web Services, started offering the ability to run pieces of code in small increments (e.g. a tenth of a second), compared with older offerings that require the customer to rent, and pay for, a virtual computer for a minute or more. These "serverless" offerings were intended to be used for asynchronous "microservices," where each task handles one request coming in at unpredictable times over the Internet. However, as part of this research project, we investigated whether it was also possible to rent a large number of processors at the same time in a "burst-parallel" way (e.g. 8,000 cores for 1 second each), using "serverless" computing for something closer to traditional high-performance computing or supercomputer tasks. Our ExCamera software was the first system to do this and kicked off a flourishing area of research on using "serverless" computing or "cloud functions" for a variety of massively parallel low-latency tasks.

    One of our own research papers discussed the security implications of this style of computation (K. Alpernas, C. Flanagan, S. Fouladi, L. Ryzhyk, M. Sagiv, T. Schmitz and K. Winstein, “Secure Serverless Computing Using Dynamic Information Flow Control,” ACM Object-Oriented Programming, Systems, Languages & Applications (OOPSLA), Boston, Mass., 2018).

  3. Real-time video over unpredictable packet networks. Today's real-time video applications are built out of two separate components: a “video codec” that compresses video, and a “transport protocol” that transmits packets of data and estimates how many can be sent without overloading the network. These components are designed and built separately, often by different companies, then combined into an overall program such as Skype or FaceTime.

    In these systems, each component marches to the beat of its own drummer--or in technical language, each piece has its own "control loop." In a detailed measurement of several Internet video programs used in current practice (Skype, FaceTime, Hangouts, and the WebRTC reference implementation in Google Chrome), we found that these dueling control loops can yield suboptimal results over unpredictable networks.

    By using the functional programming style of video codec that we created in the ExCamera project, and a "video-aware" transport protocol that we created, we were able to create a new kind of video software, Salsify, that reduces delay by roughly 5x compared with the existing programs while also improving picture quality. Salsify's main contribution is jointly managing the frame-by-frame control of compression and the packet-by-packet control of transmission in a single control loop. This lets the video stream track the network's varying capacity, avoiding stalls.

    Salsify is described more fully at https://snr.stanford.edu/salsify (which includes demonstration videos) and in the technical paper: S. Fouladi, J. Emmons, E. Orbay, C. Wu, R. Wahby and K. Winstein, “Salsify: low-latency network video through tighter integration between a video codec and a transport protocol,” USENIX Symposium on Networked Systems Design and Implementation (NSDI ’18), Renton, Wash., 2018.
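The burst-parallel pattern from item 2 above can be illustrated with a local stand-in: launch many tiny tasks at once, each running for a fraction of a second, and gather the results as they finish. Here a thread pool substitutes for a cloud provider's function-invocation API, and the task itself is a placeholder (in ExCamera's setting, each worker would encode one small slice of video):

```python
from concurrent.futures import ThreadPoolExecutor

def tiny_task(chunk_id):
    # Stand-in for one short-lived cloud function invocation;
    # real workers would do a fraction of a second of encoding work.
    return chunk_id * chunk_id

def burst_parallel(n_tasks):
    # Fan out: launch every task at once rather than queueing them behind
    # a few long-lived machines, trading a brief burst of parallelism
    # for low end-to-end latency.
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        return list(pool.map(tiny_task, range(n_tasks)))

results = burst_parallel(100)
```

The design point is the shape of the workload, not the thread pool: thousands of independent short tasks started simultaneously, which is what serverless platforms made cheap to rent.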
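The single control loop from item 3 above can be sketched in simplified form: after each frame interval, the sender updates one estimate of how many bytes the network can carry before the next frame is due, and that same estimate sets the size target handed to the encoder. This is a toy exponentially-weighted estimator, not Salsify's actual algorithm:

```python
def run_loop(capacities_bytes, alpha=0.3):
    """Toy joint control loop: one network-capacity estimate drives both
    packet transmission pacing and the per-frame size target given to
    the encoder, instead of two independent control loops."""
    estimate = capacities_bytes[0]
    targets = []
    for actual in capacities_bytes:
        # Codec step: request a frame no larger than the current estimate,
        # so it can be delivered before the next frame is captured.
        targets.append(estimate)
        # Transport step: observe how many bytes actually got through this
        # frame interval (e.g. from acknowledgments) and update the estimate.
        estimate = (1 - alpha) * estimate + alpha * actual
    return targets
```

When the measured capacity drops, the very next frame is encoded smaller, which is how a joint loop tracks a varying network and avoids queueing delay and stalls.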

Last Modified: 03/06/2019
Modified by: Keith Winstein
