
NSF Org: |
OAC Office of Advanced Cyberinfrastructure (OAC) |
Recipient: |
|
Initial Amendment Date: | June 19, 2017 |
Latest Amendment Date: | June 19, 2017 |
Award Number: | 1659403 |
Award Instrument: | Standard Grant |
Program Manager: |
Deepankar Medhi
dmedhi@nsf.gov (703)292-2935 OAC Office of Advanced Cyberinfrastructure (OAC) CSE Directorate for Computer and Information Science and Engineering |
Start Date: | July 1, 2017 |
End Date: | September 30, 2021 (Estimated) |
Total Intended Award Amount: | $1,000,000.00 |
Total Awarded Amount to Date: | $1,000,000.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
360 HUNTINGTON AVE BOSTON MA US 02115-5005 (617)373-5600 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
360 Huntington Avenue Boston MA US 02115-5005 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | CISE Research Resources |
Primary Program Source: |
|
Program Reference Code(s): | |
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Advancing discovery in many scientific fields depends crucially on our ability
to extract the wealth of knowledge buried in massive datasets whose scale and complexity continue
to grow exponentially with time. In order to address this fundamental challenge, this project will
develop and deploy SANDIE, a Named Data Networking (NDN) architecture supported by advanced
Software Defined Network services for Data Intensive Science, with the Large Hadron Collider (LHC) high energy
physics program as the leading use case.
The implementation of SANDIE will leverage two state-of-the-art testbeds: the NDN testbed hosted at Colorado State, which serves the climate
science community, and the SDN testbed hosted at Caltech. Building on these facilities, and on the support for SDN services
from multiple advanced Research & Education network partners,
we will deploy a set of ten high-performance, relatively low-cost NDN edge caches with SSDs
and 40G or 100G network interfaces at six participating sites: Caltech, Northeastern, UCSD,
University of Florida, MIT and CERN, together with an existing cache at CSU.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The world's largest data- and network-intensive science programs, which are at the forefront of making fundamental discoveries, face unprecedented challenges in global data distribution, processing, access and analysis, and in the coordinated use of massive but still limited computing, storage and network resources. This project has made fundamental contributions to meeting these challenges by designing and implementing a new, highly efficient system for data access and distribution for the Large Hadron Collider (LHC) high energy physics (HEP) network. The system is based on the Named Data Networking (NDN) architecture and supported by advanced Software Defined Network (SDN) services.
Outcomes in the intellectual merit area include (1) detailed analysis of workflow, data placement and resource distribution in the LHC Compact Muon Solenoid (CMS) network, including CMS Elasticsearch, the analytics service of CMS, and the PhEDEx and DBS systems; (2) development of a hierarchical NDN naming scheme for CMS data; (3) implementation of the XRootD NDN-based OSS (Open Storage System) plugin, with an embedded NDN consumer (which translates file system calls into NDN Interest packets and sends them over the network with the help of an NDN forwarder) along with an NDN producer; (4) implementation, optimization, and experimentation (in a local testbed) of joint caching and forwarding algorithms with the NDN Forwarding Daemon (NFD); (5) implementation, optimization, and experimentation (in local and wide-area-network testbeds) of joint caching and forwarding algorithms with the NDN-DPDK forwarder developed by the National Institute of Standards and Technology (NIST), for improved performance in throughput, delay, and cache hit rates; (6) development of new consumer and producer applications based on the NDNgo library developed at NIST, to enable NDN-based applications interfacing with the NDN-DPDK forwarder to achieve throughputs higher than 1 Gbps; (7) development of a new NDN library offering a set of APIs on which developers can base their applications to communicate with a local NDN-DPDK forwarder using memif, a shared memory packet interface that provides high-performance packet transmission; (8) updating of the joint caching and forwarding algorithms to be compatible with new versions of the NDN-DPDK forwarder based on the NDNgo library and using the memif interface; (9) implementation of NDN applications inside Docker containers, enabling deployment on different computing platforms; (10) demonstration that the XRootD NDN-based OSS plugin using the NFD forwarder yields better performance for CMSSW jobs than existing solutions in use at CERN; (11) establishment of a SANDIE wide-area-network (WAN) testbed to support transfer rates up to 100 Gbps, with servers at Northeastern, Caltech, and Colorado State University; (12) planning, coordination, deployment and management of stable, high-performance virtual LANs (VLANs) for the SANDIE WAN testbed, in collaboration with campus network administrators, regional network operators, Internet2, STARLIGHT, ESnet, CENIC and SCinet; (13) establishment of an expanded 100 Gbps WAN testbed with nodes at Northeastern (MGHPCC), Caltech, STARLIGHT Chicago, UCLA and Tennessee Tech; (14) first demonstration of the feasibility and performance of the NDN-based SANDIE data distribution platform over a WAN at the Supercomputing (SC) 2018 conference; (15) demonstration at the SC 2019 conference that the NDN-based SANDIE data distribution platform can deliver LHC HEP data over a transcontinental layer-2 WAN testbed at over 6.7 Gbps over a single thread, and that optimized joint caching and forwarding can decrease download times by a factor of 10.
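As a rough illustration of outcomes (2) and (3), the sketch below shows one way a file read (path, offset, length) could be mapped onto a set of hierarchical, segment-numbered names that a consumer would then request as Interests. It is only a sketch under stated assumptions: the report does not specify the SANDIE naming scheme, so the `/ndn/xrootd` prefix, the `seg=` textual form, the 8192-byte segment size, and the `segment_names` helper are all hypothetical, and no real NDN library is used.

```python
from typing import List

# Hypothetical values for illustration only; the report does not state
# the actual prefix or segment size used by the SANDIE plugin.
NAME_PREFIX = "/ndn/xrootd"
SEGMENT_SIZE = 8192  # bytes of file content carried per named segment


def segment_names(file_path: str, offset: int, length: int,
                  prefix: str = NAME_PREFIX,
                  segment_size: int = SEGMENT_SIZE) -> List[str]:
    """Return the hierarchical names covering bytes [offset, offset + length)."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    if length == 0:
        return []

    # Each path component becomes one name component under the prefix.
    components = [c for c in file_path.split("/") if c]
    base = prefix.rstrip("/") + "/" + "/".join(components)

    # A byte range maps to every fixed-size segment it touches.
    first = offset // segment_size
    last = (offset + length - 1) // segment_size
    return [f"{base}/seg={i}" for i in range(first, last + 1)]


if __name__ == "__main__":
    # A 10000-byte read starting at byte 10000 touches segments 1 and 2.
    for name in segment_names("/store/data/run1/file.root", 10000, 10000):
        print(name)
```

Because every segment has its own name, any cache along the path that already holds a given segment can answer the request for it, which is the property that name-based caching relies on.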
Outcomes in the broader impact area include (1) establishment of the NDN paradigm for agile and efficient network operations in support of data-intensive sciences, including name-based data access, distribution and caching, together with the SDN-driven "consistent operations" paradigm in which science programs can use stable large flows directed along optimally chosen load-balanced paths up to high water marks compatible with other traffic; (2) acceleration of progress to the next round of data-intensive science discoveries, with initial focus on CMS and the other LHC experiments, and future application of the developed methods and tools to other major science areas including future astrophysical sky surveys, genomics, bioinformatics, climate, and earth observation; (3) the design, optimization, and deployment of high-performance NDN-based protocols, algorithms, and systems in mainstream data distribution and analysis settings at regional, continental and intercontinental scales, with broad-based impact across many data-intensive science and engineering fields; (4) the establishment and maintenance of a persistent high-throughput wide area network testbed for experimentation with advanced NDN and SDN-based protocols, algorithms, and systems; (5) research and educational training for supported graduate students in data-centric networking for large-scale data-intensive science applications; (6) training of a new generation of scientists and engineers in several leading-edge technology areas, including NDN as a candidate future network architecture, data-intensive applications using NDN, its field deployment across the present ensemble of research and education IP networks, federated data access and analysis, and optimization of globally distributed systems.
Last Modified: 05/18/2022
Modified by: Edmund M Yeh