Award Abstract # 1659403
CC* Integration: SANDIE: SDN-Assisted NDN for Data Intensive Experiments

NSF Org: OAC
Office of Advanced Cyberinfrastructure (OAC)
Recipient: NORTHEASTERN UNIVERSITY
Initial Amendment Date: June 19, 2017
Latest Amendment Date: June 19, 2017
Award Number: 1659403
Award Instrument: Standard Grant
Program Manager: Deepankar Medhi
dmedhi@nsf.gov
 (703)292-2935
OAC
 Office of Advanced Cyberinfrastructure (OAC)
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: July 1, 2017
End Date: September 30, 2021 (Estimated)
Total Intended Award Amount: $1,000,000.00
Total Awarded Amount to Date: $1,000,000.00
Funds Obligated to Date: FY 2017 = $1,000,000.00
History of Investigator:
  • Edmund Yeh (Principal Investigator)
    eyeh@ece.neu.edu
  • Harvey Newman (Co-Principal Investigator)
  • Christos Papadopoulos (Co-Principal Investigator)
Recipient Sponsored Research Office: Northeastern University
360 HUNTINGTON AVE
BOSTON
MA  US  02115-5005
(617)373-5600
Sponsor Congressional District: 07
Primary Place of Performance: Northeastern University
360 Huntington Avenue
Boston
MA  US  02115-5005
Primary Place of Performance
Congressional District:
07
Unique Entity Identifier (UEI): HLTMVS2JZBS6
Parent UEI:
NSF Program(s): CISE Research Resources
Primary Program Source: 01001718DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):
Program Element Code(s): 289000
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Advancing discovery in many scientific fields depends crucially on our ability
to extract the wealth of knowledge buried in massive datasets whose scale and complexity continue
to grow exponentially with time. In order to address this fundamental challenge, this project will
develop and deploy SANDIE, a Named Data Networking (NDN) architecture supported by advanced
Software Defined Network services for Data Intensive Science, with the Large Hadron Collider (LHC) high energy
physics program as the leading use case.

The implementation of SANDIE will leverage two state-of-the-art testbeds: the NDN testbed hosted at Colorado State,
which serves the climate science community, and the SDN testbed hosted at Caltech. Building on these facilities, and on the support for SDN services
from multiple advanced Research & Education network partners,
we will deploy a set of ten high-performance, relatively low-cost NDN edge caches with SSDs
and 40G or 100G network interfaces at six participating sites: Caltech, Northeastern, UCSD,
University of Florida, MIT and CERN, together with an existing cache at CSU.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 35)
B. Sayedana, A. Mahajan "Cross-layer communication over fading channels with adaptive decision feedback" 2020 18th International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOPT), 2020
Cui, Ying and Medard, Muriel and Yeh, Edmund and Leith, Douglas and Duffy, Ken R. "Optimization-Based Linear Network Coding for General Connections of Continuous Flows" IEEE/ACM Transactions on Networking, v.26, 2018 https://doi.org/10.1109/TNET.2018.2865534
Fan, Chengyu and Shannigrahi, Susmit and Papadopoulos, Christos and Partridge, Craig "Discovering in-network Caching Policies in NDN Networks from a Measurement Perspective" ICN '20: Proceedings of the 7th ACM Conference on Information-Centric Networking, 2020 https://doi.org/10.1145/3405656.3418711
Ioannidis, Stratis and Yeh, Edmund "Adaptive Caching Networks With Optimality Guarantees" IEEE/ACM Transactions on Networking, v.26, 2018 https://doi.org/10.1109/TNET.2018.2793581
Ioannidis, Stratis and Yeh, Edmund "Jointly optimal routing and caching for arbitrary network topologies" Proceedings of the 4th ACM Conference on Information-Centric Networking, 2017 https://doi.org/10.1145/3125719.3125730
Ioannidis, Stratis and Yeh, Edmund "Jointly Optimal Routing and Caching for Arbitrary Network Topologies" IEEE Journal on Selected Areas in Communications, 2018 https://doi.org/10.1109/JSAC.2018.2844981
Iordache, Cătălin and Liu, Ran and Balcas, Justas and Širvinskas, Raimondas and Wu, Yuanhao and Fan, Chengyu and Shannigrahi, Susmit and Newman, Harvey and Yeh, Edmund "Named Data Networking based File Access for XRootD" EPJ Web of Conferences, v.245, 2020 https://doi.org/10.1051/epjconf/202024504018
J. Balcas, T.W.Hendricks "SDN-NGenIA, a Software Defined Next Generation Integrated Architecture for HEP and Data Intensive Science" Journal of Physics: Conference Series, 2017 https://doi.org/10.1088/1742-6596/898/11/112001
Kamran, Khashayar and Yeh, Edmund and Ma, Qian "DECO: Joint Computation, Caching and Forwarding in Data-Centric Computing Networks" Mobihoc '19 Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2019 https://doi.org/10.1145/3323679.3326509
Liu, An and Lau, Vincent K. and Ding, Wenchao and Yeh, Edmund "Mixed-Timescale Online PHY Caching for Dual-Mode MIMO Cooperative Networks" IEEE Transactions on Wireless Communications, v.18, 2019 https://doi.org/10.1109/TWC.2019.2907586
Liu, Ran and Yeh, Edmund and Eryilmaz, Atilla "Proactive Caching for Low Access-Delay Services under Uncertain Predictions" Proceedings of the ACM on Measurement and Analysis of Computing Systems - SIGMETRICS, 2019 https://doi.org/10.1145/3309697.3331471

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The world's largest data- and network-intensive science programs at the forefront of making fundamental discoveries face unprecedented challenges: in global data distribution, processing, access and analysis, and in the coordinated use of massive but still limited computing, storage and network resources.  This project has made fundamental contributions in meeting these challenges, by designing and implementing a new, highly efficient system for data access and distribution for the Large Hadron Collider (LHC) high energy physics (HEP) network.  The system is based on the Named Data Networking (NDN) architecture and supported by advanced Software Defined Network (SDN) services. 

Outcomes in the intellectual merit area include:
(1) detailed analysis of workflow, data placement and resource distribution in the LHC Compact Muon Solenoid (CMS) network, including CMS Elasticsearch, the CMS analytics service, and the PhEDEx and DBS systems;
(2) development of a hierarchical NDN naming scheme for CMS data;
(3) implementation of the XRootD NDN-based OSS (Open Storage System) plugin, with an embedded NDN consumer (which translates file system calls into NDN Interest packets and sends them over the network with the help of an NDN forwarder) along with an NDN producer;
(4) implementation, optimization, and experimentation (in a local testbed) of joint caching and forwarding algorithms with the NDN Forwarding Daemon (NFD);
(5) implementation, optimization, and experimentation (in local and wide-area-network testbeds) of joint caching and forwarding algorithms with the NDN-DPDK forwarder developed by the National Institute of Standards and Technology (NIST), for improved performance in throughput, delay, and cache hit rates;
(6) development of new consumer and producer applications based on the NDNgo library developed at NIST, to enable NDN-based applications interfacing with the NDN-DPDK forwarder to achieve throughputs higher than 1 Gbps;
(7) development of a new NDN library offering a set of APIs on which developers can base applications that communicate with a local NDN-DPDK forwarder using memif, a shared-memory packet interface that provides high-performance packet transmission;
(8) updating of the joint caching and forwarding algorithms to be compatible with new versions of the NDN-DPDK forwarder based on the NDNgo library and using the memif interface;
(9) implementation of NDN applications inside Docker containers, enabling deployment on different computing platforms;
(10) demonstration that the XRootD NDN-based OSS plugin using the NFD forwarder yields better performance for CMSSW jobs than existing solutions in use at CERN;
(11) establishment of a SANDIE wide-area-network (WAN) testbed to support transfer rates up to 100 Gbps, with servers at Northeastern, Caltech, and Colorado State University;
(12) planning, coordination, deployment and management of stable, high-performance virtual LANs (VLANs) for the SANDIE WAN testbed, in collaboration with campus network administrators, regional network operators, Internet2, STARLIGHT, ESnet, CENIC and SCinet;
(13) establishment of an expanded 100 Gbps WAN testbed with nodes at Northeastern (MGHPCC), Caltech, STARLIGHT Chicago, UCLA and Tennessee Tech;
(14) first demonstration of the feasibility and performance of the NDN-based SANDIE data distribution platform over a WAN, at the Supercomputing (SC) 2018 conference;
(15) demonstration at the SC 2019 conference that the NDN-based SANDIE data distribution platform can deliver LHC HEP data over a transcontinental layer-2 WAN testbed at over 6.7 Gbps on a single thread, and that optimized joint caching and forwarding can decrease download times by a factor of 10.
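
As a rough illustration of how an NDN consumer might turn a file system read into Interest names (outcomes (2) and (3) above), the following is a minimal sketch only. The name prefix `/ndn/xrootd`, the `seg=` component, the segment size, and both function names are assumptions made for this example; they are not the naming scheme or plugin code produced by the project.

```python
from typing import List

# Assumed payload size per Data packet; chosen for illustration only.
SEGMENT_SIZE = 8192


def file_to_prefix(path: str, root: str = "/ndn/xrootd") -> str:
    """Map a POSIX-style file path to a hierarchical name prefix (hypothetical layout)."""
    components = [c for c in path.split("/") if c]
    return root.rstrip("/") + "/" + "/".join(components)


def read_to_interest_names(path: str, offset: int, length: int,
                           segment_size: int = SEGMENT_SIZE) -> List[str]:
    """Return one name per segment overlapping the byte range [offset, offset + length)."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if length == 0:
        return []
    prefix = file_to_prefix(path)
    first = offset // segment_size                 # segment holding the first byte
    last = (offset + length - 1) // segment_size   # segment holding the last byte
    return [f"{prefix}/seg={i}" for i in range(first, last + 1)]


if __name__ == "__main__":
    # A 1000-byte read starting at byte 8000 straddles segments 0 and 1.
    for name in read_to_interest_names("/store/data/file.root", 8000, 1000):
        print(name)
```

In a real consumer, each name would be sent as an Interest packet through a forwarder such as NFD or NDN-DPDK, and the returned Data packets would be reassembled into the requested byte range. Because the names depend only on the file and the segment index, identical requests from different clients can be answered from an in-network cache.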

Outcomes in the broader impact area include:
(1) establishment of the NDN paradigm for agile and efficient network operations in support of data-intensive sciences, including name-based data access, distribution and caching, together with the SDN-driven "consistent operations" paradigm in which science programs can use stable large flows directed along optimally chosen load-balanced paths up to high water marks compatible with other traffic;
(2) acceleration of progress to the next round of data-intensive science discoveries, with initial focus on CMS and the other LHC experiments, and future application of the developed methods and tools to other major science areas including future astrophysical sky surveys, genomics, bioinformatics, climate, and earth observation;
(3) the design, optimization, and deployment of high-performance NDN-based protocols, algorithms, and systems in mainstream data distribution and analysis settings at regional, continental and intercontinental scales, with broad-based impact across many data-intensive science and engineering fields;
(4) the establishment and maintenance of a persistent high-throughput wide area network testbed for experimentation with advanced NDN and SDN-based protocols, algorithms, and systems;
(5) research and educational training for supported graduate students in data-centric networking for large-scale data-intensive science applications;
(6) training of a new generation of scientists and engineers in several leading-edge technology areas, including NDN as a candidate future network architecture, data-intensive applications using NDN, its field deployment across the present ensemble of research and education IP networks, federated data access and analysis, and optimization of globally distributed systems.


Last Modified: 05/18/2022
Modified by: Edmund M Yeh

Please report errors in award information by writing to: awardsearch@nsf.gov.
