
NSF Org: | OAC Office of Advanced Cyberinfrastructure (OAC) |
Recipient: |
|
Initial Amendment Date: | June 30, 2020 |
Latest Amendment Date: | June 30, 2020 |
Award Number: | 2019073 |
Award Instrument: | Standard Grant |
Program Manager: | Deepankar Medhi, dmedhi@nsf.gov, (703)292-2935, OAC Office of Advanced Cyberinfrastructure (OAC), CSE Directorate for Computer and Information Science and Engineering |
Start Date: | October 1, 2020 |
End Date: | September 30, 2024 (Estimated) |
Total Intended Award Amount: | $850,000.00 |
Total Awarded Amount to Date: | $850,000.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
5801 S ELLIS AVE CHICAGO IL US 60637-5418 (773)702-8669 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
Chicago IL US 60637-5418 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | CISE Research Resources |
Primary Program Source: |
|
Program Reference Code(s): | |
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Scientific instruments are capable of generating data at very high speeds. However, with traditional file-based data movement and analysis methods, data are often processed at a much lower speed, so instruments are either operated below their full rate or a significant portion of the data is discarded without being processed. To address this issue, the SciStream project will develop software tools to stream data at very high speeds from scientific instruments to supercomputers at a distant location. SciStream hides the complexities of network connections from the end user and provides a high level of security for all the network connections.
The data producers (e.g., data acquisition applications on scientific instruments, simulations on supercomputers) and consumers (e.g., data analysis applications on high performance computing systems) may be in different security domains (and thus require bridging of those domains) and may, further, lack external network connectivity (and thus, require traffic forwarding proxies). SciStream establishes necessary bridging and end-to-end authentication between source and destination, while providing efficient memory-to-memory data streaming. Through the exploration of architectural and design choices and addressing issues of control protocols and security, SciStream will advance the understanding of the challenges in supporting high speed memory-to-memory data streaming between remote instruments in federated science environments.
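As a rough illustration of the traffic forwarding proxy idea described above, the following is a minimal sketch of a TCP relay that sits between a producer and a consumer that cannot reach each other directly. It is not SciStream's implementation: the names `pipe`, `handle`, and `start_proxy` are hypothetical, and the sketch assumes plain TCP with no authentication or encryption, both of which SciStream is described as providing.

```python
import socket
import threading


def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst in memory until src reaches end of stream."""
    try:
        while True:
            chunk = src.recv(65536)
            if not chunk:
                break
            dst.sendall(chunk)
    except OSError:
        pass
    finally:
        # Propagate end of stream so the peer on the other side sees EOF.
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client: socket.socket, upstream_addr) -> None:
    """Relay one client connection to the upstream (consumer) address."""
    try:
        upstream = socket.create_connection(upstream_addr)
    except OSError:
        client.close()
        return
    # One thread per direction: upstream -> client here, client -> upstream below.
    back = threading.Thread(target=pipe, args=(upstream, client), daemon=True)
    back.start()
    pipe(client, upstream)
    back.join()
    client.close()
    upstream.close()


def start_proxy(upstream_addr, host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Start a forwarding proxy (hypothetical helper); returns the listening socket."""
    listener = socket.create_server((host, port))

    def accept_loop() -> None:
        while True:
            try:
                client, _ = listener.accept()
            except OSError:
                return  # listener was closed
            threading.Thread(
                target=handle, args=(client, upstream_addr), daemon=True
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener
```

A producer would connect to the address reported by `listener.getsockname()` instead of connecting to the consumer directly. Data is relayed chunk by chunk without touching the file system, which is the memory-to-memory property the abstract emphasizes; a production gateway would add the cross-domain authentication and security controls that this sketch omits.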
SciStream will benefit all scientific applications that require memory-to-memory data streaming between distributed instruments. Recent trends suggest that this is an important and growing requirement for many scientific applications. SciStream will help significantly reduce the time to solution for these applications, resulting in improved scientific productivity and thus far-reaching benefits for society. Key design choices such as application-agnostic streaming and support for best-effort streaming will make SciStream appealing to a broader science community. SciStream will engage with domain scientists, campus computing centers, and a scientific user facility to reach a wider audience. Through on-campus programs at the University of Chicago, SciStream will train under-represented students in networking. Additional details on SciStream can be found here: https://scistream.github.io/
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Overview: The SciStream project has significantly enhanced secure, high-performance data streaming between scientific instruments across independent administrative domains. By developing an open-source toolkit, SciStream enables efficient memory-to-memory data transfer, seamlessly integrating with widely used authentication methods to ensure secure, scalable, and high-speed connectivity.
Intellectual Merit: SciStream introduced an innovative architecture for real-time data streaming, designed to overcome challenges in network security, performance optimization, and inter-facility data transfer. Key technical advancements include:
- Gateway Node (GN) Architecture: Optimized for streaming data between instruments, WANs, and computing resources.
- Advanced Networking: Leverages TCP, QUIC, and eBPF-based proxies to stream data at high speed and low latency between federated scientific instruments.
- Integration with Science DMZs: Ensures secure, policy-compliant streaming in high-performance computing (HPC) environments.
SciStream’s capabilities were rigorously tested on FABRIC and ESnet 100G testbeds, showing minimal performance overhead compared to direct network connections.
Broader Impacts: SciStream has had a significant impact on the scientific community, facilitating real-time data processing for critical applications.
- Deployment at Multiple Facilities: SciStream was successfully integrated with Argonne National Laboratory’s Polaris cluster, Clemson University’s ultrasound image processing workflow, and the upstart cluster at the Advanced Photon Source (APS).
- Workforce Development: The project mentored five students, four of whom published research on high-performance networking.
- Knowledge Dissemination: Findings were shared through five peer-reviewed publications, two featured articles, and over a dozen invited talks and demonstrations at major scientific conferences.
Key Achievements:
- Developed a scalable architecture for federated scientific data streaming.
- Created an open-source toolkit supporting standard and custom proxy solutions.
- Integrated authentication methods like Globus Auth to enhance security.
- Validated SciStream on large-scale testbeds, ensuring reliability and scalability.
- Demonstrated real-time streaming applications in physics, imaging, and medical research.
By bridging gaps in scientific data streaming, SciStream accelerates discovery, fosters collaboration, and lays the foundation for next-generation federated streaming solutions.
Last Modified: 02/17/2025
Modified by: Rajkumar Kettimuthu