Award Abstract # 1541349
CC*DNI DIBBs: The Pacific Research Platform

NSF Org: OAC (Office of Advanced Cyberinfrastructure)
Recipient: UNIVERSITY OF CALIFORNIA, SAN DIEGO
Initial Amendment Date: July 30, 2015
Latest Amendment Date: January 12, 2022
Award Number: 1541349
Award Instrument: Cooperative Agreement
Program Manager: Alejandro Suarez
alsuarez@nsf.gov
(703) 292-7092
OAC  Office of Advanced Cyberinfrastructure (OAC)
CSE  Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2015
End Date: September 30, 2022 (Estimated)
Total Intended Award Amount: $5,000,000.00
Total Awarded Amount to Date: $8,197,182.00
Funds Obligated to Date: FY 2015 = $5,000,000.00
FY 2018 = $1,149,262.00
FY 2019 = $16,000.00
FY 2020 = $1,015,968.00
FY 2021 = $999,952.00
FY 2022 = $16,000.00
History of Investigator:
  • Larry Smarr (Principal Investigator)
    lsmarr@ucsd.edu
  • Philip Papadopoulos (Co-Principal Investigator)
  • Frank Wuerthwein (Co-Principal Investigator)
  • Thomas DeFanti (Co-Principal Investigator)
  • Camille Crittenden (Co-Principal Investigator)
Recipient Sponsored Research Office: University of California-San Diego
9500 GILMAN DR
LA JOLLA
CA  US  92093-0021
(858)534-4896
Sponsor Congressional District: 50
Primary Place of Performance: University of California-San Diego
La Jolla
CA  US  92093-0934
Primary Place of Performance Congressional District: 50
Unique Entity Identifier (UEI): UYTTZT6G9DT1
Parent UEI:
NSF Program(s): CYBERINFRASTRUCTURE,
Data Cyberinfrastructure
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVITIES
01001516DB NSF RESEARCH & RELATED ACTIVITIES
01001819DB NSF RESEARCH & RELATED ACTIVITIES
01001920DB NSF RESEARCH & RELATED ACTIVITIES
01002021DB NSF RESEARCH & RELATED ACTIVITIES
01002122DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 7433, 8048, 9251
Program Element Code(s): 723100, 772600
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Research in data-intensive fields is increasingly multi-investigator and multi-institutional, depending on ever more rapid access to ultra-large heterogeneous and widely distributed datasets. The Pacific Research Platform (PRP) is a multi-institutional extensible deployment that establishes a science-driven high-capacity data-centric 'freeway system.' The PRP spans all 10 campuses of the University of California, as well as the major California private research universities, four supercomputer centers, and several universities outside California. Fifteen multi-campus data-intensive application teams act as drivers of the PRP, providing feedback to the technical design staff over the five years of the project. These application areas include particle physics, astronomy/astrophysics, earth sciences, biomedicine, and scalable multimedia, providing models for many other applications.

The PRP builds on prior NSF and Department of Energy (DOE) investments. The basic model adopted by the PRP is 'The Science DMZ,' being prototyped by the DOE ESnet. (A Science DMZ is defined as 'a portion of the network, built at or near the campus local network perimeter that is designed such that the equipment, configuration, and security policies are optimized for high-performance scientific applications rather than for general-purpose business systems'). In the last three years, NSF has funded over 100 U.S. campuses through Campus Cyberinfrastructure - Network Infrastructure and Engineering (CC-NIE) grants to aggressively upgrade their network capacity for greatly enhanced science data access, creating Science DMZs within each campus. The PRP partnership extends the NSF-funded campus Science DMZs to a regional model that allows high-speed data-intensive networking, facilitating researchers moving data between their laboratories and their collaborators' sites, supercomputer centers or data repositories, and enabling that data to traverse multiple heterogeneous networks without performance degradation over campus, regional, national, and international distances. The PRP's data sharing architecture, with end-to-end 10-40-100Gb/s connections, provides long-distance virtual co-location of data with computing resources, with enhanced security options.
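The usefulness of these high-capacity links depends on end-host tuning as much as on raw link speed: over long distances, a TCP flow can only fill the pipe if its window covers the bandwidth-delay product of the path. A minimal sketch of that calculation (the bandwidth and RTT figures below are illustrative, not measurements from the PRP):

```python
def bdp_bytes(bandwidth_gbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the path full."""
    return bandwidth_gbps * 1e9 / 8 * (rtt_ms / 1e3)

# Illustrative: a 10 Gb/s path with a 60 ms cross-country RTT needs
# roughly 75 MB of TCP window/buffer to run at line rate.
window = bdp_bytes(10, 60)
print(f"{window / 1e6:.0f} MB")
```

Default OS buffer limits are typically far below this, which is one reason Science DMZ designs place tuned, dedicated transfer hosts at the network perimeter.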

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


(Showing: 1 - 10 of 104)
Altintas I., Marcus K., Nealey I., Sellars SL, Graham J., Mishin D., Polizzi J., Crawl D., DeFanti T., and Smarr L. "Workflow-driven distributed machine learning in CHASE-CI: A cognitive hardware and software ecosystem community infrastructure." arXiv.org , 2019 , p.1903.0680
Altintas, I., Perez, I., Mishin, D., Trouillaud, A., Irving, C., Graham, J., Tatineni, M., DeFanti, T., Strande, S., Smarr, L. and Norman, M.L. "Towards a Dynamic Composability Approach for using Heterogeneous Systems in Remote Sensing." Proceedings of the IEEE 18th International Conference on e-Science , 2022
Armstrong, G., Martino, C., Morris, J., Khaleghi, B., Kang, J., DeReus, J., Zhu, Q., Roush, D., McDonald, D., Gonzalez, A. and Shaffer, J.P. "Swapping Metagenomics Preprocessing Pipeline Components Offers Speed and Sensitivity Increases." mSystems , v.7 , 2022 , p.e01378
Bharathkumar, K., Paolini, C. and Sarkar, M. "FPGA-based Edge Inferencing for Fall Detection." 2020 IEEE Global Humanitarian Technology Conference (GHTC) , 2020
Bhatta, D. and Mashayekhy, L. "A bifactor approximation algorithm for cloudlet placement in edge computing" IEEE Transactions on Parallel and Distributed Systems , v.33 , 2021 , p.1787-1798
Boada, A., Paolini, C. and Castillo, J.E. "High-order mimetic finite differences for anisotropic elliptic equations." Computers & Fluids , v.213 , 2020
Chandrasekaran, R., Ergun, K., Lee, J., Nanjunda, D., Kang, J. and Rosing, T. "Fhdnn: Communication efficient and robust federated learning for aiot networks" Proceedings of the 59th ACM/IEEE Design Automation Conference , 2022 , p.37-42
Cheng, J. and Vasconcelos, N. "Calibrating Deep Neural Networks by Pairwise Constraints" Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , 2022 , p.13709-137
Chen, K., Beyeler, M. and Krichmar, J.L. "Cortical Motion Perception Emerges from Dimensionality Reduction with Evolved Spike-Timing-Dependent Plasticity Rules" Journal of Neuroscience , v.42 , 2022 , p.5882-5898
Chen, K., Johnson, A., Scott, E.O., Zou, X., De Jong, K.A., Nitz, D.A. and Krichmar, J.L "Differential Spatial Representations in Hippocampal CA1 and Subiculum Emerge in Evolved Spiking Neural Networks" 2021 International Joint Conference on Neural Networks (IJCNN) , 2021

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The Pacific Research Platform (PRP) is a distributed data-sharing hardware and software infrastructure, with end-to-end 10-100Gb/s connections, which has enabled regionwide, nationwide, and worldwide virtual co-location of data with computing. The original goal of the PRP was to interconnect campus Science DMZ network systems, as designed in 2010 by the Department of Energy's ESnet, and funded by NSF on many U.S. campuses, resulting in a many-campus regional DMZ model supporting data-intensive science. This goal was accomplished in the first two years of the grant and then expanded to enable researchers to quickly and easily move data between collaborator labs, supercomputer centers, instruments, and data repositories, creating a big-data freeway that allows the data to traverse multiple, heterogeneous networks with minimal performance degradation.

The endpoints on the campus Science DMZs are Data Transfer Nodes (DTNs), which the PRP designed as rack-mounted PC devices called Flash I/O Network Appliances (FIONAs), each of which is capable of holding up to 8 GPUs or FPGA add-in boards and up to 240TB of data. FIONAs are optimized for 10-200Gbps data transfers over Internet2 or the Quilt's regional optical Research and Education networks. The PRP team adopted the Cloud Native Computing Foundation's open-source Kubernetes container orchestration system to manage end-user or system software containers across Nautilus, the distributed hypercluster of FIONAs. This enables fast data transfer and interoperability among data stores and instruments within the PRP, while also attracting connections from institutions outside the PRP. This software/hardware engineering task of creating Nautilus was even more successful than originally envisioned, largely because of the rapid rise and adoption of open-source software such as Kubernetes, Rook, Ceph, JupyterLab, and Admiralty, to name a few. The PRP team members became early adopters of and contributors to these software systems, and Nautilus has become a regional, national, and international model for their use.
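To make the Kubernetes-based usage model concrete, a researcher on a Nautilus-style cluster typically submits a pod manifest that names a per-project namespace and requests GPUs through the standard device-plugin resource. The sketch below is illustrative only: the pod name, namespace, container image, and command are hypothetical, not taken from the PRP's actual configuration.

```yaml
# Hypothetical pod manifest requesting one GPU on a Kubernetes cluster such as Nautilus.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-demo            # illustrative name
  namespace: my-project     # projects are allocated per-namespace
spec:
  containers:
  - name: train
    image: nvcr.io/nvidia/pytorch:23.10-py3   # assumed container image
    command: ["python", "train.py"]           # hypothetical workload
    resources:
      limits:
        nvidia.com/gpu: 1   # standard NVIDIA device-plugin resource name
        memory: 32Gi
        cpu: 8
  restartPolicy: Never
```

Because the scheduler, not the user, decides which FIONA runs the pod, the same manifest works whether the data and GPUs happen to be on the local campus or across the regional network.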

The PRP and, subsequently, the Toward the National Research Platform (TNRP) projects have been supported by National Science Foundation (NSF) awards CNS-1730158, ACI-1540112, ACI-1541349, OAC-1826967, OAC-2112167, and CNS-2120019, and the platform is set to expand to all parts of the United States as the National Research Platform (NRP). These grants, together with hardware contributions from Nautilus users across the country, have expanded the Nautilus high-performance data-intensive cyberinfrastructure to nearly 15,000 CPU cores, 930 GPUs, and 4 petabytes of storage, supporting over 700 namespace computational projects from researchers at over 90 campuses in 39 of 50 states plus Washington, DC and Puerto Rico, including 16 Minority Serving Institutions (MSIs).

The PRP's Nautilus cluster is having broader impacts beyond just cyberinfrastructure. It is also being used directly and indirectly as part of educational experiences at PRP partner campuses, including as part of courses on cloud computing, deep learning for AI, and machine learning. In one case, the Educational Information and Technology Services office at UC San Diego cloned a 124-GPU cluster of Kubernetes-orchestrated FIONAs that are used exclusively by students to meet the computational lab requirements for formal courses and undergraduate research.  This UCSD Data Science/Machine Learning Platform (DSMLP) GPU cluster is supporting over 30 courses and 12,000 students per academic year. This approach has been replicated by San Diego State University, a Minority Serving Institution, to support machine learning and data analysis courses.


Last Modified: 11/18/2022
Modified by: Larry L Smarr
