Award Abstract # 1836650
S2I2: Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP)

NSF Org: OAC
Office of Advanced Cyberinfrastructure (OAC)
Recipient: THE TRUSTEES OF PRINCETON UNIVERSITY
Initial Amendment Date: August 31, 2018
Latest Amendment Date: July 25, 2022
Award Number: 1836650
Award Instrument: Cooperative Agreement
Program Manager: Bogdan Mihaila
 bmihaila@nsf.gov
 (703) 292-8235
Office of Advanced Cyberinfrastructure (OAC), Directorate for Computer and Information Science and Engineering (CSE)
Start Date: September 1, 2018
End Date: August 31, 2024 (Estimated)
Total Intended Award Amount: $25,000,000.00
Total Awarded Amount to Date: $25,220,000.00
Funds Obligated to Date:
  • FY 2018 = $12,500,000.00
  • FY 2019 = $2,500,000.00
  • FY 2020 = $2,500,000.00
  • FY 2021 = $5,120,000.00
  • FY 2022 = $2,600,000.00
History of Investigator:
  • G J Peter Elmer (Principal Investigator)
  • Gordon Watts (Co-Principal Investigator)
  • Brian Bockelman (Co-Principal Investigator)
  • Brian Bockelman (Former Co-Principal Investigator)
Recipient Sponsored Research Office: Princeton University
1 NASSAU HALL
PRINCETON
NJ  US  08544-2001
(609)258-3090
Sponsor Congressional District: 12
Primary Place of Performance: Princeton University, NJ US 08544-2020
Primary Place of Performance Congressional District: 12
Unique Entity Identifier (UEI): NJ1YPQXQG7U5
Parent UEI:
NSF Program(s): Software Institutes,
OFFICE OF MULTIDISCIPLINARY AC,
CYBERINFRASTRUCTURE,
COMPUTATIONAL PHYSICS,
PHYSICS AT THE INFO FRONTIER
Primary Program Source:
  • 01001819DB NSF RESEARCH & RELATED ACTIVITIES
  • 01001920DB NSF RESEARCH & RELATED ACTIVITIES
  • 01002021DB NSF RESEARCH & RELATED ACTIVITIES
  • 01002122DB NSF RESEARCH & RELATED ACTIVITIES
  • 01002223DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 8211, 026Z, 7569, 062Z
Program Element Code(s): 800400, 125300, 723100, 724400, 755300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.049, 47.070

ABSTRACT

The quest to understand the fundamental building blocks of nature and their interactions is one of the oldest and most ambitious of human scientific endeavors. In elementary particle physics, the most successful theory to date is known as the "Standard Model" of particle physics. Facilities such as CERN's Large Hadron Collider (LHC) represent a huge step forward in this quest, as evidenced by the discovery of the Higgs boson. The next phase of this global scientific project will be the High-Luminosity LHC (HL-LHC), which will begin collecting data circa 2026 and continue into the 2030s. The primary science goal at the HL-LHC is to search for physics beyond the Standard Model. In the HL-LHC era, the ATLAS and CMS experiments will record 10 times as much data from 100 times as many collisions as were used to discover the Higgs boson. Significant R&D advances in the software for acquiring, managing, processing and analyzing HL-LHC data must therefore be achieved to realize the scientific potential of the upgraded accelerator and detectors and the planned physics program. In this context, the Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP) will play a leading role in meeting the software and computing challenges of the HL-LHC.

The Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP) addresses key elements of the international "Roadmap for HEP Software and Computing R&D for the 2020s" and implements the "Strategic Plan for a Scientific Software Innovation Institute (S2I2) for High Energy Physics" submitted to the NSF in December 2017. IRIS-HEP will advance R&D in three high-impact areas: (1) development of innovative algorithms for data reconstruction and triggering; (2) development of highly performant analysis systems that reduce "time-to-insight" and maximize the HL-LHC physics potential; and (3) development of data organization, management and access (DOMA) systems for the community's upcoming Exabyte era. IRIS-HEP will sustain investments in today's distributed high-throughput computing (DHTC) and build an integration path to deliver its R&D activities into the distributed production infrastructure. As an intellectual hub, IRIS-HEP will lead efforts to (1) build convergence research between HEP and the Cyberinfrastructure, Data Science and Computer Science communities to develop novel approaches to the compelling software and computing challenges of HL-LHC era HEP experiments, (2) engage broadly with researchers and students from U.S. universities and labs, emphasizing professional development and training, and (3) sustain HEP software and the underlying knowledge of the algorithms and their implementations over the two decades required. In addition to enabling the best possible HL-LHC science, IRIS-HEP will bring together the larger Cyberinfrastructure and HEP communities to address the complex issues at the intersection of Exascale high-throughput computing and Exabyte-scale datasets in ways broadly relevant to many research domains with emerging data-intensive needs. The education and training provided by the Institute in the form of summer schools and a fellows program will contribute to a highly qualified STEM workforce, as most students and even most postdocs move into the private sector, taking their skills with them.

This project advances the objectives of the National Strategic Computing Initiative (NSCI) and the objectives of "Harnessing the Data Revolution", one of the 10 Big Ideas for Future NSF Investments.

This project is supported by the Office of Advanced Cyberinfrastructure in the Directorate for Computer and Information Science and Engineering and the Division of Physics in the Directorate for Mathematical and Physical Sciences.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 57)
Malik, Sudhir and Meehan, Samuel and Lieret, Kilian and Oan Evans, Meirin and Villanueva, Michel H. and Katz, Daniel S. and Stewart, Graeme A. and Elmer, Peter and Aziz, Sizar and Bellis, Matthew and Bianchi, Riccardo Maria and Bianco, Gianluca, et al. "Software Training in HEP." Computing and Software for Big Science, v.5, 2021. https://doi.org/10.1007/s41781-021-00069-9
Lantz, S. and McDermott, K. and Reid, M. and Riley, D. and Wittich, P. and Berkman, S. and Cerati, G. and Kortelainen, M. and Hall, A. Reinsvold and Elmer, P. and Wang, B. and Giannini, L. and Krutelyov, V. and Masciovecchio, M. and Tadel, M., et al. "Speeding up particle track reconstruction using a parallel Kalman filter algorithm." Journal of Instrumentation, v.15, 2020. https://doi.org/10.1088/1748-0221/15/09/P09030
Krupa, Jeffrey and Lin, Kelvin and Acosta Flechas, Maria and Dinsmore, Jack and Duarte, Javier and Harris, Philip and Hauck, Scott and Holzman, Burt and Hsu, Shih-Chieh and Klijnsma, Thomas and Liu, Mia and Pedro, Kevin and Rankin, Dylan, et al. "GPU coprocessors as a service for deep learning inference in high energy physics." Machine Learning: Science and Technology, v.2, 2021. https://doi.org/10.1088/2632-2153/abec21
Kasieczka, Gregor and Plehn, Tilman and Butter, Anja and Cranmer, Kyle and Debnath, Dipsikha and Dillon, Barry M. and Fairbairn, Malcolm and Faroughy, Darius A. and Fedorko, Wojtek and Gay, Christophe and Gouskos, Loukas and Kamenik, Jernej Fesel, et al. "The Machine Learning landscape of top taggers." SciPost Physics, v.7, 2019. https://doi.org/10.21468/SciPostPhys.7.1.014
Kanwar, Gurtej and Albergo, Michael S. and Boyda, Denis and Cranmer, Kyle and Hackett, Daniel C. and Racanière, Sébastien and Rezende, Danilo Jimenez and Shanahan, Phiala E. "Equivariant Flow-Based Sampling for Lattice Gauge Theory." Physical Review Letters, v.125, 2020. https://doi.org/10.1103/PhysRevLett.125.121601
Ju, Xiangyang and Murnane, Daniel and Calafiura, Paolo and Choma, Nicholas and Conlon, Sean and Farrell, Steven and Xu, Yaoyuan and Spiropulu, Maria and Vlimant, Jean-Roch and Aurisano, Adam and Hewes, Jeremy and Cerati, Giuseppe and Gray, Lindsey, et al. "Performance of a geometric deep learning pipeline for HL-LHC particle tracking." The European Physical Journal C, v.81, 2021. https://doi.org/10.1140/epjc/s10052-021-09675-8
Wurthwein, Frank and Guiang, Jonathan and Arora, Aashay and Davila, Diego and Graham, John and Mishin, Dima and Hutton, Thomas and Sfiligoi, Igor and Newman, Harvey and Balcas, Justas and Lehman, Tom and Yang, Xi and Guok, Chin. "Managed Network Services for Exascale Data Movement Across Large Global Scientific Collaborations." Proceedings of the 2022 4th Annual Workshop on Extreme-scale Experiment-in-the-Loop Computing (XLOOP), 2022. https://doi.org/10.1109/XLOOP56614.2022.00008
Huh, C. and Proffitt, M. and Prosper, H. B. and Sekmen, S. and Sen, B. and Unel, G. and Watts, G. "Declarative interfaces for HEP data analysis: FuncADL and ADL/CutLang." Journal of Physics: Conference Series, v.2438, 2023. https://doi.org/10.1088/1742-6596/2438/1/012075
Heinrich, Lukas and Feickert, Matthew and Stark, Giordon and Cranmer, Kyle. "pyhf: pure-Python implementation of HistFactory statistical models." Journal of Open Source Software, v.6, 2021. https://doi.org/10.21105/joss.02823
Han, Ruize and Sim, Alex and Wu, Kesheng and Monga, Inder and Guok, Chin and Würthwein, Frank and Davila, Diego and Balcas, Justas and Newman, Harvey. "Access Trends of In-network Cache for Scientific Data." SNTA '22: Fifth International Workshop on Systems and Network Telemetry and Analytics, 2022. https://doi.org/10.1145/3526064.3534110
Guiang, Jonathan and Arora, Aashay and Davila, Diego and Graham, John and Mishin, Dima and Sfiligoi, Igor and Wuerthwein, Frank and Lehman, Tom and Yang, Xi and Guok, Chin and Newman, Harvey and Balcas, Justas and Hutton, Thomas. "Integrating End-to-End Exascale SDN into the LHC Data Distribution Cyberinfrastructure." Proceedings of PEARC '22: Practice and Experience in Advanced Research Computing, 2022. https://doi.org/10.1145/3491418.3535134

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Advanced software is critical to the advancement of particle physics, and transformative improvements are required to unlock the potential of current and future experiments. The IRIS-HEP software institute established itself as a key driver of software R&D to meet the challenges of the High Luminosity Large Hadron Collider (HL-LHC). IRIS-HEP succeeded in its broad effort to execute planned R&D from a community roadmap for the LHC analysis ecosystem, data movement, and computing services. Through cross-disciplinary collaborations between physicists and computer scientists, IRIS-HEP changed the way the HEP community approaches large-scale development of research software.

Over the course of this award, IRIS-HEP has advanced the state of the art in a number of areas:

Charged Particle Tracking Reconstruction - the institute has driven key tracking software developments for use in the ATLAS, CMS and LHCb experiments. “Tracking” allows the experiments to trace a particle’s path through a detector, and is at the heart of the most complex computational problems for the experiments. Improvements have been deployed during LHC Run 3 to validate early concepts, and others have been developed for use with the HL-LHC tracking systems.

Analysis Systems and Facilities - the institute has built an ecosystem of data analysis and data science tools for particle physics on top of the open-source scientific Python ecosystem. The Institute’s leadership and technologies unlocked the community’s ability to leverage that open-source ecosystem. This work is foundational to further enabling the use of AI techniques for particle physics data analysis. With its Scalable Systems Laboratory, IRIS-HEP has also initiated an evolving community discussion on how to build analysis facilities, on top of those tools and other services, that scale to the HL-LHC data volume and planned science program. To test these systems and engage the LHC experiments and community in concrete demonstrators towards those goals, IRIS-HEP has defined the Analysis Grand Challenge, a formal set of benchmarks and challenge activities that emulate HL-LHC physics analysis tasks.
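As a concrete illustration of the kind of Python-native tooling in this ecosystem, the sketch below uses pyhf (one of the packages cited in the publications above) to build a simple two-bin statistical model and compute an observed CLs value. The bin counts and uncertainties are invented placeholders, not results from any IRIS-HEP analysis.

    import pyhf

    # A minimal two-bin counting experiment: signal, background, and background uncertainty.
    # All numbers here are placeholders for illustration only.
    model = pyhf.simplemodels.uncorrelated_background(
        signal=[5.0, 10.0], bkg=[50.0, 60.0], bkg_uncertainty=[5.0, 12.0]
    )
    observations = [53.0, 65.0] + model.config.auxdata

    # CLs value for the nominal signal-strength hypothesis (mu = 1).
    cls_obs = pyhf.infer.hypotest(1.0, observations, model, test_stat="qtilde")
    print(f"Observed CLs = {float(cls_obs):.3f}")

The same model specification can be serialized to JSON and re-evaluated with different computational backends (NumPy, TensorFlow, PyTorch, JAX), which is one way these tools tap into the broader open-source ecosystem.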

Via its Data Organization, Management and Access (DOMA) work, IRIS-HEP started preparing the community for the scale and complexity of HL-LHC data. The DOMA activity has been key to driving the migration of the LHC community from proprietary and niche protocols for wide-area data movement to HTTP, which allows the community to tap into the larger global ecosystem; hundreds of petabytes a year are now moved by the LHC experiments using HTTP. DOMA developed the ServiceX event delivery service and researched other mechanisms to efficiently deliver “columns” of data at scale to analysis facilities. Through the Data Grand Challenge activity, DOMA helped conceptualize and execute Data Challenge 2021, a global effort toward demonstrating readiness for the HL-LHC’s scale.
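To illustrate what standard HTTP access enables on the analysis side, the sketch below reads a few columns from a remotely served ROOT file using uproot, a scikit-hep package from the ecosystem described above; the URL, tree name and branch names are hypothetical placeholders.

    import uproot  # scikit-hep package for reading ROOT files, including over HTTP(S)

    # Hypothetical location and schema; any HTTP(S)-served ROOT file works the same way.
    url = "https://example.org/datasets/events.root"
    with uproot.open(url) as datafile:
        tree = datafile["Events"]
        # Fetch only the columns ("branches") the analysis needs, for a limited number of events.
        arrays = tree.arrays(["Muon_pt", "Muon_eta"], entry_stop=10_000)

Because the transport is plain HTTP(S), the same access pattern works against generic web servers, content delivery networks and in-network caches, not only HEP-specific storage systems.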

The Institute has also made a significant impact by providing a structured path for the high energy physics community to develop software skills. It has been a prime contributor to a library of training material as part of the HEP Software Foundation and has run numerous training events on the basic software skills needed for modern science. To build a larger pool of researchers contributing to the software ecosystem, it has provided more advanced training through the annual CoDaS-HEP summer school and, through the IRIS-HEP Fellows program, has mentored more than 100 students as they took their first steps as contributors to community software.

OSG-LHC - the OSG-LHC team provided support for the OSG fabric of Distributed High Throughput Computing production services needed by the community during LHC Run 3 and built an evolutionary path for the community to the HL-LHC. The team successfully managed the transition from the no-longer-maintained Globus Toolkit to replacement services, leveraging the NSF’s investments in the HTCondor Software Suite. The OSG-LHC team also successfully migrated the community’s services to a newer, modern authorization technology (“tokens”) based on a capability security model.
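As a rough sketch of what the capability model looks like from a client's perspective, the snippet below presents a bearer token on an HTTPS request; the endpoint and token value are placeholders, and the scope shown in the comment follows the general style of WLCG/SciTokens profiles rather than any specific deployment.

    import requests

    # In a capability model the token itself states what the bearer may do,
    # e.g. a scope such as "storage.read:/store", rather than only who the user is.
    token = "eyJhbGciOi..."  # placeholder; real tokens come from a token issuer
    response = requests.get(
        "https://storage.example.org/store/data/file.root",  # hypothetical endpoint
        headers={"Authorization": f"Bearer {token}"},
        timeout=60,
    )
    response.raise_for_status()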

As an Intellectual Hub within the high energy physics community, IRIS-HEP has led and catalyzed community planning through more than two dozen “blueprint” meetings and community workshops. An external governance structure, the IRIS-HEP Steering Board, was established and provides stakeholder input on the priorities, execution and strategy of the Institute.

The work of the IRIS-HEP software institute continues as part of a renewal award (PHY-2323298), with an evolving emphasis on the paths to scaling and deploying all of its software systems to deliver HL-LHC science.



 


Last Modified: 02/20/2025
Modified by: G J Peter Elmer
