
NSF Org: |
OAC Office of Advanced Cyberinfrastructure (OAC) |
Recipient: |
|
Initial Amendment Date: | May 7, 2015 |
Latest Amendment Date: | May 8, 2018 |
Award Number: | 1450377 |
Award Instrument: | Continuing Grant |
Program Manager: |
Bogdan Mihaila bmihaila@nsf.gov (703)292-8235 OAC Office of Advanced Cyberinfrastructure (OAC) CSE Directorate for Computer and Information Science and Engineering |
Start Date: | May 1, 2015 |
End Date: | April 30, 2020 (Estimated) |
Total Intended Award Amount: | $1,145,564.00 |
Total Awarded Amount to Date: | $1,145,564.00 |
Funds Obligated to Date: |
FY 2016 = $65,000.00 FY 2017 = $65,000.00 FY 2018 = $65,000.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: |
1 NASSAU HALL PRINCETON NJ US 08544-2001 (609)258-3090 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
Jadwin Hall Princeton NJ US 08544-2020 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): |
OFFICE OF MULTIDISCIPLINARY AC, COMPUTATIONAL PHYSICS, Software Institutes |
Primary Program Source: |
01001617DB NSF RESEARCH & RELATED ACTIVIT 01001718DB NSF RESEARCH & RELATED ACTIVIT 01001819DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Advanced software plays a fundamental role for large scientific projects. The primary goal of DIANA/HEP (Data Intensive ANAlysis for High Energy Physics) is developing state-of-the-art tools for experiments that acquire, reduce, and analyze petabytes of data. Improving performance, interoperability, and collaborative tools through modifications and additions to packages broadly used by the community will allow users to more fully exploit the data being acquired at CERN's Large Hadron Collider (LHC) and other facilities. These experiments are addressing questions at the heart of physics: What are the underlying constituents of matter? And how do they interact? With the discovery of the Higgs boson in 2012, the Standard Model of particle physics is complete. It provides an excellent description of known particles and forces. However, the most interesting questions remain open: What is the dark matter which pervades the universe? Does space-time have additional symmetries or extend beyond the 3 spatial dimensions we know? What is the mechanism stabilizing the Higgs boson mass from enormous quantum corrections? The next generation of experiments will collect exabyte-scale data samples to provide answers. Analyzing this data will require new and better tools.
First, the project will provide the CPU and IO performance needed to reduce the iteration time that is so crucial for exploring new ideas. It will develop software to effectively exploit emerging many- and multi-core hardware. It will establish infrastructure for a higher level of collaborative analysis, building on the successful patterns used for the Higgs boson discovery and enabling deeper communication between the theoretical and experimental communities. DIANA's products will sit in the ROOT framework, already used by the HEP community of more than 10,000 particle and nuclear physicists. By improving interoperability with the larger scientific software ecosystem, DIANA will incorporate best practices and algorithms from other disciplines into HEP. Similarly, the project will make its computing insights, tools, and novel ideas related to collaborative analysis, standards for data preservation, and best practices for treating software as a research product available to the larger scientific community. Finally, to improve the quality of the next generation of software engineers in HEP, DIANA will host an annual workshop on analysis tools and establish a fellowship program.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Advanced software plays a fundamental role in large scientific projects, from the acquisition of data through to the final results of subsequent processing and analysis. It is the glue that enables large-scale collaboration, particularly for teams of researchers working together to exploit accelerators, telescopes, and other large scientific instruments. Building the requisite software is technically challenging because the computing technologies (processors, storage, networks) are evolving and data volumes are increasing rapidly, requiring ever more sophisticated data analysis methods. The Data-Intensive Analysis for High Energy Physics (DIANA/HEP) project brought together a team of particle physicists and computer scientists, in collaboration with an international team of researchers, to advance the state of the art for key data analysis software tools used by the particle and nuclear physics communities. The project focused on building sustainable software for these communities, in particular on improvements in computing performance, interoperability of particle physics domain software with the larger data science ecosystem, and development of tools for collaborative analysis.
The DIANA/HEP project has been a catalyst for opening up the scientific Python ecosystem to the particle physics community. Tools such as Uproot provided key interoperability between the current ROOT-based ecosystem and the scientific Python ecosystem. The development of Awkward Array adds key performance improvements in the Python ecosystem for the non-rectilinear data typical in particle physics. In addition, we have continued to make core performance contributions to the unique C++ environment provided by the ROOT software, and developed tools for the Python ecosystem (e.g. histogram and Lorentz vector libraries) that are needed for particle physics but are of general applicability. A general umbrella package (scikit-hep) for these and other Pythonic tools for particle physics was created and has been widely adopted by the community.
An additional key outcome of the DIANA/HEP project has been the demonstration of the viability of a particle physics analysis tools ecosystem that extends smoothly into the Python ecosystem, along with concepts of "columnar data analysis" that will be used in both the Python ecosystem and the ROOT ecosystem. This demonstration was essential to the S2I2-HEP planning process and to the concepts that emerged from it for "analysis systems" that provide very short "time to insight". These concepts, built on DIANA/HEP research outcomes, are being carried forward in the "analysis systems" focus area within the IRIS-HEP software institute funded by NSF (OAC-1836650).
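As a rough illustration of what "columnar" handling of non-rectilinear data can mean, the sketch below stores events with a variable number of particles as one flat list of values plus a list of offsets marking event boundaries. It is written in plain Python and is not the Awkward Array or Uproot API; the function names (`to_columnar`, `count_per_event`, `max_per_event`) and the momentum values are hypothetical, chosen only for this example.

```python
# Illustrative sketch only: hypothetical helper names and made-up values,
# not the Awkward Array or Uproot API.
from itertools import accumulate

# Row-wise view: each event holds a variable number of particle momenta.
events = [[50.2, 31.7], [], [12.4, 88.0, 45.1]]


def to_columnar(rows):
    """Flatten variable-length rows into a content list plus offsets.

    Event i occupies content[offsets[i]:offsets[i + 1]].
    """
    content = [value for row in rows for value in row]
    offsets = [0] + list(accumulate(len(row) for row in rows))
    return content, offsets


def count_per_event(offsets):
    """Number of particles in each event, computed from offsets alone."""
    return [stop - start for start, stop in zip(offsets, offsets[1:])]


def max_per_event(content, offsets, empty=None):
    """Largest value in each event; `empty` is returned for empty events."""
    return [
        max(content[start:stop]) if stop > start else empty
        for start, stop in zip(offsets, offsets[1:])
    ]


content, offsets = to_columnar(events)

# A selection is applied once over the flat column, not event by event.
passes_cut = [value > 40.0 for value in content]
n_passing = [
    sum(passes_cut[start:stop]) for start, stop in zip(offsets, offsets[1:])
]

print(content)                          # flat column of all particle values
print(offsets)                          # event boundaries
print(count_per_event(offsets))         # particles per event
print(max_per_event(content, offsets))  # leading value per event
print(n_passing)                        # particles above the cut per event
```

The design point is that per-particle operations run over a single contiguous column, while the event structure is carried separately in the offsets; array libraries can apply the same idea with vectorized operations instead of Python loops.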
Last Modified: 01/26/2021
Modified by: G.J. Peter Elmer