
NSF Org: OAC Office of Advanced Cyberinfrastructure (OAC)
Recipient: Syracuse University
Initial Amendment Date: August 13, 2014
Latest Amendment Date: August 16, 2018
Award Number: 1443047
Award Instrument: Continuing Grant
Program Manager: Amy Walton, awalton@nsf.gov, (703) 292-4538, OAC Office of Advanced Cyberinfrastructure (OAC), CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2014
End Date: September 30, 2019 (Estimated)
Total Intended Award Amount: $900,000.00
Total Awarded Amount to Date: $1,078,712.00
Funds Obligated to Date: FY 2015 = $150,000.00; FY 2016 = $150,000.00; FY 2018 = $178,712.00
History of Investigator: Duncan A. Brown (Principal Investigator)
Recipient Sponsored Research Office: 900 S Crouse Ave, Syracuse, NY 13244-4407, US, (315) 443-2807
Sponsor Congressional District:
Primary Place of Performance: Syracuse, NY 13244-1200, US
Primary Place of Performance Congressional District:
Unique Entity Identifier (UEI):
Parent UEI:
NSF Program(s): PHYSICS AT THE INFO FRONTIER, Data Cyberinfrastructure
Primary Program Source: 01001617DB NSF RESEARCH & RELATED ACTIVIT; 01001516DB NSF RESEARCH & RELATED ACTIVIT; 01001415DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):
Program Element Code(s):
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Analysis and management of large data sets are vital for progress in the data-intensive realm of scientific research and education. Scientists are producing, analyzing, storing, and retrieving massive amounts of data. The anticipated growth in the analysis of scientific data raises complex issues of stewardship, curation, and long-term access. Scientific data is tracked and described by metadata. This award will fund the design, development, and deployment of metadata-aware workflows to enable the management of large data sets produced by scientific analysis. Scientific workflows for data analysis are used by a broad community of scientists in fields including astronomy, biology, ecology, and physics. Making workflows metadata-aware is an important step towards making scientific results easier to share and reuse, and towards supporting reproducibility. This project will pilot new workflow tools using data from the Laser Interferometer Gravitational-wave Observatory (LIGO), a data-intensive project at the frontiers of astrophysics. The goal of LIGO is to use gravitational waves---ripples in the fabric of spacetime---to explore the physics of black holes and understand the nature of gravity.
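To make the idea of a metadata-aware workflow concrete, here is a minimal sketch using the Pegasus 5.x Python API. It is illustrative only, not project code: the executable path, site name, file names, and metadata keys are assumptions, and the real LIGO pipelines are far larger.

# Minimal sketch: attach searchable metadata to a workflow, a job, and the
# job's output file with the Pegasus Python API. Names and paths are
# placeholders, not values used by the project.
from Pegasus.api import Workflow, Job, File, Transformation, TransformationCatalog

wf = Workflow("metadata-aware-analysis")
wf.add_metadata(project="ligo-dibbs-pilot")          # workflow-level metadata

# Register the executable the job will run (site and path are assumed).
tc = TransformationCatalog()
filter_exe = Transformation("matched_filter", site="local",
                            pfn="/usr/bin/matched_filter", is_stageable=False)
tc.add_transformations(filter_exe)
wf.add_transformation_catalog(tc)

# The output file carries metadata so downstream tools can discover it later.
triggers = File("triggers.hdf").add_metadata(ifo="H1", gps_start="1126259446")

job = (Job(filter_exe)
       .add_args("--ifo", "H1", "--gps-start", "1126259446")
       .add_outputs(triggers)
       .add_metadata(template_bank="bank0"))         # job-level metadata

wf.add_jobs(job)
wf.write("workflow.yml")   # abstract workflow, ready for pegasus-plan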
Efficient methods for accessing and mining the large data sets generated by LIGO's diverse gravitational-wave searches are critical to the overall success of gravitational-wave physics and astronomy. Providing these capabilities will maximize existing NSF investments in LIGO, support new modes of collaboration within the LIGO Scientific Collaboration, and better enable scientists to explain their results to a wider community, including the critical issue of data and analysis provenance for LIGO's first detections. The interdisciplinary collaboration involved in this project brings together computational and informatics theories and methods to solve data and workflow management problems in gravitational-wave physics. The research generated by this project will make significant contributions to theory and methods in the identification of science requirements, metadata modeling, eScience workflow management, data provenance, reproducibility, and data discovery and analysis. The LIGO scientists participating in this project will ensure that the needs of the community are met. The cyberinfrastructure and data-management scientists will ensure that the software products are well-designed and that the work funded by this award is useful to a broader community.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Large-scale scientific workflows are essential to LIGO's discoveries. To detect gravitational waves, LIGO data must be filtered through hundreds of thousands of signal models. This is repeated many times using simulated signals to measure the search's efficiency and to diagnose and fix problems with the detectors. Searches are also run multiple times to tune the scientific parameters for maximum sensitivity. Analyses are run by teams of scientists in distributed locations and are executed using heterogeneous computing environments. LIGO was an early adopter of the Pegasus Workflow Management System (WMS) and HTCondor for its binary black hole searches. This project built on the widely-used Pegasus WMS to address the problems encountered in large-scale, distributed scientific analysis. Advances made possible by this project allowed Pegasus to manage LIGO workflow runs that were instrumental in the first direct detection of gravitational waves from colliding black holes, and in the subsequent detection of colliding neutron stars observed in both the gravitational-wave and visible-light spectra.
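As an illustration of this pattern (a sketch under assumptions, not the collaboration's actual search code), the snippet below uses the Pegasus Python API to fan out many independent filtering jobs over template-bank splits and hand the planned workflow to HTCondor; the site names, executable path, and input frame file are placeholders.

from Pegasus.api import Workflow, Job, File, Transformation, TransformationCatalog, ReplicaCatalog

wf = Workflow("cbc-search")

# Executable to run on the HTCondor pool (path and site name are placeholders).
tc = TransformationCatalog()
filter_exe = Transformation("matched_filter", site="condorpool",
                            pfn="/usr/bin/matched_filter", is_stageable=False)
tc.add_transformations(filter_exe)
wf.add_transformation_catalog(tc)

# Tell Pegasus where the input strain data lives (illustrative file name).
strain = File("H-H1_STRAIN-1126256640-4096.gwf")
rc = ReplicaCatalog()
rc.add_replica("local", strain, "file:///data/H-H1_STRAIN-1126256640-4096.gwf")
wf.add_replica_catalog(rc)

# One job per template-bank split; real searches fan out to many thousands.
for split in range(256):
    wf.add_jobs(Job(filter_exe)
                .add_args("--bank-split", str(split))
                .add_inputs(strain)
                .add_outputs(File(f"triggers-{split:04d}.hdf")))

# Plan the abstract workflow into an HTCondor DAG and submit it to DAGMan.
wf.plan(sites=["condorpool"], output_sites=["local"], submit=True)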
The project team collaborated to develop new data management techniques in Pegasus and improved data access for LIGO workflows. This allowed LIGO to seamlessly execute large workflows across the LIGO Data Grid and other NSF-funded, nationwide computing infrastructures, including the Open Science Grid (OSG) and XSEDE. We improved the Pegasus workflow monitoring dashboard for multi-user access and improved visualization of workflow status and progress. These new capabilities proved to be important tools for both system administrators and the scientists running the workflows. The tools we developed provided valuable insight into the monitoring and error analysis of the workflows executed to detect gravitational waves. A new web dashboard built on top of distributed data stores enables scientists and system administrators to get a holistic, global overview of the compact-binary analysis workflows being run, monitor computational resource use, and identify trends and errors as they occur. With these DIBBs-related improvements, LIGO reduced the turnaround of its offline binary search analyses from many weeks to only a few days. This speed increase proved essential for the LIGO Scientific Collaboration to confirm the discoveries of GW150914 and GW170817, and to generate and distribute low-latency alerts to the astronomy community.
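As a rough illustration of the kind of query such monitoring tools run (a sketch, not the dashboard's implementation), the snippet below tallies the most recent recorded state of each job from a Pegasus workflow (stampede) monitoring database; the database path is a placeholder and the assumed jobstate table layout should be checked against the Pegasus version in use.

import sqlite3
from collections import Counter

# Path to the stampede database in a workflow's submit directory (placeholder).
db_path = "/path/to/submit-dir/workflow.db"

counts = Counter()
with sqlite3.connect(db_path) as conn:
    # Latest recorded state for each job instance in the run (assumed schema:
    # a jobstate table with job_instance_id, state, and timestamp columns).
    rows = conn.execute("""
        SELECT js.state
        FROM jobstate AS js
        JOIN (SELECT job_instance_id, MAX(timestamp) AS ts
              FROM jobstate GROUP BY job_instance_id) AS latest
          ON js.job_instance_id = latest.job_instance_id
         AND js.timestamp = latest.ts
    """)
    counts.update(state for (state,) in rows)

# Print a simple summary, similar in spirit to pegasus-status output.
for state, n in counts.most_common():
    print(f"{state:20s} {n}")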
Last Modified: 01/30/2020
Modified by: Duncan A Brown