Award Abstract # 1840218
CICI: RDP: Open Science Chain (OSC) - A Novel Distributed Ledger-Based Framework for Protecting Integrity and Provenance of Research Data

NSF Org: Office of Advanced Cyberinfrastructure (OAC)
Recipient: UNIVERSITY OF CALIFORNIA, SAN DIEGO
Initial Amendment Date: August 17, 2018
Latest Amendment Date: August 17, 2018
Award Number: 1840218
Award Instrument: Standard Grant
Program Manager: Rob Beverly
OAC: Office of Advanced Cyberinfrastructure
CSE: Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2018
End Date: August 31, 2022 (Estimated)
Total Intended Award Amount: $818,433.00
Total Awarded Amount to Date: $818,433.00
Funds Obligated to Date: FY 2018 = $818,433.00
History of Investigator:
  • Subhashini Sivagnanam (Principal Investigator)
    sivagnan@sdsc.edu
  • Viswanath Nandigam (Co-Principal Investigator)
Recipient Sponsored Research Office: University of California-San Diego
9500 GILMAN DR
LA JOLLA
CA  US  92093-0021
(858)534-4896
Sponsor Congressional District: 50
Primary Place of Performance: University of California-San Diego
9500 Gilman Drive
La Jolla
CA  US  92093-0934
Primary Place of Performance Congressional District: 50
Unique Entity Identifier (UEI): UYTTZT6G9DT1
Parent UEI:
NSF Program(s): Cybersecurity Innovation
Primary Program Source: 01001819DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):
Program Element Code(s): 802700
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Data sharing is an integral component of scientific research and associated publications. Researchers can extend and build upon prior research when they can efficiently access, validate, and verify the data it references. Facilitating the future reuse of data in a secure and independently verifiable manner is critical to the advancement of research. Open Science Chain (OSC) allows a broad set of researchers to efficiently share metadata and easily verify the authenticity of their scientific datasets in a secure manner, while preserving provenance and lineage information.

OSC is a web-based cyberinfrastructure platform built using distributed ledger technologies. It allows researchers to provide metadata and verification information about their scientific datasets and to update this information in an auditable manner as the datasets change and evolve over time. Researchers are able to search, verify, and validate scientific datasets and link datasets to show lineage information. OSC features a web-based portal with user-friendly interfaces for metadata registration, data search, and verification. OSC has been designed and implemented using real-world scientific datasets from a diverse set of use cases, ensuring broad applicability across scientific domains. OSC enables sharing and verification of datasets among wider research communities, including large facilities, smaller labs, individual researchers, and students, while promoting good data documentation practices. OSC increases confidence in scientific results and promotes data sharing, which in turn increases productivity and promotes good science.
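The abstract does not specify what form the verification information takes. As a minimal sketch, assuming it is a SHA-256 digest of the dataset contents, the register-then-verify idea might look like the following. The record fields and the function names `dataset_digest`, `make_record`, and `verify` are hypothetical and are not part of OSC.

```python
import hashlib
from datetime import datetime, timezone


def dataset_digest(chunks):
    """Return the SHA-256 hex digest of a dataset supplied as byte chunks."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()


def make_record(name, chunks, previous=None):
    """Build a hypothetical metadata record; `previous` links to an earlier version."""
    return {
        "name": name,
        "sha256": dataset_digest(chunks),
        "registered_at": datetime.now(timezone.utc).isoformat(),
        # Lineage: point at the digest of the record this one was derived from.
        "derived_from": previous["sha256"] if previous else None,
    }


def verify(record, chunks):
    """Check that the data in hand matches the digest stored in the record."""
    return record["sha256"] == dataset_digest(chunks)


# Register a dataset, then register an updated version linked to the first.
v1 = make_record("survey.csv", [b"id,value\n", b"1,3.2\n"])
v2 = make_record("survey.csv", [b"id,value\n", b"1,3.2\n", b"2,4.1\n"], previous=v1)

print(verify(v1, [b"id,value\n1,3.2\n"]))   # same bytes, different chunking
print(verify(v1, [b"id,value\n1,3.3\n"]))   # one character changed
```

Only the digest and descriptive metadata appear in the record, which matches the abstract's description of sharing metadata and verification information rather than the datasets themselves.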

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Nandigam, V and Lin, K and Shantharam, M and Sakai, S and Sivagnanam, S "Research Workflows - Towards reproducible science via detailed provenance tracking in Open Science Chain" PEARC '20: Practice and Experience in Advanced Research Computing, 2020 https://doi.org/10.1145/3311790.3399619
Shantharam, Manu and Lin, Kai and Sakai, Scott and Sivagnanam, Subhashini "Integrity Protection for Research Artifacts using Open Science Chain's Command Line Utility" PEARC '21: Practice and Experience in Advanced Research Computing, 2021 https://doi.org/10.1145/3437359.3465587
Shantharam, Manu and Sakai, Scott and Lin, Kai and Sivagnanam, Subhashini "Towards building a Fault Tolerant and Secure Open Science Chain" Gateways2020, 2020 https://doi.org/10.17605/OSF.IO/MJHK8
Sivagnanam, Subhashini and Nandigam, Viswanath and Lin, Kai "Introducing the Open Science Chain: Protecting Integrity and Provenance of Research Data" Proceedings of the Practice and Experience in Advanced Research Computing on Rise of the Machines, 2019 https://doi.org/10.1145/3332186.3332203

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project provided a unique blockchain-based solution for managing the integrity and provenance of research artifacts used in collaborative research. The Open Science Chain (OSC) project has developed a cyberinfrastructure solution that uses consortium blockchain technologies to store and manage the integrity and metadata provenance information of research artifacts. An easy-to-use portal and a Python-based command line utility were developed to lower the barrier to using the blockchain solution.

The OSC portal allows researchers to share the metadata provenance of research datasets, verify other users' data, and provide feedback to highlight potential issues with the data or metadata of published research. Researchers can link external repositories (such as GitHub or GitLab) to create a detailed workflow of their scientific experiment, linking multiple sources of data and computational code used in their published results. The entire workflow is stored in the blockchain, which provides a timestamped, immutable version of the metadata provenance. The command line utility enables automated and seamless communication between the researcher's familiar working environments and OSC, allowing researchers to use OSC from their own laptops and from other computing resources such as remote servers or HPC clusters. As a result, researchers can integrate with OSC to manage research artifacts during various stages of their computational and data workflow.
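The report does not describe the ledger's internals. As a rough illustration of why a timestamped, hash-linked record resists silent modification, the sketch below chains provenance entries so that each one commits to the hash of its predecessor. It is a single-process toy, not a consortium blockchain, and the `ProvenanceLog` class, its field names, and the sample artifacts are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(body):
    """Hash an entry body using a canonical (sorted-key) JSON encoding."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class ProvenanceLog:
    """Append-only list of entries, each linked to the one before it."""

    FIELDS = ("artifact", "digest", "timestamp", "prev")

    def __init__(self):
        self.entries = []

    def append(self, artifact, digest, timestamp):
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        body = {"artifact": artifact, "digest": digest,
                "timestamp": timestamp, "prev": prev}
        self.entries.append({**body, "hash": entry_hash(body)})

    def is_intact(self):
        """Recompute every hash and check every link back to the start."""
        prev = GENESIS
        for entry in self.entries:
            body = {key: entry[key] for key in self.FIELDS}
            if entry["prev"] != prev or entry["hash"] != entry_hash(body):
                return False
            prev = entry["hash"]
        return True


# A two-step workflow: an input dataset, then the code that processed it.
log = ProvenanceLog()
log.append("input.csv", "digest-of-input", "2022-01-01T00:00:00Z")
log.append("analysis.py", "digest-of-code", "2022-01-02T00:00:00Z")
print(log.is_intact())
```

Editing any stored field of an earlier entry changes its recomputed hash, so `is_intact()` returns `False` unless every later entry is rewritten as well. A distributed ledger adds the further property that no single party can perform that rewrite on its own.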

OSC spurs data reuse and helps address issues related to data sharing and the reproducibility of published research results. Making OSC available to the research community has an impact on multiple disciplines where data integrity and the capture of detailed provenance are important.


Last Modified: 12/06/2022
Modified by: Subhashini Sivagnanam

Please report errors in award information by writing to: awardsearch@nsf.gov.
