Award Abstract # 2209629
Collaborative Research: Elements: TRAnsparency CErtified (TRACE): Trusting Computational Research Without Repeating It

NSF Org: OAC
Office of Advanced Cyberinfrastructure (OAC)
Recipient: CORNELL UNIVERSITY
Initial Amendment Date: July 8, 2022
Latest Amendment Date: March 25, 2025
Award Number: 2209629
Award Instrument: Standard Grant
Program Manager: Sylvia Spengler
sspengle@nsf.gov
 (703)292-7347
OAC
 Office of Advanced Cyberinfrastructure (OAC)
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: July 15, 2022
End Date: December 31, 2025 (Estimated)
Total Intended Award Amount: $150,000.00
Total Awarded Amount to Date: $182,000.00
Funds Obligated to Date: FY 2022 = $150,000.00
FY 2024 = $16,000.00
FY 2025 = $16,000.00
History of Investigator:
  • Lars Vilhuber (Principal Investigator)
    lars.vilhuber@cornell.edu
Recipient Sponsored Research Office: Cornell University
341 PINE TREE RD
ITHACA
NY  US  14850-2820
(607)255-5014
Sponsor Congressional District: 19
Primary Place of Performance: Cornell University
373 Pine Tree Road
Ithaca
NY  US  14850-2820
Primary Place of Performance Congressional District: 19
Unique Entity Identifier (UEI): G56PUALJ3KT5
Parent UEI:
NSF Program(s): Data Cyberinfrastructure, Software Institutes
Primary Program Source: 01002425DB NSF RESEARCH & RELATED ACTIVIT
01002223DB NSF RESEARCH & RELATED ACTIVIT
01002526DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 9251, 8004, 077Z, 7923
Program Element Code(s): 772600, 800400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070, 47.075

ABSTRACT

Research communities across the natural and social sciences are increasingly concerned about the transparency and reproducibility of results obtained by computational means. Calls for increased transparency can be found in the policies of peer-reviewed journals and in the processing pipelines employed in the creation of research data products made available through science gateways, data portals, and statistical agencies. These communities recognize that the integrity of published results and data products is uncertain when it is not possible to trace their lineage or validate their production. Verifying the transparency or reproducibility of computational artifacts (by repeating computations and comparing results) is expensive, time-consuming, and difficult, and may be infeasible if the research products rely on resources that are subject to legitimate restrictions such as the use of sensitive or proprietary data; streaming, transient, or ephemeral data; and large-scale or specialized computational resources available only to approved or authorized users. The TRACE project is addressing this problem through an approach called certified transparency: a trustworthy record of computations signed by the systems within which they were performed. Using TRACE, system owners and operators certify the original execution of a computational workflow that produces findings or data products. By using a TRACE-enabled system, researchers produce transparent computational artifacts that no longer require verification, reducing the burden on journal editors and reviewers seeking to ensure reproducibility and transparency of computational results. TRACE presents an innovative and efficient approach to ensuring the transparency of research that uses computational methods, is consistent with the vision outlined by the National Academies, and enables evidence-based policymaking based on transparent and trustworthy science.

The central goal of the TRACE project is the development, validation, and implementation of a technical model of certified transparency. This includes a set of infrastructure elements that can be employed by system owners to (1) declare the dimensions of computational transparency supported by their platforms; (2) certify that a specific computational workflow was executed on the platform; and (3) bundle artifacts, records of their execution, and technical metadata about their contents, and certify the bundle for dissemination. The first phase of the project focuses on the development of a conceptual model and technical specification that can be used to certify the description of a system, termed a Transparency-Certified System (TRACE system), and the aggregation of artifacts along with records of their execution, termed Transparency-Certified Research Objects (TROs). The second phase focuses on the development of a toolkit of reusable software components implementing the TRACE model and approach. To demonstrate certified transparency, the toolkit is used to TRACE-enable existing platforms including Whole Tale, SKOPE, and the SLURM workload manager. These TRACE-enabled systems produce certified TROs that can be trusted, so the underlying computations do not need to be repeated or re-executed to verify that results were obtained as claimed.
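
As an illustration only, the three steps named above (declare, certify, bundle) might be sketched as follows in Python. This is not the TRACE specification or toolkit: the function names, the capability labels, the record layout, and the signing key are all hypothetical, and an HMAC over a JSON record stands in for whatever signature scheme a real TRACE system uses (a production system would more plausibly rely on a public-key signature so that third parties can verify without holding a secret).

```python
import hashlib
import hmac
import json

# Hypothetical secret held by the system operator (assumption for this sketch).
SYSTEM_KEY = b"demo-key-not-for-production"


def declare_system(name, capabilities):
    """Step 1: declare which transparency dimensions the platform supports."""
    return {"system": name, "capabilities": sorted(capabilities)}


def digest(content):
    """Return the SHA-256 hex digest of an artifact's bytes."""
    return hashlib.sha256(content).hexdigest()


def record_execution(system, command, inputs, outputs):
    """Step 2: record that a workflow ran here, with digests of its artifacts."""
    return {
        "system": system,
        "command": command,
        "inputs": {path: digest(data) for path, data in sorted(inputs.items())},
        "outputs": {path: digest(data) for path, data in sorted(outputs.items())},
    }


def _payload(record):
    # Serialize deterministically so the same record always signs the same bytes.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode()


def certify(record, key=SYSTEM_KEY):
    """Step 3: bundle the record with a signature made by the system."""
    signature = hmac.new(key, _payload(record), hashlib.sha256).hexdigest()
    return {"record": record, "signature": signature}


def verify(bundle, key=SYSTEM_KEY):
    """Check that the bundled record is unchanged since it was certified."""
    expected = hmac.new(key, _payload(bundle["record"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, bundle["signature"])
```

In this sketch a reviewer who trusts the signing system checks the signature and the artifact digests instead of re-running the workflow; any change to a recorded digest after certification causes `verify` to fail.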

This award by the Office of Advanced Cyberinfrastructure is jointly supported by the Division of Social and Economic Sciences within the Directorate for Social, Behavioral and Economic Sciences; and by the Division of Information and Intelligent Systems within the Directorate for Computer and Information Science and Engineering.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
