Award Abstract # 1639706
EarthCube Building Blocks: Collaborative Proposal: GeoTrust: Improving Sharing and Reproducibility of Geoscience Applications

NSF Org: RISE
Integrative and Collaborative Education and Research (ICER)
Recipient: UNIVERSITY OF MEMPHIS
Initial Amendment Date: September 16, 2016
Latest Amendment Date: September 16, 2016
Award Number: 1639706
Award Instrument: Standard Grant
Program Manager: Eva Zanzerkia
GEO
 Directorate for Geosciences
Start Date: September 1, 2016
End Date: August 31, 2018 (Estimated)
Total Intended Award Amount: $80,000.00
Total Awarded Amount to Date: $80,000.00
Funds Obligated to Date: FY 2016 = $80,000.00
History of Investigator:
  • Eunseo Choi (Principal Investigator)
    echoi2@memphis.edu
Recipient Sponsored Research Office: University of Memphis
115 JOHN WILDER TOWER
MEMPHIS
TN  US  38152-0001
(901)678-3251
Sponsor Congressional District: 09
Primary Place of Performance: University of Memphis
315 Administration Building
Memphis
TN  US  38152-3370
Primary Place of Performance Congressional District: 09
Unique Entity Identifier (UEI): F2VSMAKDH8Z7
Parent UEI:
NSF Program(s): EarthCube
Primary Program Source: 01001617DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 9150, 7433
Program Element Code(s): 807400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.050

ABSTRACT

Scientific reproducibility -- the ability to independently verify the work of other scientists -- continues to be a critical barrier to achieving the vision of cross-disciplinary science. Federal agencies and publishers increasingly mandate and incentivize scientists to, at a minimum, establish computational reproducibility of scientific experiments. To comply, scientists must connect descriptions of scientific experiments in scholarly publications with the underlying data and code used to produce the published results and findings. In practice, however, computational reproducibility is hard to achieve because it entails isolating the necessary and sufficient computational artifacts and then preserving those artifacts in a standard way for later re-execution. Both isolation and preservation present challenges, in large part due to the complexity of existing software and systems as well as the implicit dependencies, resource distribution, and shifting compatibility of systems that evolve over time -- all of which conspire to break the reproducibility of an experiment. The goal of the GeoTrust project is to understand the research lifecycle of scientific experiments from conception to publication and establish a framework that will improve their reproducibility.

GeoTrust will develop sandboxing-based systems and tools that help scientists effectively isolate computational artifacts associated with an experiment, use languages and semantics to preserve artifacts, and re-execute or reproduce experiments by deploying the artifacts and changing datasets, algorithms, models, environments, etc. This reproducibility framework will be adopted by and integrated within community infrastructures of three geoscience sub-disciplines, viz., Hydrology, Solid Earth, and Space Science. Using cross-disciplinary science use cases from these sub-disciplines, and engaging independent evaluators, we will assess the effectiveness of the framework in achieving reproducibility of computational experiments. Finally, verified results will be associated with “stamps of reproducibility”, establishing community recognition of computational experiments. The framework will be developed as an EarthCube capability, with software developed and released as per EarthCube requirements. Early adopters across other geoscience sub-disciplines will be continually sought.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

GeoTrust aims to make it easy for geoscientists to make their computational experiments and associated results reproducible. Reproducing numerical models has long been known to be surprisingly and frustratingly elusive in computational geoscience and in the computational sciences in general. This project aims to develop a reproducible research framework. It has introduced the concept of a “sciunit”, a self-contained metadata package that provides a complete description of all data elements associated with an instance (run) of a computational experiment, including input files, parameter files, the model executable and any associated libraries, and all output files (results) produced. A sciunit is essentially a single compressed file that can be shared easily and conveniently. The feature most attractive to end users is that the GeoTrust framework automates most of the process of creating and sharing sciunits.
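The idea of a self-contained package can be illustrated with a minimal Python sketch. This is not the GeoTrust or sciunit implementation, which also captures the executable and its library dependencies automatically; it only assumes that all files of one run already sit in a single directory. The function names `build_bundle` and `verify_bundle` and the `MANIFEST.json` file name are hypothetical, chosen for this illustration.

```python
import hashlib
import io
import json
import tarfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_bundle(run_dir: Path, bundle_path: Path) -> dict:
    """Pack every file under run_dir into one gzip-compressed tar archive.

    A manifest (hypothetical name: MANIFEST.json) listing each file's
    relative path, size, and checksum is stored alongside the files.
    bundle_path is assumed to lie outside run_dir.
    """
    files = sorted(p for p in run_dir.rglob("*") if p.is_file())
    manifest = {
        "files": [
            {
                "path": p.relative_to(run_dir).as_posix(),
                "bytes": p.stat().st_size,
                "sha256": sha256_of(p),
            }
            for p in files
        ]
    }
    payload = json.dumps(manifest, indent=2).encode("utf-8")
    with tarfile.open(bundle_path, "w:gz") as tar:
        # Write the manifest first so it can be read without unpacking the rest.
        info = tarfile.TarInfo("MANIFEST.json")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
        for p in files:
            tar.add(p, arcname=p.relative_to(run_dir).as_posix())
    return manifest


def verify_bundle(bundle_path: Path) -> bool:
    """Check that every file in the archive matches its manifest checksum."""
    with tarfile.open(bundle_path, "r:gz") as tar:
        manifest = json.load(tar.extractfile("MANIFEST.json"))
        for entry in manifest["files"]:
            data = tar.extractfile(entry["path"]).read()
            if hashlib.sha256(data).hexdigest() != entry["sha256"]:
                return False
    return True
```

Under these assumptions, a recipient of the single archive can call `verify_bundle` to confirm that the inputs, parameters, and outputs are the ones recorded when the run was packaged.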

 

Choi and the graduate research assistants (PI and GRAs hereafter) at the University of Memphis created sciunits of PyLith models. PyLith (https://geodynamics.org/cig/software/pylith/) is a popular open-source finite element code for short-term crustal dynamics developed by the Computational Infrastructure for Geodynamics (CIG). PyLith is often used to compute the regional displacement and stress fields due to slip on a fault plane, although it has many other applications. As of March 2016, the code had been used or cited in 48 peer-reviewed publications since 2008 and had been downloaded hundreds of times all over the world. The PI and GRAs assembled computer parts to build a workstation dedicated to the project. They installed the command-line tools of the GeoTrust framework and ran PyLith models as needed in their respective research projects, which are independent of this GeoTrust project. This approach ensures that the created sciunits are real-world cases that were not designed to be “easy” for the framework to handle. They transferred the sciunits to the HydroShare web site so that they became searchable and shareable. The uploaded sciunits were verified to be runnable on the HydroShare server, so the main developers of GeoTrust could use them for validation, debugging, and further development of the framework.

The PI and GRAs presented the GeoTrust framework at various professional workshops for computational geophysicists and analog modelers between 2017 and 2018 and at the annual American Geophysical Union Fall Meeting in 2017. In February 2019, the PIs Choi and Malik presented GeoTrust in a webinar organized by CIG. The framework has generally been perceived as a promising solution to the challenging problem of reproducing numerical models.

This project greatly helped train the graduate research assistants. They are Ph.D. students who use numerical modeling as their research methodology and were already aware of the difficulty of reproducing numerical models. The project provided them with a systematic solution to this problem. They also learned how to build a computer by assembling individual parts and acquired a greater understanding of and familiarity with computer hardware. This knowledge is comparable to a lab researcher’s learning how to build a piece of experimental equipment.

 


Last Modified: 03/28/2019
Modified by: Eunseo Choi

Please report errors in award information by writing to: awardsearch@nsf.gov.
