Award Abstract # 1846418
CAREER: Advanced Containers for Reproducibility in Computational and Data Science

NSF Org: CNS (Division of Computer and Network Systems)
Recipient: DEPAUL UNIVERSITY
Initial Amendment Date: May 2, 2019
Latest Amendment Date: July 14, 2024
Award Number: 1846418
Award Instrument: Continuing Grant
Program Manager: Marilyn McClure
mmcclure@nsf.gov
(703) 292-5197
CNS: Division of Computer and Network Systems
CSE: Directorate for Computer and Information Science and Engineering
Start Date: June 1, 2019
End Date: March 31, 2025 (Estimated)
Total Intended Award Amount: $498,889.00
Total Awarded Amount to Date: $514,589.00
Funds Obligated to Date: FY 2019 = $95,478.00
FY 2020 = $96,964.00
FY 2021 = $101,179.00
FY 2022 = $94,476.00
FY 2023 = $0.00
FY 2024 = $15,700.00
History of Investigator:
  • Tanu Malik (Principal Investigator)
    tanu.malik@depaul.edu
Recipient Sponsored Research Office: DePaul University
1 E JACKSON BLVD
CHICAGO, IL 60604-2287, US
(312) 362-7388
Sponsor Congressional District: 07
Primary Place of Performance: DePaul University
IL, US 60604-2287
Primary Place of Performance Congressional District: 07
Unique Entity Identifier (UEI): MNZ8KMRWTDB6
Parent UEI:
NSF Program(s): CSR-Computer Systems Research
Primary Program Source: 01002425DB NSF RESEARCH & RELATED ACTIVIT
01001920DB NSF RESEARCH & RELATED ACTIVIT
01002021DB NSF RESEARCH & RELATED ACTIVIT
01002122DB NSF RESEARCH & RELATED ACTIVIT
01002223DB NSF RESEARCH & RELATED ACTIVIT
01002324DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 9251
Program Element Code(s): 735400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Reproducibility is essential for scientific progress and for establishing trust in scientific results. Published computational results increasingly lack sufficient capture and description of the companion information needed to confirm and extend those results. This project will design and implement a novel container-based approach for sharing and reproducing scientific results. Reproducible containers developed in this project will package code, data, environment, provenance, and assumptions across heterogeneous computing platforms. In contrast to a "devops"-based approach, which places the burden of managing experiment reproducibility on the user, this project uses reference executions of scientific experiments as the virtualization mechanism for containerizing the associated artifacts.
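
The use of a reference execution as the virtualization boundary can be pictured with a small sketch: run the application once under a system-call tracer, record every file it opens, and copy those files into a self-contained package directory. This is a simplified illustration only, assuming a Linux host with strace available; the function name and packaging layout below are hypothetical and are not the project's actual implementation.

    # Sketch: capture the file dependencies of one reference execution.
    # Assumes Linux with strace installed; not the project's actual tooling.
    import os
    import re
    import shutil
    import subprocess
    import sys
    import tempfile

    def package_reference_execution(cmd, package_dir):
        """Run `cmd` once under strace and copy every file it opened into
        `package_dir`, preserving absolute paths below the package root."""
        os.makedirs(package_dir, exist_ok=True)
        with tempfile.NamedTemporaryFile(suffix=".trace", delete=False) as tf:
            trace_path = tf.name
        # -f follows child processes; only open/openat calls are recorded.
        subprocess.run(["strace", "-f", "-o", trace_path,
                        "-e", "trace=open,openat"] + list(cmd), check=True)
        opened = set()
        quoted_path = re.compile(r'"([^"]+)"')
        with open(trace_path) as trace:
            for line in trace:
                if "= -1" in line:              # skip failed opens
                    continue
                match = quoted_path.search(line)
                if match and os.path.isfile(match.group(1)):
                    opened.add(os.path.abspath(match.group(1)))
        for path in opened:
            dest = os.path.join(package_dir, path.lstrip("/"))
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            shutil.copy2(path, dest)
        return opened

    if __name__ == "__main__":
        # Usage (hypothetical): python capture.py ./package python3 analysis.py
        package_reference_execution(sys.argv[2:], sys.argv[1])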

While a container-based approach can help verify repeated computations, further advances in container technology are needed to enable more advanced forms of reproducibility. This project aims to enable reproducibility even when computations include non-determinism and race conditions; when code, datasets, and parameters are changed; when computations are performed on distributed platforms; and when containers are shared with sensitive data and undocumented content. To that end, the project will develop an open-source container runtime that offers primitives for re-runnability, extensibility, and publishability of containers. The work leverages portable containers previously developed for the computational sciences. This award will lay the foundation for an essential building block for establishing reproducibility in real-world computational and data science use cases. The project will increase awareness of the need for computational reproducibility tools through an integrated research and education plan involving scientists, students, and instructors.
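
One way to picture the re-runnability primitive is a replay step that re-executes the packaged command with the environment recorded at packaging time and checks declared outputs against hashes taken from the reference execution; extensibility and publishability would build on the same recorded manifest. The manifest fields and the rerun function below are illustrative assumptions, not the interface of the project's container runtime.

    # Hypothetical "rerun" primitive: replay a packaged command and verify
    # that declared output files are bit-identical to the reference run.
    import hashlib
    import json
    import subprocess

    def sha256(path):
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def rerun(manifest_path):
        """Re-execute the recorded command; return {output_path: matches?}."""
        with open(manifest_path) as f:
            manifest = json.load(f)
        subprocess.run(manifest["command"], env=manifest["environment"],
                       cwd=manifest.get("workdir", "."), check=True)
        return {out: sha256(out) == expected
                for out, expected in manifest["outputs"].items()}

    # Example manifest recorded at packaging time (illustrative):
    # {
    #   "command": ["python3", "analysis.py", "--seed", "42"],
    #   "environment": {"PATH": "/usr/bin", "LC_ALL": "C"},
    #   "workdir": ".",
    #   "outputs": {"results.csv": "9f86d081884c7d65..."}
    # }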

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Raza Ahmad, Naga Nithin Manne, and Tanu Malik, "Reproducible Notebook Containers using Application Virtualization," IEEE 18th International Conference on e-Science (e-Science), 2022. https://doi.org/10.1109/eScience55777.2022.00015
A. Youngdahl, D. H. Ton, "SciInc: A Container Runtime for Incremental Recomputation," IEEE 15th International Conference on e-Science, 2019. https://doi.org/10.1109/eScience.2019.00040
J. Chuah, M. Deeds, "Documenting Computing Environments for Reproducible Experiments," Parallel Computing: Technology Trends, 2020. https://doi.org/10.3233/APC200106
Aniket Modi, Moaz Reyad, Tanu Malik, and Ashish Gehani, "Querying Container Provenance," WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023, 2023. https://doi.org/10.1145/3543873.3587568
Naga Nithin Manne, Shilvi Satpati, "CHEX: Multiversion Replay with Ordered Checkpoints," Proceedings of the VLDB Endowment, v.15, 2022. https://doi.org/10.14778/3514061.3514075
Y. Nakamura, T. Malik, "Provenance-based Workflow Diagnostics Using Program Specification," 29th IEEE International Conference on High Performance Computing, Data, and Analytics, 2022.
Yuta Nakamura, Iyad Kanj, and Tanu Malik, "Efficient Differencing of System-level Provenance Graphs," 2023. https://doi.org/10.1145/3583780.3615171
Beth A. Plale, Tanu Malik, and Line C. Pouchard, "Reproducibility Practice in High-Performance Computing: Community Survey Results," Computing in Science & Engineering, v.23, 2021. https://doi.org/10.1109/MCSE.2021.3096678
Tanu Malik, Anjo Vahldiek-Oberwagner, "Expanding the Scope of Artifact Evaluation at HPC Conferences: Experience of SC21," Proceedings of Practical Reproducible Evaluation in Computer Systems, 2022.
Y. Nakamura, T. Malik, "Efficient Provenance Alignment in Reproduced Executions," USENIX Theory and Practice of Provenance, 2020.
