Award Abstract # 1535232
SI2-SSE: EASE: Improving Research Accountability through Artifact Evaluation

NSF Org: OAC - Office of Advanced Cyberinfrastructure (OAC)
Recipient: UNIVERSITY OF PITTSBURGH - OF THE COMMONWEALTH SYSTEM OF HIGHER EDUCATION
Initial Amendment Date: June 16, 2015
Latest Amendment Date: June 16, 2015
Award Number: 1535232
Award Instrument: Standard Grant
Program Manager: Bogdan Mihaila
bmihaila@nsf.gov
 (703)292-8235
OAC - Office of Advanced Cyberinfrastructure (OAC)
CSE - Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2015
End Date: August 31, 2019 (Estimated)
Total Intended Award Amount: $499,515.00
Total Awarded Amount to Date: $499,515.00
Funds Obligated to Date: FY 2015 = $499,515.00
History of Investigator:
  • Bruce Childers (Principal Investigator)
    childers@cs.pitt.edu
  • Daniel Mosse (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Pittsburgh
4200 FIFTH AVENUE
PITTSBURGH
PA  US  15260-0001
(412)624-7400
Sponsor Congressional District: 12
Primary Place of Performance: University of Pittsburgh
123 University Club
Pittsburgh
PA  US  15213-2303
Primary Place of Performance Congressional District: 12
Unique Entity Identifier (UEI): MKAGLD59JRL1
Parent UEI:
NSF Program(s): Special Projects - CCF, Software Institutes
Primary Program Source: 01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7433, 8005
Program Element Code(s): 287800, 800400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Research in computer systems, particularly in the first stages of creating a new innovation, relies almost exclusively on software prototypes, simulators, benchmarks, and data sets to understand the benefits and costs of new ideas in computers, ranging from consumer devices to exascale systems. These artifacts are used to evaluate new capabilities, algorithms, bottlenecks and trade-offs. Empirical study is behind the rapid pace of innovation in creating faster, lower energy and more reliable systems. This experimental approach lies at the core of development that fuels the nation's information economy. Given the critical importance of experimental study to developing new computer systems, several efforts are underway to curate experimental results through accountable research. One effort, Artifact Evaluation (AE), is being adopted to promote high quality artifacts and experimentation, including making public the experimental information necessary for reproducibility. However, the rapid adoption of AE is hampered by technical challenges that create a high barrier to the process: there is no consistent or simple environment, or mechanism, to package and reproduce experiments for AE. Authors rely on their own approaches, leading to much time consumed, as well as considerable variability in the ways materials are prepared and evaluated, unnecessarily obstructing the AE process.

To overcome the technical challenges with AE, and to more broadly encourage adoption of AE in computer science and engineering research, this project is developing a software infrastructure, the Experiment and Artifact System for Evaluation (EASE), to create and run experiments specifically for AE, in which authors create, conduct, and share artifacts and experiments. EASE allows experiments to be repeated, modified, and extended. Authors may also use EASE to package and upload their experiments for archival storage in a digital library. EASE is being developed and deployed for two use cases, compilers and real-time systems, keeping the project tractable while addressing specific needs. These communities have overlapping but distinct requirements, which helps ensure that EASE can also be extended and used by other computer systems research communities.

EASE will be released as open source software, based on an Experiment Management System (EMS) previously developed by the project investigator in a project called Open Curation for Computer Architecture Modeling (OCCAM), which is used to define and conduct experiments with computer architecture simulators. Using EMS as a starting point, EASE will provide AE support by:

1) separating EMS from OCCAM's repository and hardware services, transforming the EMS infrastructure into EASE, a fully standalone, sustainable, and extensible platform for AE;
2) supporting record and replay (for repeating and reproducing results, as well as provenance) of artifacts and experiments as part of normal development and experimental practice, to ease participation in AE by authors and evaluators (illustrated in the sketch after this list);
3) supporting artifacts, and workflows of artifacts and experiments, that run directly on a machine, including specialized hardware and software, or indirectly on a simulator or emulator;
4) allowing both user-level (artifacts and experiments as user processes) and system-level (artifacts and experiments involving kernel changes) innovations;
5) providing consistent, uniform access to artifacts and experiments, whether locally or remotely;
6) simplifying viewing, running, modifying, and comparing experiments by innovators (during innovation development), artifact evaluators (during AE), and archive users (after publication);
7) enabling indexing (object locators and search tags) and packaging of artifacts and experiments for AE and for archival deployment (e.g., to ACM's or IEEE's Digital Library); and
8) refining, expanding, generalizing, and documenting EASE to ensure it is robust, maintainable, and extensible, and that it can be used and sustained by different computer systems research communities (starting with real-time and compilers, given their different artifacts, data, and methods).
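As a rough illustration of the record-and-replay idea in item 2, the sketch below shows one way a single experiment run could be captured and later checked. The manifest format, function names, and file paths here are hypothetical and are not EASE's actual interface; the point is only that a recorded command, its environment, and hashes of inputs and outputs are enough to detect whether a replay reproduced the original results.

# Hypothetical sketch of record-and-replay provenance capture for one experiment run.
# Not the EASE interface; names and paths are invented for illustration.
import hashlib
import json
import os
import subprocess
import time

def sha256(path):
    # Hash a file so a later replay can verify that inputs and outputs match.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def record(cmd, inputs, outputs, manifest="experiment.json"):
    # Run the experiment command and write a provenance manifest.
    started = time.time()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    entry = {
        "cmd": cmd,
        "env": {k: v for k, v in os.environ.items() if k in ("PATH", "LANG")},
        "inputs": {p: sha256(p) for p in inputs},
        "outputs": {p: sha256(p) for p in outputs},
        "returncode": proc.returncode,
        "elapsed_seconds": time.time() - started,
    }
    with open(manifest, "w") as f:
        json.dump(entry, f, indent=2)

def replay(manifest="experiment.json"):
    # Re-run the recorded command and report whether each output still matches.
    with open(manifest) as f:
        rec = json.load(f)
    subprocess.run(rec["cmd"], check=True)
    for path, digest in rec["outputs"].items():
        print(path, "ok" if sha256(path) == digest else "DIFFERS")

In such a scheme, an evaluator reproduces the experiment by re-running the recorded command and checking that the output hashes match; a full system like EASE must additionally capture and restore the software environment itself, which is where the repository, container, and simulator support described above comes in.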

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


David Wilkinson, Luís Oliveira, Daniel Mossé, and Bruce Childers. "Software Provenance: Track the Reality Not the Virtual Machine." Proceedings of the First International Workshop on Practical Reproducible Evaluation of Computer Systems (P-RECS'18), 2018. https://doi.org/10.1145/3214239.3214244
Luís Oliveira, David Wilkinson, Daniel Mossé, and Bruce Childers. "Supporting Long-term Reproducible Software Execution." Proceedings of the First International Workshop on Practical Reproducible Evaluation of Computer Systems (P-RECS'18), 2018. https://doi.org/10.1145/3214239.3214245

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Research in computer systems, particularly in the first stages of creating a new innovation, relies on simulators, benchmarks, and data sets to analyze the benefits and costs of new ideas in computers. These artifacts are used to evaluate capabilities, algorithms, bottlenecks and trade-offs through empirical study. Given the critical importance of experimental study to developing new computer systems, several efforts are underway to curate experimental results. One effort, Artifact Evaluation (AE), promotes high quality artifacts and experimentation, including making public the experimental information necessary for reproducibility. However, AE has been hampered by technical challenges that create a high barrier to the process: there are few consistent or simple mechanisms to package and reproduce experiments for AE. Authors often use their own approaches, leading to much time consumed and variability in the ways materials are prepared and evaluated.

To overcome these challenges and to encourage adoption of AE, this project developed a software infrastructure, the Experiment and Artifact System for Evaluation (EASE), to create and run experiments specifically for peer review of experimental artifacts. Using this infrastructure, authors can create, conduct, and share artifacts and experiments for AE, and peer evaluators can repeat, modify, and extend those experiments. The EASE project leveraged and significantly built on the base OCCAM software, fully redesigning and refactoring the original system to make it standalone, sustainable, and extensible for AE. The user interface was redesigned to simplify how authors describe, run, and share experiments with evaluators, and to reduce the effort evaluators need to conduct peer review of experiments. Social capabilities for commenting on experiments and sharing those comments between authors and evaluators were examined as a way to make evaluation more interactive and straightforward. Functionality for finer control over access to experiments, required for AE, was also incorporated. The software is open source and publicly available, with continued development and maintenance by a core set of developers.

As a result of this project, we found that the computer systems community has been slow to move to workflow systems, like EASE, for artifact evaluation. Although the need for standardization is well recognized, and many researchers have commented that EASE and similar tools are beneficial, simple, and powerful, the community tends to "roll its own" with packaging technologies, such as virtual machines (VMs) and containers, to share artifacts for AE. This may be partly due to tradition -- after all, this is the very community that developed these lower-level technologies and is most familiar with them. Advances in VM and container technologies over the last few years (driven largely by the requirements of cloud computing) have also made them easier to use for AE. Other obstacles to AE include the disparity in the languages used to implement artifacts and the reliance on proprietary software.
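For context, the "roll your own" packaging described above often amounts to something like the following: a small driver that pulls the authors' container image and reruns their experiment with one command. This is a generic illustration of that common practice, not EASE or any specific artifact; the image name, script path, and mount point are invented.

# Hypothetical example of container-based artifact packaging for AE.
# Image name and paths are illustrative only.
import os
import subprocess

def run_artifact(image="author/paper-artifact:v1", script="/artifact/run_all.sh"):
    # Pull the authors' prebuilt image, then run the experiment script inside it,
    # mounting the current directory so results land on the evaluator's machine.
    subprocess.run(["docker", "pull", image], check=True)
    subprocess.run(["docker", "run", "--rm",
                    "-v", f"{os.getcwd()}:/results",
                    image, script], check=True)

if __name__ == "__main__":
    run_artifact()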

Although the move to standard tools for AE is gradual, there is much interest in adopting end-to-end workflow software, like EASE, to support Artifact Evaluation. Interestingly, scientific communities that are less familiar with VM and container technologies are moving more quickly toward workflow systems for reproducibility, reuse, and peer review of experiments, data, and software artifacts. Indeed, through this project, we found that science communities outside of computer systems, such as world modeling and epidemiology, are eager to adopt EASE and similar tools for these purposes.

Last Modified: 01/11/2020
Modified by: Bruce Childers
