Award Abstract # 1153384
EAGER: Policy Design for Reproducibility and Data Sharing in Computational Science

NSF Org: OAC
Office of Advanced Cyberinfrastructure (OAC)
Recipient: THE TRUSTEES OF COLUMBIA UNIVERSITY IN THE CITY OF NEW YORK
Initial Amendment Date: September 8, 2011
Latest Amendment Date: September 8, 2011
Award Number: 1153384
Award Instrument: Standard Grant
Program Manager: Marilyn McClure
mmcclure@nsf.gov
 (703)292-5197
OAC
 Office of Advanced Cyberinfrastructure (OAC)
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2011
End Date: December 31, 2013 (Estimated)
Total Intended Award Amount: $168,800.00
Total Awarded Amount to Date: $168,800.00
Funds Obligated to Date: FY 2011 = $168,800.00
History of Investigator:
  • Victoria Stodden (Principal Investigator)
    stodden@usc.edu
Recipient Sponsored Research Office: Columbia University
615 W 131ST ST
NEW YORK
NY  US  10027-7922
(212)854-6851
Sponsor Congressional District: 13
Primary Place of Performance: Columbia University
NY  US  10027-6902
Primary Place of Performance Congressional District: 13
Unique Entity Identifier (UEI): F4N1QNPB95M4
Parent UEI:
NSF Program(s): CYBERINFRASTRUCTURE
Primary Program Source: 01001011DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7231, 7916
Program Element Code(s): 723100
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Scientific computation is emerging as absolutely central to the scientific method, but the prevalence of very relaxed practices regarding the communication of experimental details and the validation of results is leading to a credibility crisis. Across the computational sciences, the research community is now questioning traditional modes of communication and seeking to implement methods for scientific knowledge transfer that make replication of published computational findings possible. This typically means making all details of the computations - the data and code - underlying published computational results conveniently available to others.

Making data and code openly available raises myriad questions regarding appropriate and effective ways of reaching the goal of reproducible research. The requirements of scientific journals exert a powerful influence on publishing decisions, and are also the least well-understood.

This proposal seeks to build an understanding of the current state of journal policy regarding reproducibility of published computational results, and of the factors underlying journal policy changes toward the adoption of data and code sharing. The highly granular nature of computational science research provides a natural experiment for the study of effectiveness of policies in communities with different attendant pressures such as data and codebase size, privacy and legal barriers, capital intensity and level of instrumentation, and industrial collaboration, to name a few. Measures of effectiveness can be ascertained since domain-specific journals show a spectrum of positions on data and code sharing, from requiring both to ignoring the issue altogether.

These findings will in turn inform the creation of guidelines regarding effective data and code sharing policies across the computational research landscape. In addition, this research will conduct detailed case studies describing successful approaches journals have used to further reproducible research. Finally, this project will also act as a case study itself, with the open release of the data and code underlying its published results and analysis of how best to facilitate reproducibility for similarly situated research.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Victoria Stodden, Peixuan Guo, Zhaokun Ma "Toward Reproducible Computational Research: An Empirical Analysis of Data and Code Policy Adoption by Journals" PLOS ONE, v.8, 2013, doi:10.1371/journal.pone.0067111

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project produced several outcomes. The principal outcome is a publication in the journal PLOS ONE communicating the results of this research. The article appears in the June 2013 issue and is called "Toward Reproducible Computational Research: An Empirical Analysis of Data and Code Policy Adoption by Journals." The data generated by this study were made publicly available on my website and on the ResearchCompendia website, and a link was supplied within the published article. A co-edited volume called "Implementing Reproducible Research" was also produced and was published in 2014 as part of the R-Series. Its chapters were also made openly available for download on the Center for Open Science website.

This project advanced our understanding, across a variety of fields, of publisher practices and policy development regarding the inclusion of data and code with the publication of articles that contain computational results. This is important because access to the data and code that underlie published results can increase their reliability: it creates opportunities to verify findings by permitting others to regenerate the figures and tables from the paper. It can also encourage the reuse of data and code in new experiments, and permit the examination of methods and data integrity that would not be possible without data and code access. Understanding the development of journal policies is important since the published research article is the natural method for discovering the data and code associated with published computational results.


Last Modified: 03/21/2015
Modified by: Victoria Stodden

Please report errors in award information by writing to: awardsearch@nsf.gov.