Award Abstract # 1745675
EAGER: DMP Roadmap: Making Data Management Plans Actionable

NSF Org: RISE
Integrative and Collaborative Education and Research (ICER)
Recipient: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA
Initial Amendment Date: August 29, 2017
Latest Amendment Date: March 15, 2019
Award Number: 1745675
Award Instrument: Standard Grant
Program Manager: Eva Zanzerkia
RISE
 Integrative and Collaborative Education and Research (ICER)
GEO
 Directorate for Geosciences
Start Date: September 1, 2017
End Date: August 31, 2021 (Estimated)
Total Intended Award Amount: $275,178.00
Total Awarded Amount to Date: $275,178.00
Funds Obligated to Date: FY 2017 = $275,178.00
History of Investigator:
  • Guenter Waibel (Principal Investigator)
    guenter.waibel@ucop.edu
Recipient Sponsored Research Office: University of California, Office of the President, Oakland
1111 FRANKLIN ST FL 8
OAKLAND
CA  US  94607-5201
(510)987-9850
Sponsor Congressional District: 12
Primary Place of Performance: University of California, Office of the President, Oakland
415 Thomas Berkley Way
Oakland
CA  US  94612-2901
Primary Place of Performance Congressional District: 12
Unique Entity Identifier (UEI): PKK5TD16N4H1
Parent UEI:
NSF Program(s): BIOLOGICAL OCEANOGRAPHY, EarthCube
Primary Program Source: 01001718DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1650, 7916
Program Element Code(s): 165000, 807400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.050

ABSTRACT

The California Digital Library (CDL) will work with an extensive coalition of national and international collaborators to convert static data management plans (DMPs) into machine-actionable documents useful for structuring the course of research activities and communicating with other systems. Open data policies are proliferating worldwide and researchers are now required to submit DMPs with most grant proposals that describe the data they will produce and plans for sharing and preserving it. Researchers do not always know exactly what data they will produce at the beginning of a project, however. Furthermore, they have no incentives or easy methods for updating a DMP to keep things organized over the course of their research, which can lead to poor data practices and chaotic, unusable data shared at the end. DMPs in their current, static form pose similar challenges for other stakeholders across the research ecosystem, e.g., funders who must monitor compliance manually. This project will solve this problem by building out DMPRoadmap, a new, internationalized platform to reposition DMPs as true hubs of the networked research ecosystem. The principal impact of the proposed project is to transform one component of the increasingly digital research enterprise, the DMP, into an actively updated and machine-actionable hub to document and disseminate the products of research activity. Converting free-text responses to funder requirements into dynamic, verifiable data feeds will vastly improve the entire toolchain for all stakeholders and increase the velocity and availability of information across all disciplines. The software outputs of the pilot will be shared publicly with an open source license so that the community can continue to enhance and reuse the technology, extending the scholarly cyberinfrastructure in a scalable manner. All supporting data collected during the project will be made fully available for human and machine consumption. 
Since DMPs are rapidly becoming a global phenomenon, the outputs of the project will be of great interest to the entire research community. The breadth of the impact extends into the future of DMP policies worldwide, as machine-actionability makes change easier and continuous improvements possible. The project will provide insight into how to maximize investment in institutions and infrastructures in a manner that achieves policy goals by providing public access to government-funded research and that advances the overall public good. By advancing best practices for data-intensive research across all disciplines, the project will further the goals of the scientific endeavor.

The CDL will design, develop, and prototype machine-actionable DMPs based on community generated use cases; field research on DMP and data management practices; and pilot projects conducted with disciplinary and institutional partners. The work plan encompasses iterative phases of developing, implementing, and testing with various stakeholder groups using an agile methodology. Specific use cases include implementing a set of common standards and exchange protocols for DMPs to enable information to flow between DMPs and existing research information systems (e.g., offices of research, data repositories, faculty profile systems); leveraging persistent identifiers (e.g., DOIs for articles and datasets, ORCID iDs for people) to trigger push/pull notifications across systems that enable stakeholders to plan resources, connect research outputs, automate reporting and monitoring, get credit, and promote data discoverability, reuse, and reproducibility; among others.
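The push-notification use case above can be pictured with a small sketch. It is an illustration only, not the project's software: the class name `DMPHub`, its methods, and the DMP identifier are all hypothetical, and a real system would exchange messages over a network rather than call in-process functions.

```python
from typing import Callable, List

# Hypothetical callback type: receives (dmp_id, output_identifier).
Subscriber = Callable[[str, str], None]


class DMPHub:
    """Toy model of a DMP that notifies other systems about new outputs."""

    def __init__(self, dmp_id: str) -> None:
        self.dmp_id = dmp_id                  # placeholder identifier for the plan
        self.outputs: List[str] = []          # persistent identifiers of outputs
        self._subscribers: List[Subscriber] = []

    def subscribe(self, callback: Subscriber) -> None:
        """Register a stakeholder system (e.g., a repository or funder)."""
        self._subscribers.append(callback)

    def register_output(self, identifier: str) -> bool:
        """Link a new output to the plan and push a notification.

        Returns False, and sends nothing, if the identifier is already linked.
        """
        if identifier in self.outputs:
            return False
        self.outputs.append(identifier)
        for notify in self._subscribers:
            notify(self.dmp_id, identifier)
        return True


if __name__ == "__main__":
    hub = DMPHub("doi:10.0000/placeholder-dmp")  # assumed, not a real DMP ID
    hub.subscribe(lambda dmp, out: print(f"{dmp} gained output {out}"))
    hub.register_output("https://doi.org/10.1371/journal.pcbi.1006750")
```

Keying the notification on a persistent identifier means subscribers never have to parse the free-text plan; they only react to identifiers they already understand.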

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Miksa, Tomasz and Simms, Stephanie and Mietchen, Daniel and Jones, Sarah "Ten principles for machine-actionable data management plans" PLOS Computational Biology, v.15, 2019 https://doi.org/10.1371/journal.pcbi.1006750

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Open data policies are proliferating worldwide, and researchers are required to submit Data Management Plans (DMPs) with most proposals for grants or funding that supports their work. These DMPs describe the data generated in a research project and outline plans for sharing this data with the community and preserving it. Unfortunately, researchers currently have neither incentives nor easy methods for updating a DMP after their project is underway, leading to poor data stewardship practices overall and often resulting in research data that others cannot understand or reuse.

DMPs traditionally have been static text documents that outline plans and practices. However, this fixed, narrative format challenges stakeholders across the research ecosystem. For example, funders must monitor compliance with data sharing requirements by manually checking the status of a project’s handling of research data. 

The Networked DMP is a suite of tools developed by the California Digital Library with support from the National Science Foundation. It advances data management policies and local requirements for data sharing in order to facilitate and accelerate the research process. This project and its associated services successfully repositioned DMPs from text-based planning documents to dynamic hubs in the research ecosystem. These hubs provide all stakeholders with automatically updated key information about the research process and are integrated with other core systems and tools used in that process.

A principal outcome of the NSF grant, Making Data Management Plans Actionable, was the development of the technical standards and infrastructure needed to convert DMPs from static text documents into a structured format with robust metadata and to assign a persistent identifier to each DMP. Working together, the project metadata and unique identifier generate a DMP that is structured consistently so that the rich information contained within it can be shared programmatically between systems. In addition, this new structured document allows for automated notifications, verification, and reporting in real time. As a result of this project, we now have unique and persistent identifiers called DMP IDs for data management plans. The DMP ID creates an unbreakable link between the plan and the research project’s contributors, outputs, data repositories, and funding sources.
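As a rough illustration of what a structured plan with a persistent identifier might look like, the sketch below models a DMP record and serializes it to JSON. The field names, the `StructuredDMP` and `Contributor` classes, the DMP ID, and the ORCID iD are all assumptions made for this example; they are not the project's actual schema. Only the award number and the publication DOI come from this page.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Contributor:
    name: str
    orcid: str  # placeholder ORCID iD, not a real person's


@dataclass
class StructuredDMP:
    dmp_id: str          # persistent identifier for the plan (placeholder)
    title: str
    funder: str
    award_number: str
    contributors: List[Contributor] = field(default_factory=list)
    output_dois: List[str] = field(default_factory=list)
    repositories: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the plan so that other systems can read it."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


def build_example_plan() -> StructuredDMP:
    """Build a sample record; every identifier except the award number
    and the publication DOI is a made-up placeholder."""
    return StructuredDMP(
        dmp_id="doi:10.0000/placeholder-dmp",
        title="Making Data Management Plans Actionable",
        funder="National Science Foundation",
        award_number="1745675",
        contributors=[Contributor("Example Researcher", "0000-0000-0000-0000")],
        output_dois=["10.1371/journal.pcbi.1006750"],
        repositories=["Example Data Repository"],
    )


if __name__ == "__main__":
    print(build_example_plan().to_json())
```

Because every link in the record is an identifier rather than free text, a funder or repository system could check the plan's status by reading fields instead of reading prose.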

We will continue to release new features to expand the possibilities of the Networked Data Management Plan, helping to ensure transparency in the research process and promote good data management practices for researchers. Many of these new workflows are being pilot tested as part of the NSF-funded FAIR Island Project, a collaboration among the California Digital Library, the University of California Gump South Pacific Research Station, and the University of California Natural Reserves System. Through the FAIR Island Project, we will utilize and build on the Networked DMP to track all research outputs generated from work completed at field stations and natural reserves.


Last Modified: 12/10/2021
Modified by: Guenter Waibel

Please report errors in award information by writing to: awardsearch@nsf.gov.
