
NSF Org: | IIS Division of Information & Intelligent Systems |
Recipient: |
|
Initial Amendment Date: | August 23, 2011 |
Latest Amendment Date: | July 23, 2013 |
Award Number: | 1152481 |
Award Instrument: | Standard Grant |
Program Manager: | Sylvia Spengler, sspengle@nsf.gov, (703)292-7347, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 1, 2011 |
End Date: | February 28, 2014 (Estimated) |
Total Intended Award Amount: | $300,000.00 |
Total Awarded Amount to Date: | $360,000.00 |
Funds Obligated to Date: | FY 2013 = $60,000.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: |
1895 PRESTON WHITE DR RESTON VA US 20191-5469 (703)620-8990 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
1895 Preston White Drive Reston VA US 20191-5434 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Info Integration & Informatics |
Primary Program Source: |
01001314DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Large-scale science and engineering campaigns have typically considered data management from the inception of the project, and funds for data management have been included in the projects' budgets. This proposal is aimed at data acquisition and curation strategies in support of single-PI or small-group research projects at academic institutions, whose data make up the so-called "long tail". Long-term data management is a much more acute problem for these projects. Smaller research projects are often strapped for funds to conduct the research that generates the data; management of the data was in the past often an afterthought. With data management plans now being required by funding agencies, the issues must be considered as part of a proposal, but the funding available for data management is still frequently small, and economical resources for researchers still need to be cultivated. At academic institutions, the institutional repository (IR) has emerged as the means of harnessing technology to improve scholarly communication, and it is the IR that offers the potential to address the data curation problems of smaller projects. Although institutional repositories have a broad intuitive appeal to all the stakeholders involved with science and engineering data management, they have met with very limited acceptance in practice. This proposal seeks to increase faculty contribution of their data to the IR by appealing to their needs directly and providing them with tools and support for developing personal repositories that can subsequently be federated into the IR. The strategy is to lower the barrier of entry to archiving facilities and to provide incentives for researchers to participate.
Broader impacts will be realized in two key areas. First, archiving and preservation of datasets will be enhanced by increasing the participation of faculty and researchers generating data at the nation's research institutions. Second, open source software will be available for deployment by other institutions beyond the project's partners, thereby increasing the effectiveness of dataset archiving and sharing across a growing set of participating institutions. The project will also offer research and training opportunities to undergraduate and graduate students involved as software developers and data consultants who interact with faculty and other researchers as part of the project.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Data acquisition and archiving are necessary steps in the preservation of data for subsequent access and analysis by current researchers, as well as new generations of researchers. The NSF INSPIRE project and its OmniMea prototype (http://omnimea.org) were aimed at data acquisition and curation strategies in support of single-PI or small-group research projects at academic institutions. Long-term data management is a particularly acute problem for these projects. Smaller research projects are often strapped for funds to conduct the research that generates the data; management of the data was in the past often an afterthought.
The INSPIRE project considered ways in which a user-centric concept of personal repositories might be used to capture and organize intellectual output as a first step to placing selected material in an institutional repository for long-term curation.
Early in the project it became obvious that, before we could adequately address the capture and submission of research data to the institutional repository, it would be necessary to provide an appropriate user interface to personal research materials. This would be key to acquiring all research artifacts, not just research data.
We chose the approach and organizational structure of individual CVs. This approach was motivated by our belief that the CV is a natural user interface to an academic's personal repository. The challenge is to get academics to use a personal repository-based CV in preference to their present choice, e.g., MS Word. Our approach was to provide extremely low-barrier tools for importing information and to make the transition valuable enough that one would elect to undertake a small amount of manual work for the subsequent benefits. We implemented both a straightforward single-item import capability and a bulk import capability.
The power of our approach is that the imported CV is not treated as a monolithic entity. Each item imported into the OmniMea personal repository, be it a journal article, a data set, or a professional activity, is represented as a digital object. This representation provides a great deal of flexibility and allowed us to support a number of key benefits, including the potential to crowd-source metadata for institutional repositories. This is possible because only one author of a multi-authored paper needs to enter it into OmniMea; the paper then accretes to the CVs of all authors registered with the service.
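As an illustration only, the accretion behavior described above can be sketched in Python. The class and method names (`DigitalObject`, `Repository`, `cv`) and the sample author names are hypothetical and are not taken from OmniMea; the sketch simply assumes that each item is stored once and that a CV is computed as a view over the shared items.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DigitalObject:
    """One CV item (article, data set, activity), stored exactly once."""
    object_id: str
    kind: str
    title: str
    authors: List[str]


@dataclass
class Repository:
    """Hypothetical in-memory stand-in for a personal-repository service."""
    registered: Set[str] = field(default_factory=set)
    objects: Dict[str, DigitalObject] = field(default_factory=dict)

    def register(self, person: str) -> None:
        self.registered.add(person)

    def add(self, obj: DigitalObject) -> None:
        # The item is entered once, by any one of its authors.
        self.objects[obj.object_id] = obj

    def cv(self, person: str) -> List[DigitalObject]:
        # A CV is a view over shared objects, not a private copy,
        # so an item appears for every registered co-author.
        if person not in self.registered:
            return []
        return [o for o in self.objects.values() if person in o.authors]


repo = Repository()
repo.register("A. Author")
repo.register("B. Author")
repo.add(DigitalObject("obj-1", "journal article", "Example Paper",
                       ["A. Author", "B. Author", "C. Author"]))
```

In this sketch, `repo.cv("B. Author")` contains the paper even though only one entry was made, while the unregistered "C. Author" has no CV to populate.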
Over the course of the project, we developed two prototype demonstration systems to explore the issues involved and to provide a framework for gathering feedback from research communities. The first prototype led to one major realization and a common theme in user feedback.
- We found that flexibility of organization was required, even for a relatively structured format such as a CV.
- We realized that the information contained in a fully-populated personal repository would be sufficient to populate annual reports, shorter versions of a CV, biographical sketches, etc. Researchers routinely expressed interest in tools that would help them automate these tasks.
During the second year of the project we redesigned the CV prototype to operate on a generic infrastructural framework that supports "collections of collections." This change gave us the ability to easily incorporate new user services that can effectively transform a CV into some useful derived product.
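One way to picture a "collections of collections" structure, and a service that derives a shorter product from it, is the following Python sketch. It is an assumption-laden illustration, not the project's implementation: `Item`, `Collection`, and `recent_items` are hypothetical names, and the sample entries are invented placeholders.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Item:
    title: str
    year: int


@dataclass
class Collection:
    """A collection may hold items or other collections."""
    name: str
    members: List[Union["Collection", Item]] = field(default_factory=list)

    def items(self) -> Iterator[Item]:
        # Walk nested collections depth-first and yield every item.
        for member in self.members:
            if isinstance(member, Collection):
                yield from member.items()
            else:
                yield member


def recent_items(cv: Collection, since: int) -> List[Item]:
    """Hypothetical derived product: items from `since` onward, newest first."""
    selected = (i for i in cv.items() if i.year >= since)
    return sorted(selected, key=lambda i: i.year, reverse=True)


cv = Collection("CV", [
    Collection("Publications", [Item("Paper A", 2010), Item("Paper B", 2013)]),
    Collection("Data Sets", [Item("Data Set C", 2012)]),
])
short = recent_items(cv, since=2012)
```

Because the derived product is computed by walking the nested collections, a new service such as a biographical sketch or annual report would only need its own selection function, not a change to how the CV is stored.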
Overall, we found that a successful system would require a robust service with a low barrier to entry and enough value to motivate entry. We identified five major points related to lowering the barrier of entry.
- Provide a low-effort way for users to input information. For example, cut/paste entry of publications from a traditional CV.
- Provide fo...