Office of Polar Programs Repository & Resource Page
Motivation: The NSF policy on dissemination and sharing of research results states that Principal Investigators (PIs) are expected to share the data, physical samples, code, and other supporting materials (characterized as "data" below) created or gathered in the course of work under NSF-funded awards with other researchers and the public within a reasonable time. Data-sharing holds numerous benefits, including enabling broader research collaboration, facilitating transparency, solidifying confidence in scientific research, enabling reuse of data or samples, and providing increased resources for teaching and education purposes. Research articles containing a link to data in a repository have markedly higher usage and visibility. Discoverable and citable data also serve to reduce barriers to entry for junior researchers, scientists from under-served communities, and researchers from underrepresented and minority groups, thus enabling improved implementation of open science principles. With these aims in mind, OPP expects all data, samples, and code to be as open as possible within ethical parameters.
As expressed in NSF 22-106 ("Dear Colleague Letter: Office of Polar Programs Data and Sample Management Policy"), OPP PIs should make full use of community-accepted disciplinary repositories rather than generalist repositories. OPP-funded data centers can support PIs in developing and implementing their Data Management Plans (DMPs), which will provide open, ethical, and timely access to quality-controlled and fully documented data, samples, software/code, and products during and at the conclusion of the project.
Alongside conversations with the relevant Program Officer(s), this resource page is intended to help OPP PIs identify appropriate repositories and resources as they develop and implement DMPs. Each DMP should be appropriate for the data, software/code, physical samples, and other research materials being generated, and each should reflect the best practices and standards in the area(s) of research being proposed (e.g., Indigenous Knowledge protocols, vertebrate subject protocols, environmental regulations, and permitting).
Considerations include the following:
- The repository must be a long-lived and publicly accessible archive, unless the program officer grants a well-justified exception (e.g., when no appropriate public repository exists).
- The repository must mint persistent unique identifiers (e.g., DOI for data/code, IGSN for samples, and ORCID for authors) and label products with an open/appropriate license (e.g., Creative Commons' CC-0 or CC-BY, or Traditional Knowledge (TK) labels, for data, and MIT, BSD, or others for software/code).
- The repository should adhere to standards (e.g., TRUST principles and/or CoreTrustSeal) that help it align with the FAIR data principles (Findability, Accessibility, Interoperability, and Reusability) and the CARE principles for Indigenous data governance (Collective Benefit, Authority to Control, Responsibility, Ethics).
- The Arctic Sciences Section prefers data to be deposited in the Arctic Data Center (ADC), although specialist disciplinary repositories may also be appropriate for primary or secondary data deposit (e.g., for sensitive data or to increase findability).
- As part of the data submission process, metadata should be submitted to the data repository describing observing sites (stations, plots, moorings, and other fixed platforms) as well as mobile platforms (buoys, vessels, aircraft, etc.).
- PIs managing data related to sustained Arctic observations should follow recommendations from the Sustaining Arctic Observing Networks (SAON), consult the Arctic Research Mapping Application (ARMAP), and include observing-level metadata along with the sample-level metadata that they report/archive.
- The Antarctic Sciences Section prefers data to be deposited in a disciplinary repository and requires that metadata also be reported to the US Antarctic Program Data Center (USAP-DC).
OPP funds data centers to help PIs and researchers meet their needs for archiving and sharing data and products, including support for the development of DMPs, minting DOIs, facilitating data access, providing training, and more. OPP recommends that PIs:
- Communicate early with target repositories to ensure that appropriate resources are available,
- Use archived examples and/or the repository's preferred tool to create DMPs (e.g., the DMPTool for the ADC or ezDMP for the USAP-DC),
- Take advantage of free training in general data management best practices and the use of NSF-funded repositories and community offices (e.g., ADC, USAP-DC, ESIP, and others), and
- Share DMPs with OPP-funded repositories to help the repositories serve users, provide examples for others, and encourage accountability.
OPP-Funded Repositories
- Arctic Data Center (ADC)
- Antarctic Meteorological Research & Data Center (AMRDC)
- Biological & Chemical Oceanography Data Management Office (BCO-DMO)
- Environmental Data Initiative for LTER data
- Exchange for Local Observations and Knowledge of the Arctic (ELOKA)
- Antarctic Core Collection at the Marine & Geology Repository
- NSF Ice Core Facility (ICF)
- Polar Geospatial Center (PGC)
  - Topographic maps and aerial photographs of Antarctica are also available from the USGS.
  - The USGS U.S. Antarctic Research Center also has a searchable database of Antarctic place names, maps, and photographs.
  - With funding from NSF, the USGS, NASA, and the British Antarctic Survey have collaborated to provide the Landsat Image Mosaic of Antarctica (LIMA) and the MODIS Mosaic of Antarctica (MOA).
- Polar Rock Repository (PRR)
- US Antarctic Program Data Center (USAP-DC)
  - The American Geological Institute maintains the world's most complete Antarctic bibliography. (NOTE: The bibliographies were last updated on September 30, 2011, except for limited additions regarding permafrost-related publications.)
  - Includes the former Antarctic Glaciological Data Center (AGDC), previously hosted at NSIDC
Representative Disciplinary Repositories
- CCHDO for hydrographic data
- CUAHSI HydroShare for hydrologic data
- EarthChem for geochemical and chronological data
- GenBank for sequence data
- IRIS for seismological data
- ICE-D for cosmogenic-nuclide exposure dating data
- Inter-university Consortium for Political and Social Research (ICPSR)
- JASADCP for ocean current data
- MGDS/ASP for marine geological/geophysical data
- NCAR for meteorology, atmospheric parameters, etc.
- NCEI for paleoclimate data
- NSIDC for cryospheric data (PIs will need a letter of collaboration)
- OCE-approved repositories for oceanographic data
- Qualitative Data Repository (QDR)
- SESAR for sample information and registration
- tDAR - The Digital Archaeological Record
- UNAVCO
Generalist Repositories
- Zenodo (Note that while GitHub may be appropriate for code collaboration and versioning, it is not an appropriate long-lived archive. A static, citable version can easily be delivered from GitHub to a generalist repository like Zenodo, which will mint a DOI, with metadata reported to the ADC and/or USAP-DC. Alternatively, static versions can be deposited in the appropriate disciplinary repositories listed above, which will also mint a DOI.)
- Searchable global registries of data repositories, such as the Registry of Research Data Repositories and Designing Materials to Revolutionize and Engineer our Future Software & Data, provide information on indexed repositories to help researchers identify the most appropriate ones.
Other Resources
- NSF Public Access Repository
- NASA's Transformation to Open Science (TOPS) initiative
- Curating for Reproducibility (CURE) resources for data quality review
If you have suggestions for repositories or resources that should be included on this list, please contact oppdata@nsf.gov and oppcomms@nsf.gov.