Award Abstract # 1540996
EarthCube IA: Collaborative Proposal: LinkedEarth: Crowdsourcing Data Curation & Standards Development in Paleoclimatology

NSF Org: RISE
Integrative and Collaborative Education and Research (ICER)
Recipient: NORTHERN ARIZONA UNIVERSITY
Initial Amendment Date: July 28, 2015
Latest Amendment Date: July 28, 2015
Award Number: 1540996
Award Instrument: Standard Grant
Program Manager: Eva Zanzerkia
RISE
 Integrative and Collaborative Education and Research (ICER)
GEO
 Directorate for Geosciences
Start Date: September 1, 2015
End Date: August 31, 2018 (Estimated)
Total Intended Award Amount: $113,015.00
Total Awarded Amount to Date: $113,015.00
Funds Obligated to Date: FY 2015 = $113,015.00
History of Investigator:
  • Nicholas McKay (Principal Investigator)
    Nicholas.McKay@nau.edu
Recipient Sponsored Research Office: Northern Arizona University
601 S KNOLES DR RM 220
FLAGSTAFF
AZ  US  86011
(928)523-0886
Sponsor Congressional District: 02
Primary Place of Performance: Northern Arizona University
NAU Box 4099
Flagstaff
AZ  US  86011-0001
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): MXHAS3AKPRN1
Parent UEI:
NSF Program(s): EarthCube
Primary Program Source: 01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7433
Program Element Code(s): 807400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.050

ABSTRACT

Natural climate variability significantly modulates anthropogenic global warming, and only paleoclimate observations can adequately constrain it. Moreover, such observations are most powerful when many records are brought together to provide a spatial understanding of past variability. However, there is currently no universal way to share paleoclimate data between users or machines, hindering integration and synthesis. Large-scale, international, paleoclimate data syntheses have a long and successful history, but have been needlessly labor-intensive. Recognizing that (1) paleoclimate data curation requires expert knowledge; (2) top-down data management approaches are ineffectual; (3) existing infrastructure does not foster standardization; there emerges a critical need for a flexible platform enabling crowdsourced data curation and standards development. The platform will be combined with editorial and community-driven processes, resulting in a system that has the potential to engage a broad user base in geoscientific data curation. The proposed framework will lower barriers to participation in the geosciences, enabling more "dark data" to join the public domain using community-sanctioned protocols. The pilot project will facilitate the work of hundreds of paleoclimate scientists, accelerating scientific discovery and the dissemination of its results to society.

Semantic wikis provide a simple, intuitive interface to semantic languages and infrastructure that build on open Web architecture. Like traditional wikis, they enable the collaborative authoring of content. Secure access and time-stamped content also enable the tracking of changes and the accountability of users, as well as moderation capabilities by community members of recognized expertise. In contrast to traditional wikis, semantic wikis allow contributors to assign meaning to their content, specifying relationships between the objects they describe. This enables artificial intelligence reasoners to parse, process and translate these data into more useful forms. The technology is well-proven, scalable, and completely transparent to the user, requiring no computer science knowledge or more sophisticated technology than a web browser. The LinkedEarth Wiki will automatically translate this information into Linked Open Data, a universal format to share data across the Web. To demonstrate this concept's broad applicability across paleoclimate science, the project's target community is the PAGES2k consortium, an international collaboration dedicated to the climate of the Common Era. Social technologies will be developed to power collective curation, standards development and quality control by the community itself. The project will demonstrate applicability to other paleogeosciences, serving as a potential template for other geoscientific disciplines.
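
As a rough illustration of what "assigning meaning" and translating wiki content into Linked Open Data might look like, the sketch below writes a few statements about a dataset as N-Triples lines. The namespace, property names, and dataset identifier are hypothetical placeholders chosen for this example; they are not taken from the LinkedEarth ontology or wiki.

```python
# Minimal sketch: express statements about a dataset as N-Triples lines.
# EX and all term names below are hypothetical, for illustration only.
EX = "http://example.org/paleo#"


def triple(subject, predicate, obj, literal=False):
    """Format one subject-predicate-object statement as an N-Triples line."""
    if literal:
        # Escape backslashes and double quotes inside string literals.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_text = '"%s"' % escaped
    else:
        obj_text = "<%s>" % obj
    return "<%s> <%s> %s ." % (subject, predicate, obj_text)


def dataset_to_ntriples(dataset_id, archive_type, proxy):
    """Describe a dataset by its archive type (a literal) and proxy (a linked resource)."""
    subject = EX + dataset_id
    return [
        triple(subject, EX + "archiveType", archive_type, literal=True),
        triple(subject, EX + "hasProxy", EX + proxy),
    ]


if __name__ == "__main__":
    for line in dataset_to_ntriples("coral01", "coral", "d18O"):
        print(line)
```

Because the proxy is written as a linked resource rather than a free-text string, two datasets that point to the same resource can be matched by a machine without guessing at spelling variants.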

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Emile-Geay, J., McKay, N., and the PAGES 2k Consortium "A global multiproxy database for temperature reconstructions of the Common Era" Scientific Data, v.4, 2017, doi:10.1038/sdata.2017.88
Gil, Y., Garijo, D., Ratnakar, V., Khider, D., Emile-Geay, J., and McKay, N. P. "A controlled crowdsourcing platform for high-quality ontology development and data annotation" International Semantic Web Conference, 2017
Gregory Hakim, Sylvia Dee, Julien Emile-Geay, N. P. McKay, and Kira Rehfeld "Accelerating Progress in Proxy-Model Synthesis Using Open Standards" PAGES News, v.26, 2018, p.73
Julien Emile-Geay, Deborah Khider, N. P. McKay, Daniel Garijo, Yolanda Gil, and Varun Ratnakar "LinkedEarth: supporting paleoclimate data standards and crowd curation" PAGES News, v.26, 2018, p.62-63
N. P. McKay and Julien Emile-Geay "Linked PaleoData: A resource for open, reproducible, and efficient paleoclimatology" PAGES News, v.26, 2018, p.71
PAGES 2k Consortium (Emile-Geay & McKay 1st and 2nd Authors) "A global multiproxy database for temperature reconstructions of the Common Era" Scientific Data, v.4, 2017, doi:10.1038/sdata.2017.88

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The need for geoscientists to share data with each other, with stakeholders, and with Earth system models has never been greater. Until recently, there have been very limited ways to do this, leaving much geoscientific data in proverbial drawers, or shared online but in unstructured formats that limit their usability. Over the past three years, the LinkedEarth project has made important advances towards removing such boundaries and providing unified cyberinfrastructure solutions to geoscientists. The philosophy behind LinkedEarth is that (1) geoscientists are the most qualified to curate their data or that of their peers; (2) top-down data management approaches often cannot meet scientists' needs; and (3) semantic approaches to data management need to be explored and adapted for the paleogeosciences.

Over the course of this project, we developed a Semantic Wiki that provides a simple, intuitive interface to semantic languages and infrastructure and is built on Open Web architecture. We have recruited more than 150 users, who have contributed and curated more than 700 paleoclimate datasets in total. We also built a semantic ontology (http://linked.earth/ontology/) to support the Linked Earth collaborative platform (wiki). The ontology is split into two parts: the first is a "core" ontology that is primarily a data model for paleoclimate datasets, mapped to the Linked Paleo Data (LiPD) format pioneered by Co-investigators McKay and Emile-Geay. This ontology is "stiff", i.e., it can only be updated via a specific editorial process that requires consensus of a board of community representatives. The second component of the ontology is a "crowd" ontology that can be edited dynamically through the wiki as users update terms and the connections between them, and is thus very fluid.
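
The two-part ontology described above can be pictured with a small sketch. It is only a toy model: the class and method names are hypothetical, promoting a crowd term into the core is an assumed workflow, and board "consensus" is modeled here as unanimous approval, which the report does not specify.

```python
# Toy model of a two-tier ontology: a "stiff" core and a fluid "crowd" layer.
# All names are hypothetical; unanimity as the consensus rule is an assumption.
class TwoTierOntology:
    def __init__(self, core_terms, board_size):
        self.core = set(core_terms)   # changes only through board approval
        self.crowd = set()            # editable by any contributor
        self.board_size = board_size

    def add_crowd_term(self, term):
        """Add a term to the crowd layer immediately; refuse core duplicates."""
        if term in self.core:
            return False
        self.crowd.add(term)
        return True

    def promote_to_core(self, term, approvals):
        """Move a crowd term into the core only if every board member approves."""
        if term not in self.crowd:
            raise KeyError(term)
        if len(set(approvals)) < self.board_size:
            return False              # no consensus: the core stays unchanged
        self.crowd.remove(term)
        self.core.add(term)
        return True


if __name__ == "__main__":
    onto = TwoTierOntology({"ProxyArchive"}, board_size=3)
    onto.add_crowd_term("Coral")
    print(onto.promote_to_core("Coral", {"alice", "bob"}))
    print(onto.promote_to_core("Coral", {"alice", "bob", "carol"}))
```

Keeping the two layers separate lets contributors experiment with new terms without destabilizing the data model that tools depend on.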

A major aspect of the project was developing connections between cyberinfrastructure in the paleogeosciences, and building a community of scientists educated and interested in community data curation. In June 2016 we hosted the first ever workshop on Paleoclimate Data Standards (http://linked.earth/event/paleoclimate-ontology-workshop/) at the National Climatic Data Center. The 2-day workshop gathered around 40 participants to establish preliminary standards for paleoclimate data. The workshop was primarily supported by this grant, with auxiliary support from PAGES to ensure international participation. It was hosted by the US National Centers for Environmental Information (NOAA-NCEI) World Data Service for Paleoclimatology (WDS-Paleo) in Boulder, CO - a key partner for this project as they will provide long-term sustainability to the technologies incubated herein. The workshop sparked an ongoing conversation on data standards, helped cement a collaboration with NCDC, nucleated working groups on archive-specific data standards, and helped coordinate with synergistic projects (NSF GeoLink, Continental Science Drilling, NEOTOMA).

An array of international scientific projects are now underway that leverage the advances and efforts of LinkedEarth, and that will build on future development of LinkedEarth concepts and platforms. This project also trained two postdoctoral scholars, both of whom belong to under-represented minority groups, and fostered immersive research experiences for undergraduates.

Last Modified: 12/21/2018
Modified by: Nicholas P Mckay

Please report errors in award information by writing to: awardsearch@nsf.gov.
