
NSF Org: | RISE Integrative and Collaborative Education and Research (ICER) |
Recipient: |
|
Initial Amendment Date: | August 13, 2014 |
Latest Amendment Date: | August 13, 2014 |
Award Number: | 1440315 |
Award Instrument: | Standard Grant |
Program Manager: | Eva Zanzerkia, RISE Integrative and Collaborative Education and Research (ICER), GEO Directorate for Geosciences |
Start Date: | September 1, 2014 |
End Date: | August 31, 2017 (Estimated) |
Total Intended Award Amount: | $647,149.00 |
Total Awarded Amount to Date: | $647,149.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
506 S WRIGHT ST URBANA IL US 61801-3620 (217)333-2187 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
Suite A, 1901 South First Street Champaign IL US 61820-7406 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | EarthCube |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.050 |
ABSTRACT
The project offers a unique and transformative approach to integrate existing and emerging long-tail model and data resources. Many challenges hinder the seamless integration of models with data and compel scientists to perform the integration manually. The primary challenges stem from the knowledge latency between model and data resources; others stem from inadequate adoption and exploitation of information technologies. Knowledge latency challenges increase exponentially when a user aims to integrate long-tail data (data collected by individual researchers or small research groups) and long-tail models (models developed by individuals or small modeling communities). The goal of this research is to develop a framework rooted in semantic techniques and approaches to support "long-tail" model and data integration. The vision is to develop a decentralized knowledge-based platform that can be easily adopted across geoscience communities composed of individual researchers and small research groups.
The project will develop a knowledge framework to close the loop from models' queries back to data sources by first investigating the concept architecture required to integrate two leading examples of long-tail resources in geoscience: the Community Surface Dynamics Modeling System (CSDMS) and Sustainable Environment Actionable Data (SEAD). The project will also develop a context-based data model that provides an explicit interpretation of a metadata attribute. The researchers will capture the metadata concepts and semantics from various geo-informatics systems and provide tools for ensuring conceptual integration between the resources. Next, the project will develop a knowledge discovery tool that allows automated coupling of a model and data coming from different contributors. Finally, the project will provide a prototype physical implementation of the knowledge framework in the CSDMS modeling framework to demonstrate how it can advance the seamless discovery, selection, and integration of models and data, and how to achieve dynamic reusability of resources across multiple Earth Science long-tail resources.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Many societally relevant scientific challenges are becoming increasingly interdisciplinary, often requiring the engagement of a broad range of disciplines. Two of the most important challenges are our ability to couple models that have evolved in sophistication to address disciplinary questions and are now poised to support new interdisciplinary investigations, and our ability to integrate existing and emerging heterogeneous data with such coupled models to address novel scientific questions. From such a data-model integration perspective, two core issues determine the complexity of scientific investigations and the limits of their success. First, integration across models is daunting because they often use different variable names and units for the same concept, run at different time steps, follow different naming and reference conventions, use different computational grids, etc. Second, integrating models with interdisciplinary data systems is also challenging because these systems often encompass heterogeneous collections with many dimensions, coordinate systems, scales, variables, providers, users, and scientific contexts. Our project has developed a framework for the seamless integration and interoperability of data and models. We have developed technologies that enable us to overcome the barriers to model-to-model coupling and to model and data integration. We use a “micro-service” approach for semantic annotation to enrich metadata using controlled vocabularies, so that data associated with the appropriate variables can be discovered for use by models within the spatial and temporal context of a model run.
Micro-service architecture is an emerging approach that allows an application to be built as a suite of small (or elementary) web services that communicate through lightweight mechanisms. Further, we have developed Resource Alignment Services (RAS) so that data can be structured appropriately for direct ingestion by models. To couple models, we use a model-as-a-service paradigm and execute the models using the data available after resource alignment is completed. These developments have established the feasibility of seamless integration between models, and between models and data.
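To make the alignment idea concrete, the following is a minimal sketch of what a resource alignment step might do before a model ingests a data record: map a provider-specific variable name to a shared name, convert units, and aggregate to the model time step. It is not the project's RAS implementation. The vocabulary entries, the unit table, the record layout, and the function name `align` are all assumptions made for illustration.

```python
# Hypothetical sketch of a resource-alignment step; every name below is illustrative.

# Assumed controlled vocabulary: provider-specific variable name -> shared name.
VOCABULARY = {
    "precip": "precipitation_rate",
    "rainfall_mm_hr": "precipitation_rate",
}

# Multiplicative factors into the unit the (hypothetical) model expects, mm/hr.
TO_MM_PER_HOUR = {
    "mm/hr": 1.0,
    "m/s": 1000.0 * 3600.0,
}


def align(record, model_variable, model_step_hours):
    """Return values the model can ingest, or raise if the record does not fit."""
    # 1. Semantic check: does the data variable map to the model's variable?
    standard_name = VOCABULARY.get(record["variable"])
    if standard_name != model_variable:
        raise ValueError(f"{record['variable']!r} does not map to {model_variable!r}")

    # 2. Unit conversion into the model's expected unit.
    factor = TO_MM_PER_HOUR[record["units"]]
    values = [v * factor for v in record["values"]]

    # 3. Temporal alignment: average whole blocks of samples per model step.
    ratio = model_step_hours / record["step_hours"]
    if ratio < 1 or ratio != int(ratio):
        raise ValueError("model step must be a whole multiple of the data step")
    block = int(ratio)
    return [
        sum(values[i:i + block]) / block
        for i in range(0, len(values) - block + 1, block)
    ]


record = {
    "variable": "precip",
    "units": "mm/hr",
    "step_hours": 1,
    "values": [1.0, 3.0, 2.0, 4.0],
}
aligned = align(record, "precipitation_rate", model_step_hours=2)
```

In this sketch, a record whose variable is not in the vocabulary is rejected rather than guessed at, which mirrors the role of controlled vocabularies in deciding whether two resources can be integrated at all.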
We have developed a decentralized framework that combines Linked Data and RESTful web services to annotate, connect, integrate, and reason about the integration of geoscience resources. The framework allows the semantic enrichment of web resources and semantic mediation among heterogeneous geoscience resources, such as models and data (Figure 1). It uses a micro-service architecture to close the semantic loop among data, models, and Controlled Vocabularies (CV). First, the framework has demonstrated its ability to advance semantic interoperability among distributed data and models through the development of three sets of micro-services: (i) Knowledge Integration Services (KIS), which ingest, register, and check in Controlled Vocabularies and W3C standards to the framework’s Knowledge-base; (ii) Semantic Annotation Services (SAS), which annotate resources with their spatiotemporal context, variable, and provenance relationships, either by running automatic extractors based on the data file's MIME type (e.g., GeoTIFF and CSV types) or by providing an interactive interface for manual annotation; and (iii) the Resource Alignment Service (RAS), which is a scientific workflow that aligns the attributes associated with two geo-resources to ensure their semantic consistency before integration. SAS is currently available in the Clowder data management system (https://clowder.ncsa.illinois.edu/), which is a scalable data repository for sharing, organizing, and analyzing long-tail data, i.e., data collected by small scientific communities and individual researchers.
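The choice between automatic extraction and manual annotation described for SAS can be sketched as a dispatch on MIME type. The code below is an assumption-laden illustration, not the SAS or Clowder API: the registry `EXTRACTORS`, the decorator `extractor`, and the function `annotate` are hypothetical names, and only a CSV extractor is shown because reading GeoTIFF metadata would require a raster library.

```python
import csv
import io

# Hypothetical registry: MIME type -> extractor function.
EXTRACTORS = {}


def extractor(mime_type):
    """Register a function as the automatic extractor for one MIME type."""
    def register(func):
        EXTRACTORS[mime_type] = func
        return func
    return register


@extractor("text/csv")
def annotate_csv(payload):
    """Treat the CSV header row as the list of candidate variable names."""
    reader = csv.reader(io.StringIO(payload.decode("utf-8")))
    header = next(reader, [])
    return {"variables": [name.strip() for name in header]}


def annotate(payload, mime_type):
    """Run an automatic extractor if one exists; otherwise defer to manual annotation."""
    func = EXTRACTORS.get(mime_type)
    if func is None:
        return {"status": "manual", "annotations": {}}
    return {"status": "automatic", "annotations": func(payload)}


result = annotate(b"time, precip, temperature\n0,1.2,15.0\n", "text/csv")
fallback = annotate(b"\x00\x01", "application/octet-stream")
```

Here `result` carries the extracted variable names, while `fallback` is flagged for manual annotation because no extractor is registered for its type. A real service would go on to match the extracted names against a controlled vocabulary.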
Second, the GeoSemantics project provides an information system that represents each Standard Name as a single unique web entity, describes its attributes, and annotates the semantic relationships among Standard Names. This information system supports semantic mediation across domain-specific vocabularies. Each Standard Name is indexed by an HTTP URL (an essential rule of Linked Data) and has its own HTML page that displays its attributes and facilitates visual navigation among related Standard Names. The Simple Knowledge Organization System (SKOS) ontology is used to describe the semantic relationships among Standard Names across their namespaces. A RESTful API (http://ecgs.ncsa.illinois.edu/skosmos) is added on top of the information system to allow programmatic knowledge discovery and dereferencing of the Standard Names. This information system allows navigation among domain-specific vocabularies, advances the programmatic search capabilities for Standard Names, and provides a structured concept display. In addition, it provides the technology required for building a multilingual user interface.
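As a rough illustration of semantic mediation with SKOS, the sketch below stores a few subject-predicate-object statements as plain tuples and follows SKOS mapping links between two vocabularies. The SKOS property URIs are the standard ones; the `http://example.org/` base URL, the helper names, and the specific mapping between the two names are assumptions for illustration only and are not taken from the project's knowledge base.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/standard-names/"  # hypothetical base URL

CSDMS_NAME = BASE + "csdms/atmosphere_water__precipitation_leq-volume_flux"
CF_NAME = BASE + "cf/lwe_precipitation_rate"

# Each Standard Name is one URL; relationships are (subject, predicate, object).
TRIPLES = [
    (CSDMS_NAME, SKOS + "prefLabel", "atmosphere_water__precipitation_leq-volume_flux"),
    (CF_NAME, SKOS + "prefLabel", "lwe_precipitation_rate"),
    # Illustrative mapping only; not an asserted equivalence.
    (CSDMS_NAME, SKOS + "closeMatch", CF_NAME),
]

# SKOS defines exactMatch and closeMatch as symmetric mapping properties.
MAPPING_PROPERTIES = {SKOS + "exactMatch", SKOS + "closeMatch"}


def related_names(uri):
    """Return URLs linked to `uri` by a SKOS mapping, in either direction."""
    found = set()
    for subject, predicate, obj in TRIPLES:
        if predicate in MAPPING_PROPERTIES:
            if subject == uri:
                found.add(obj)
            elif obj == uri:
                found.add(subject)
    return sorted(found)


def label(uri):
    """Return the preferred label of a Standard Name, or None if absent."""
    for subject, predicate, obj in TRIPLES:
        if subject == uri and predicate == SKOS + "prefLabel":
            return obj
    return None


matches = [label(u) for u in related_names(CF_NAME)]
```

Starting from the name in one vocabulary, `matches` lists the labels of the mapped names in the other. A production system would typically hold such statements in an RDF store and query them through an API rather than scanning a list.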
Finally, the GeoSemantics framework supports the integration of models by wrapping a service architecture around them, which increases their interoperability and reusability. This is particularly attractive for coupling long-tail geoscience models, i.e., models that are created and managed by individuals and small research groups. To overcome the heterogeneity in defining, developing, coupling, and maintaining long-tail geoscience models, the GeoSemantics project has built on several CSDMS (Community Surface Dynamics Modeling System) technologies, including the Basic Model Interface, the CSDMS Standard Names, and EMELI (Experimental Modeling Environment for Linking and Interoperability). The project has developed a cloud-based environment for storing and coupling web-based models.
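The value of a common model interface for coupling can be sketched with two toy models that expose the same small set of control and data-exchange methods. The method names echo those of the Basic Model Interface (initialize, update, get_value, set_value, finalize), but the signatures are simplified assumptions: the actual interface initializes from a configuration file and exchanges arrays. Both model classes and the variable names are hypothetical.

```python
class ConstantRain:
    """Toy provider model with a simplified, BMI-like subset of methods."""

    def initialize(self, rate_mm_per_hr):
        self._rate = rate_mm_per_hr
        self._time_hr = 0.0

    def update(self):
        self._time_hr += 1.0  # advance one hourly step

    def get_value(self, name):
        if name != "precipitation_rate":
            raise KeyError(name)
        return self._rate

    def finalize(self):
        pass


class Bucket:
    """Toy consumer model: stores incoming rain up to a fixed capacity."""

    def initialize(self, capacity_mm):
        self._capacity = capacity_mm
        self._storage = 0.0
        self._rain = 0.0

    def set_value(self, name, value):
        if name != "precipitation_rate":
            raise KeyError(name)
        self._rain = value

    def update(self):
        # One hourly step: add rain (mm/hr * 1 hr), capped at capacity.
        self._storage = min(self._capacity, self._storage + self._rain * 1.0)

    def get_value(self, name):
        if name != "storage":
            raise KeyError(name)
        return self._storage

    def finalize(self):
        pass


rain, bucket = ConstantRain(), Bucket()
rain.initialize(rate_mm_per_hr=2.0)
bucket.initialize(capacity_mm=5.0)

# The driver needs only the shared interface, not either model's internals.
for _ in range(3):
    rain.update()
    bucket.set_value("precipitation_rate", rain.get_value("precipitation_rate"))
    bucket.update()

storage = bucket.get_value("storage")
rain.finalize()
bucket.finalize()
```

Because the driver loop touches the models only through shared method names and a shared variable name, either model could be replaced, or placed behind a web service, without changing the loop.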
The resources from the project are available at: http://hcgs.ncsa.illinois.edu
Last Modified: 11/30/2017
Modified by: Praveen Kumar