
NSF Org: |
DBI Division of Biological Infrastructure |
Recipient: |
|
Initial Amendment Date: | August 30, 2017 |
Latest Amendment Date: | January 26, 2018 |
Award Number: | 1661529 |
Award Instrument: | Standard Grant |
Program Manager: |
Peter McCartney
DBI Division of Biological Infrastructure BIO Directorate for Biological Sciences |
Start Date: | September 1, 2017 |
End Date: | November 30, 2020 (Estimated) |
Total Intended Award Amount: | $336,493.00 |
Total Awarded Amount to Date: | $336,493.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
414 E CLARK ST VERMILLION SD US 57069-2307 (605)677-5370 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
SD US 57069-2307 |
Primary Place of
Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | ADVANCES IN BIO INFORMATICS |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.074 |
ABSTRACT
The millions of species that inhabit the planet all have distinct biological traits that enable them to successfully compete in or adapt to their ecological niches. Determining accurately how these traits evolved is thus fundamental to understanding earth's biodiversity, and to predicting how it might change in the future in response to changes in ecosystems. Although sophisticated analytical methods and tools exist for analyzing traits comparatively, applying their full power to the myriad of trait observations recorded in the form of natural language descriptions has been hindered by the difficulty of allowing these tools to understand even the most basic facts implied by an unstructured free-text statement made by a human observer. The technological arsenal needed to overcome this challenge is now in principle available, thanks to a number of recent breakthroughs in the areas of knowledge representation and machine reasoning, but these technologies are challenging enough to deploy, orchestrate, and use that the barriers to effectively exploit them remains far too high for most tools. This project will create infrastructure that will dramatically reduce this barrier, with the goal of providing comparative trait analysis tools easy access to algorithms powered by machines reasoning with and making inferences from the meaning of trait descriptions. Similar to how Google, IBM Watson, and others have enabled developers of smartphone apps to incorporate, with only a few lines of code, complex machine-learning and artificial intelligence capabilities such as sentiment analysis, this project will demonstrate how easy access to knowledge computing opens up new opportunities for analysis, tools, and research. It will do this by addressing three long-standing limitations in comparative studies of trait evolution: recombining trait data, modeling trait evolution, and generating testable hypotheses for the drivers of trait adaptation.
The treasure trove of morphological data published in the literature holds one of the keys to understanding the biodiversity of phenotypes, but exploiting the data in full through modern computational data science analytics remains severely hampered by the steep barriers to connecting the data with the accumulated body of morphological knowledge in a form that machines can readily act on. This project aims to address this barrier by creating a centralized computational infrastructure that affords comparative analysis tools the ability to compute with morphological knowledge through scalable online application programming interfaces (APIs), enabling developers of comparative analysis tools, and therefore their users, to tap into machine reasoning-powered capabilities and data with machine-actionable semantics. By shifting all the heavy-lifting to this infrastructure, tools can programmatically obtain answers to knowledge-based questions that would otherwise require careful study by a human export, such as objectively and reproducibly assessing the relatedness, independence, and distinctness of characters and character states, with only a few lines of code. To accomplish this, the project will adapt key products and know-how developed by the Phenoscape project, including an integrative knowledgebase of ontology-linked phenotype data, metrics for quantifying the semantic similarity of phenotype descriptions, and algorithms for synthesizing morphological data from published trait descriptions. To drive development of the computational infrastructure and to demonstrate its enabling value, the project's objectives focus on addressing three concrete long-standing needs for which the difficulty of computing with domain knowledge is the major impediment: (1) computationally synthesizing, calibrating, and assessing morphological trait matrices from across studies; (2) objectively and reproducibly incorporating morphological domain knowledge provided by ontologies into evolutionary models of trait evolution; and (3) generating testable hypotheses for adaptive diversification by incorporating semantic phenotypes into ancestral state reconstruction and identifying domain ontology concepts linked to evolutionary changes in a branch or clade more frequently than expected by chance. In addition, to better prepare evolutionary biologist users and developers of comparative analysis tools for adopting these new capabilities, a domain-tailored short-course on requisite knowledge representation and computational inference technologies will be developed and taught. More information on this project can be found at http://cate.phenoscape.org/.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
Note:
When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external
site maintained by the publisher. Some full text articles may not yet be available without a
charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from
this site.
Please report errors in award information by writing to: awardsearch@nsf.gov.