
NSF Org: | IIS Division of Information & Intelligent Systems |
Recipient: |
|
Initial Amendment Date: | July 29, 2019 |
Latest Amendment Date: | February 20, 2020 |
Award Number: | 1850546 |
Award Instrument: | Standard Grant |
Program Manager: | Wei-Shinn Ku, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | August 1, 2019 |
End Date: | October 31, 2021 (Estimated) |
Total Intended Award Amount: | $175,000.00 |
Total Awarded Amount to Date: | $191,000.00 |
Funds Obligated to Date: | FY 2020 = $16,000.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: | 801 UNIVERSITY BLVD TUSCALOOSA AL US 35401 (205)348-5152 |
Sponsor Congressional District: |
|
Primary Place of Performance: | 801 University Blvd Tuscaloosa AL US 35087-0005 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Info Integration & Informatics |
Primary Program Source: | 01002021DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
The goal of this project is to investigate novel computational techniques for data science methods guided by disciplinary knowledge in geoscience applications. The field of data science has achieved tremendous success over the last decade, not only in business but also in science and engineering. The data-driven approach has been recognized as the "fourth paradigm" of scientific discovery (after experiment, theory, and computational simulation). However, when applied to interdisciplinary problems, a purely data-driven approach often lacks interpretability and consistency with existing theories and knowledge in the discipline, as shown by the well-known Google Flu Trends example. The proposed project aims to fill this gap by using disciplinary knowledge to guide data-driven models, enhancing interpretability, consistency, and prediction accuracy. Specifically, the team will study the problem in the context of spatial structured models for geoscience applications. The team will investigate the use of disciplinary knowledge in constructing novel spatial dependency structures and explore efficient algorithms for model learning and inference. Proposed approaches will be validated with interdisciplinary applications in hydrology. The project, if successful, will contribute towards next-generation water resource management for the U.S. in the 21st century. The proposed research can not only improve situational awareness for disaster response agencies but also enhance the flood forecasting capabilities of the National Water Model. Proposed algorithms will be implemented in open-source tools that will enhance the research infrastructure for geoscience communities.
Educational activities include curriculum development, mentoring a broad group of high school students in data science seminars at Alabama Computer Science Camps, and a year-long project for a selected number of high school students for regional Science Fair competitions.
The project is expected to result in the following computer science innovations. First, a novel spatial structured model called the hidden Markov topography tree (HMTT) will be investigated, which generalizes existing hidden Markov models from totally ordered sequences to partially ordered poly-trees. Compared with existing spatial structured models (e.g., Markov random fields, spatial autoregressive models) that capture dependency based on spatial proximity, HMTT can potentially reduce the impact of noise and large obstacles in sample features via more complex structural constraints from disciplinary knowledge in hydrology (e.g., flow directions). Second, efficient computational algorithms to construct a topography tree from a large number of locations will be explored. Finally, the team will leverage the poly-tree structure in the hidden class layer and explore computational pruning to reduce the amount of backtracking in existing dynamic programming methods for class inference. The idea of integrating disciplinary knowledge (e.g., structural constraints) with data-driven methods can potentially transform data science research by enhancing model interpretability and consistency.
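To make the idea of a flow-direction dependency structure concrete, the following is a minimal illustrative sketch in Python. It is not the project's HMTT model or algorithm. It assumes a simplified setting in which each location depends on its lowest strictly lower neighbor (which yields a directed tree, a special case of a poly-tree), two hidden classes (dry and flood), and a hypothetical transition rule that makes "upstream flooded, downstream dry" nearly impossible. The function names, probabilities, and toy elevations are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def build_flow_tree(elevation, neighbors):
    """Link each location to its lowest strictly lower neighbor (None at a local minimum)."""
    parent = {}
    for node, elev in elevation.items():
        lower = [n for n in neighbors[node] if elevation[n] < elev]
        parent[node] = min(lower, key=lambda n: elevation[n]) if lower else None
    return parent

def map_classes(parent, elevation, log_lik, log_trans, log_prior):
    """Max-product dynamic programming on the tree (classes: 0 = dry, 1 = flood).

    log_trans[c_down][c_up] is the assumed log P(upstream class | downstream class).
    """
    children = defaultdict(list)
    for node, p in parent.items():
        if p is not None:
            children[p].append(node)
    # Upstream nodes are strictly higher, so a descending sort visits them first.
    order = sorted(parent, key=lambda n: elevation[n], reverse=True)
    score, choice = {}, {}
    for node in order:
        score[node] = [0.0, 0.0]
        for c in (0, 1):
            total = log_lik[node][c]
            for ch in children[node]:
                options = [log_trans[c][cc] + score[ch][cc] for cc in (0, 1)]
                best = max((0, 1), key=lambda cc: options[cc])
                choice[(ch, c)] = best  # best class for ch given its downstream class c
                total += options[best]
            score[node][c] = total
    labels = {}
    for node in reversed(order):  # downstream nodes are labeled before upstream ones
        p = parent[node]
        if p is None:
            labels[node] = max((0, 1), key=lambda c: log_prior[c] + score[node][c])
        else:
            labels[node] = choice[(node, labels[p])]
    return labels

# Toy hillslope: "a" is the lowest point; "b" has a noisy observation that favors dry.
elevation = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0, "e": 3.5}
neighbors = {"a": ["b"], "b": ["a", "c", "e"], "c": ["b", "d"], "d": ["c"], "e": ["b"]}
lik = {"a": (0.1, 0.9), "b": (0.6, 0.4), "c": (0.05, 0.95), "d": (0.9, 0.1), "e": (0.9, 0.1)}
log_lik = {n: [math.log(p) for p in ps] for n, ps in lik.items()}
eps = 1e-6  # assumed near-zero chance that upstream floods while downstream stays dry
log_trans = [[math.log(1 - eps), math.log(eps)], [math.log(0.5), math.log(0.5)]]
log_prior = [math.log(0.5), math.log(0.5)]

parent = build_flow_tree(elevation, neighbors)
labels = map_classes(parent, elevation, log_lik, log_trans, log_prior)
print(parent)
print(labels)
```

In this toy example, labeling each location independently would mark "b" as dry because its own observation slightly favors that class. The structural constraint changes the outcome: "c", which lies upstream of "b", is very likely flooded, so the joint inference also assigns flood to "b".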
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.