
NSF Org: | OAC Office of Advanced Cyberinfrastructure (OAC) |
Initial Amendment Date: | August 14, 2014 |
Latest Amendment Date: | July 29, 2019 |
Award Number: | 1442997 |
Award Instrument: | Standard Grant |
Program Manager: | Amy Walton, awalton@nsf.gov, (703)292-4538, OAC Office of Advanced Cyberinfrastructure (OAC), CSE Directorate for Computer and Information Science and Engineering |
Start Date: | November 1, 2014 |
End Date: | October 31, 2019 (Estimated) |
Total Intended Award Amount: | $1,424,765.00 |
Total Awarded Amount to Date: | $1,424,765.00 |
Funds Obligated to Date: | FY 2019 = $0.00 |
Recipient Sponsored Research Office: | 77 MASSACHUSETTS AVE CAMBRIDGE MA US 02139-4301 (617)253-1000 |
Primary Place of Performance: | 77 Massachusetts Ave Rm NE18-901 Cambridge MA US 02139-4307 |
NSF Program(s): | AERONOMY, Data Cyberinfrastructure, EarthCube |
Primary Program Source: | 01001920DB NSF RESEARCH & RELATED ACTIVIT |
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Next-generation Geoscience needs to handle rapidly growing data volumes from ground-based and space-based sensor networks. As real-world phenomena are mapped to data, the scientific discovery process essentially becomes a search process across multidimensional data sets. The extraction of meaningful discoveries from this sea of data therefore requires highly efficient and scalable machine assistance to enhance human contextual understanding. This is necessary both for testing new hypotheses and for detecting novel events and monitoring natural hazards.
This project develops a computer-aided discovery approach that provides scientists with better support to answer questions such as: What inferences can be drawn from an identified feature? What does a finding mean and how does it fit into the big theoretical picture? Does it contradict or confirm previously established models and findings? How can concepts and ideas be tested effectively? To achieve this, scientists can programmatically express hypothesized Geoscience scenarios, constraints, and model variations. This approach helps delegate the automatic exploration of the combinatorial search space of possible explanations in parallel on a variety of data sets. Furthermore, programmable crawlers can scale the search and discovery of interesting phenomena on cloud-based infrastructures. The computer-aided discovery prototype is evaluated in case studies from Geospace science, including the exploration of structures in space and time using combined GPS, optics, and Geospace radar data.
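As a rough illustration of what "programmatically expressing model variations" and exploring their combinations in parallel could look like, the following minimal Python sketch enumerates every combination of a few candidate parameters and ranks them by misfit against a signal. It is not the project's software or API: the synthetic signal, the trend-plus-step model, and the names `make_observations`, `model`, `misfit`, and `explore` are all assumptions made for this example, and a thread pool merely stands in for the cloud-based infrastructure mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def make_observations():
    # Hypothetical synthetic signal: a linear trend with a step at t = 6.
    return [0.5 * t + (3.0 if t >= 6 else 0.0) for t in range(10)]


def model(t, slope, step_time, step_size):
    # Candidate explanation: linear trend plus an optional step change.
    return slope * t + (step_size if t >= step_time else 0.0)


def misfit(params, observations):
    # Sum of squared differences between one model variation and the data.
    slope, step_time, step_size = params
    return sum(
        (model(t, slope, step_time, step_size) - y) ** 2
        for t, y in enumerate(observations)
    )


def explore(observations, slopes, step_times, step_sizes, max_workers=4):
    # Enumerate the full combinatorial space of model variations.
    candidates = list(product(slopes, step_times, step_sizes))
    # Score the candidates concurrently; a real workload would more likely
    # use processes or distributed workers instead of threads.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(lambda p: misfit(p, observations), candidates))
    # Rank from best (lowest misfit) to worst.
    return sorted(zip(scores, candidates))


ranked = explore(
    make_observations(),
    slopes=[0.0, 0.5, 1.0],
    step_times=[4, 6, 8],
    step_sizes=[0.0, 3.0],
)
best_score, best_params = ranked[0]
print(best_params, best_score)
```

In this toy setup the ranked list plays the role of the machine-generated candidates that a scientist would then inspect and interpret; the search itself is exhaustive and only practical for small parameter grids.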
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Scientific discovery traditionally involves processes that require a significant number of manual steps and focused human attention. However, increasing data volumes are challenging this traditional approach. Examples are found in geoscience, space science, planetary science, astronomy, and other disciplines that are undergoing a Big Data transformation.
This research therefore explored how intelligent systems for computer-aided discovery can routinely complement human scientists and integrate with them in the insight-generation loop, in ways that scale for next-generation science. Data infrastructure building blocks and related software prototypes were developed to facilitate access to scientific data sets, the fusion of various data, and the search for new discoveries. Cloud computing environments were tested as new platforms for data provisioning and for scaling discovery workflows.
The project released open source packages on GitHub, such as scikit-dataaccess and scikit-discovery, that can be used by the general public under the MIT license. The science-casestudies repository also includes open source releases of Python Jupyter notebooks that demonstrate how to access and use scientific data sets; these are intended for scientists, students, educators, and the general public.
Successful applications of computer-aided discovery were demonstrated in several areas, including ionospheric studies, volcanics, seismology, astronomy, and planetary science. Results of this work have also been featured in the press multiple times.
The fertile environment of this project has facilitated a broad collaboration and joint publications spanning different academic institutions and industry. It enabled the exposure and/or participation of undergraduate and graduate students, postdoctoral fellows, and scientific staff.
Last Modified: 07/29/2019
Modified by: Victor Pankratius