
NSF Org: IIS Division of Information & Intelligent Systems
Recipient:
Initial Amendment Date: August 23, 2022
Latest Amendment Date: November 15, 2023
Award Number: 2212046
Award Instrument: Standard Grant
Program Manager: Jie Yang, jyang@nsf.gov, (703) 292-4768, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2022
End Date: August 31, 2026 (Estimated)
Total Intended Award Amount: $1,129,040.00
Total Awarded Amount to Date: $1,129,040.00
Funds Obligated to Date:
History of Investigator:
Recipient Sponsored Research Office: W5510 FRANKS MELVILLE MEMORIAL LIBRARY, STONY BROOK, NY, US 11794-0001, (631) 632-9949
Sponsor Congressional District:
Primary Place of Performance: WEST 5510 FRK MEL LIB, Stony Brook, NY, US 11794-0001
Primary Place of Performance Congressional District:
Unique Entity Identifier (UEI):
Parent UEI:
NSF Program(s): Robust Intelligence
Primary Program Source:
Program Reference Code(s):
Program Element Code(s):
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Artificial intelligence and machine learning have in recent years been applied to the analysis of very large scale (VLS) images, such as those encountered in aerial or satellite imagery and digital histopathology, so that domain scientists can explore the data and form novel hypotheses. Current state-of-the-art deep learning techniques require vast amounts of detailed annotations (a.k.a. labels) as training data, which can be proportional to the size of the input images. Thus, it is either impossible or very expensive to acquire enough high-resolution training data. In this project, the research team will develop a methodology that uses weaker (or auxiliary) signals collected in much smaller, low-resolution images to efficiently constrain the spatial (or temporal) statistical distribution of the labels in the high-resolution image. The framework significantly reduces the human effort needed for the mundane task of annotating VLS images, which is crucial for several exciting applications that predict environmental trends and cancer treatment outcomes. The developed techniques are general, and their application will be demonstrated in two different domains involving very large images: satellite imagery and digital histopathology. In environmental applications, the ability to directly connect satellite imagery to policy-relevant metrics of interest (e.g., population trends, urbanization, biodiversity loss, etc.) would radically improve our capacity to monitor the globe. Similarly, being able to reliably extract high-resolution information from whole-slide images of histopathology will be highly useful for cancer research focused on the development of novel diagnostic tests and numerous precision medicine applications (e.g., patient stratification, treatment selection, and prediction of disease progression, recurrence, treatment response, and disease-free survival through downstream correlations with clinical, radiologic, laboratory, molecular, pharmacologic, and outcomes data).
The technical aims of the project are: i) the research team addresses the problem of super-resolving dense annotations by matching label statistics across resolutions; the general methodology uses differentiable loss functions that map auxiliary constraints to high-resolution labels, and each Label Super-Resolution loss is a differentiable distance metric between a distribution and a set of statistical values; ii) the research team generalizes the concept of super-resolution to topological information (through persistent homology) and uses multi-task learning to produce latent representations that can serve as the basis of various inference tasks; iii) in the developed framework, the research team models missing auxiliary data, heterogeneous auxiliary data, and dynamic image sets of the same area; the losses can be easily integrated into RNN/transformer architectures and adversarial learning paradigms; iv) the research team evaluates two modalities of incremental human engagement: 1) showing the annotator the effects of their annotation choices to help develop intuition for high-return areas, and 2) a reinforcement-learning-based active learning framework that imitates how domain experts select what kinds of data to label; and v) the research team develops and evaluates these ideas through a number of well-grounded applications of Label Super-Resolution.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH