
NSF Org: | DMS Division Of Mathematical Sciences |
Recipient: |
|
Initial Amendment Date: | July 18, 2023 |
Latest Amendment Date: | July 18, 2023 |
Award Number: | 2310955 |
Award Instrument: | Standard Grant |
Program Manager: | Yong Zeng, yzeng@nsf.gov, (703)292-7299, DMS Division Of Mathematical Sciences, MPS Directorate for Mathematical and Physical Sciences |
Start Date: | August 1, 2023 |
End Date: | July 31, 2026 (Estimated) |
Total Intended Award Amount: | $200,000.00 |
Total Awarded Amount to Date: | $200,000.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: | 575 LEXINGTON AVE FL 9, NEW YORK, NY, US 10022-6145, (646)962-8290 |
Sponsor Congressional District: |
|
Primary Place of Performance: | 1300 YORK AVE, NEW YORK, NY, US 10065-4805 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | STATISTICS, MATHEMATICAL BIOLOGY |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.049 |
ABSTRACT
This project aims to develop a new methodology for selecting, from a large pool of candidate variables, the key features that are predictive of final outcomes. When applied to the biomedical field, these methods will enable the discovery of determinants of patient health, thus improving the prevention, treatment, and management of diseases. When used in fields such as engineering, psychology, sociology, economics, and environmental sciences, these methods can improve manufacturing processes, social programs that focus on diversity and equity, the care and management of mental health, and the preservation of the environment and natural resources. The new methods will also help to generate high-quality synthetic data while maintaining the confidentiality of the original information, thereby spurring new scientific discoveries and providing a valuable educational tool. The project will offer a number of unique interdisciplinary training initiatives for future cohorts of data scientists at the interface of statistics, machine learning, and biomedical sciences.
The research agenda is based on the 'knockoff method' for identifying key features predictive of the outcomes while controlling the false discovery rate. The methods incorporate the microbiome phylogenetic structure in feature selection, accommodate missing values, incorporate multiple knockoffs to increase robustness, employ nonparametric Bayesian models for complex data structures, and introduce a new knockoff statistic based on a conditional prediction function. The proposed statistics can be paired with state-of-the-art machine learning models to detect nonlinear relationships while accounting for feature correlation. Furthermore, by applying knockoff filtering with unsupervised learning models, this research can identify determinants of the feature space and provide insights into unsupervised clustering and learning.
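For readers unfamiliar with knockoff filtering, the selection step can be sketched as follows. This is a minimal illustration of the standard knockoff+ threshold rule from the general knockoff literature, not the new statistics this project proposes. It assumes that a knockoff statistic W_j has already been computed for each feature, with large positive values indicating that the original feature looks more important than its knockoff copy. The function names and the example values are hypothetical.

```python
from typing import List, Sequence


def knockoff_plus_threshold(w: Sequence[float], q: float) -> float:
    """Return the knockoff+ threshold for statistics w at target FDR level q.

    The threshold is the smallest t among the nonzero |W_j| such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q.
    Returns float("inf") when no candidate satisfies the bound.
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)  # proxy for false discoveries
        positives = sum(1 for x in w if x >= t)   # features that would be selected
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def knockoff_select(w: Sequence[float], q: float) -> List[int]:
    """Return the indices of features whose statistic meets the threshold."""
    tau = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= tau]


# Hypothetical statistics for ten features (illustrative values only).
w_example = [5.0, 4.0, 3.5, 3.0, 2.5, -0.5, 0.3, -0.2, 2.0, 1.5]
selected = knockoff_select(w_example, q=0.2)
print(selected)
```

The count of large negative statistics serves as an estimate of the number of false discoveries among the large positive ones, which is what allows the rule to control the false discovery rate without computing p-values.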
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.