
NSF Org: | DMS Division Of Mathematical Sciences |
Recipient: | |
Initial Amendment Date: | June 5, 2024 |
Latest Amendment Date: | June 5, 2024 |
Award Number: | 2413074 |
Award Instrument: | Continuing Grant |
Program Manager: | Yong Zeng, yzeng@nsf.gov, (703)292-7299, DMS Division Of Mathematical Sciences, MPS Directorate for Mathematical and Physical Sciences |
Start Date: | July 1, 2024 |
End Date: | June 30, 2027 (Estimated) |
Total Intended Award Amount: | $225,000.00 |
Total Awarded Amount to Date: | $74,901.00 |
Funds Obligated to Date: | |
History of Investigator: | |
Recipient Sponsored Research Office: | 4333 BROOKLYN AVE NE SEATTLE WA US 98195-1016 (206)543-4043 |
Sponsor Congressional District: | |
Primary Place of Performance: | 4333 BROOKLYN AVE NE SEATTLE WA US 98195-1016 |
Primary Place of Performance Congressional District: | |
Unique Entity Identifier (UEI): | |
Parent UEI: | |
NSF Program(s): | STATISTICS |
Primary Program Source: | 01002526DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): | |
Program Element Code(s): | |
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.049 |
ABSTRACT
Controlling the false positive error in model selection is a prominent paradigm for gathering evidence in data-driven science. In model selection problems such as variable selection and graph estimation, models are characterized by an underlying Boolean structure, such as the presence or absence of a variable or an edge. Therefore, false positive error and false negative error can be conveniently specified as the number of variables/edges that are incorrectly included in, or incorrectly excluded from, an estimated model, respectively. However, the increasing complexity of modern datasets has been accompanied by the use of sophisticated modeling paradigms in which defining false positive error is a significant challenge. For example, models specified by structures such as partitions (for clustering), permutations (for ranking), directed acyclic graphs (for causal inference), or subspaces (for principal components analysis) are not characterized by a simple Boolean logical structure, which makes it difficult to formalize and control false positive error. A new perspective is needed to provide reliable inference in modern data analysis. The methods developed in this project have the potential to impact a wide range of fields, including image analysis, geosciences, and computational genomics. The research will engage both graduate and undergraduate students and will be disseminated to a broader audience through the development of new courses.
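For the Boolean case described above, the two error counts reduce to set differences. The following is a minimal sketch in Python, assuming a model is represented as the set of variables (or edges) it includes; the function names and the example variables are hypothetical and chosen only for illustration.

```python
def count_false_positives(estimated, true):
    """Number of variables included in the estimate but absent from the truth."""
    return len(set(estimated) - set(true))


def count_false_negatives(estimated, true):
    """Number of variables present in the truth but excluded from the estimate."""
    return len(set(true) - set(estimated))


# Hypothetical example: the estimate wrongly includes "a" and wrongly omits "d".
estimated_model = {"a", "b", "c"}
true_model = {"b", "c", "d"}

fp = count_false_positives(estimated_model, true_model)  # 1
fn = count_false_negatives(estimated_model, true_model)  # 1
```

Counting of this kind has no direct analogue for a partition or a ranking, because such a model is not a collection of independently present or absent elements.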
In this project, the PI develops a generic framework to organize classes of models as partially ordered sets (posets), which leads to systematic approaches for defining natural generalizations of false positive error and methodology for controlling this error. The project aims to use the poset framework to address the following questions: What attributes of the poset structure determine the power and computational complexity of procedures that control false positive error? How can we exploit specific structures in posets to design powerful model selection methods? How do we provide false discovery rate guarantees over posets? Can we utilize the framework for learning rooted phylogenetic trees and for selecting among highly correlated variables?
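One way to see how a poset can generalize the Boolean count is to measure how far an estimate sits above the largest model it shares with the truth. The sketch below is an illustrative assumption, not the project's stated definition. It treats subsets ordered by inclusion (rank is cardinality, shared model is the intersection) and partitions ordered by refinement (rank of a partition of n items is taken to be n minus its number of blocks, shared model is the coarsest common refinement). In both cases the false positive error is computed as the rank of the estimate minus the rank of the shared model. All function names are hypothetical.

```python
from itertools import product


def boolean_false_positives(estimated, true):
    """Subset poset: rank(estimate) - rank(estimate ∩ truth)."""
    return len(estimated) - len(estimated & true)


def partition_rank(partition, n):
    """Assumed rank: n minus the number of blocks (all singletons has rank 0)."""
    return n - len(partition)


def partition_meet(p, q):
    """Coarsest common refinement: the nonempty pairwise block intersections."""
    return [a & b for a, b in product(p, q) if a & b]


def partition_false_positives(estimated, true, n):
    """Partition poset: rank(estimate) - rank(meet of estimate and truth)."""
    meet = partition_meet(estimated, true)
    return partition_rank(estimated, n) - partition_rank(meet, n)


# Hypothetical clustering of items 1..4: the estimate wrongly merges item 3
# into the block {1, 2}, which the truth keeps separate.
estimated = [{1, 2, 3}, {4}]
true = [{1, 2}, {3, 4}]

fp_partition = partition_false_positives(estimated, true, n=4)      # 1
fp_boolean = boolean_false_positives({"a", "b", "c"}, {"b", "c", "d"})  # 1
```

In the subset case this reduces to the usual count of wrongly included variables, so the Boolean setting appears as a special case of the same rank-based recipe.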
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.