
NSF Org: CCF, Division of Computing and Communication Foundations
Recipient:
Initial Amendment Date: February 2, 2021
Latest Amendment Date: September 20, 2023
Award Number: 2047933
Award Instrument: Continuing Grant
Program Manager: Peter Brass, pbrass@nsf.gov, (703) 292-0000, CCF Division of Computing and Communication Foundations, CSE Directorate for Computer and Information Science and Engineering
Start Date: February 15, 2021
End Date: March 31, 2024 (Estimated)
Total Intended Award Amount: $599,586.00
Total Awarded Amount to Date: $498,141.00
Funds Obligated to Date: FY 2023 = $194,110.00
History of Investigator:
Recipient Sponsored Research Office: 5000 Forbes Ave, Pittsburgh, PA 15213-3815, US, (412) 268-8746
Sponsor Congressional District:
Primary Place of Performance: PA 15213-3815, US
Primary Place of Performance Congressional District:
Unique Entity Identifier (UEI):
Parent UEI:
NSF Program(s): Algorithmic Foundations
Primary Program Source: 01002425DB NSF RESEARCH & RELATED ACTIVIT; 01002324DB NSF RESEARCH & RELATED ACTIVIT; 01002526DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):
Program Element Code(s):
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
The recent surge in applications of machine learning is powered by algorithms that learn hidden patterns in large volumes of data. Designing faster and more reliable data-analysis algorithms is a key challenge in broadening the scope of such applications. However, researchers have realized that the classical framework of algorithm design is inadequate for this task, because large data in almost every application is modeled using statistical models, as opposed to the worst-case model standard in algorithm design. Consequently, central challenges that involve an interplay between algorithms and statistically generated data remain largely unresolved, not just in machine learning but also in statistical physics and cryptography. This project will address this critical deficiency by building a principled theory of algorithm design for statistical (i.e., average-case) data. The new paradigms explored in this work will unify the currently fragmented set of approaches for studying average-case computation. The curriculum-development plan outlined in this project will train the next generation of scientists in algorithmic methods tailor-made for problems in large-scale statistical data analysis and will disseminate modern paradigms for understanding computation to both graduate and undergraduate students.
Average-case complexity is a central thrust in the theory of computation, with direct impact on potential technological advances in machine learning and cryptography as well as on basic questions in statistical physics. Examples include training expressive statistical models such as Gaussian mixture models and sparse PCA to find patterns in large data in machine learning, ascertaining the security of pseudorandom generators in cryptography, and finding the lowest-energy states of spin-glass systems in statistical physics. The current understanding of such problems rests on fragmented, domain-specific algorithmic schemes such as statistical query methods and the method of moments (in machine learning), belief propagation (in statistical physics), and semidefinite programming hierarchies (in computational complexity). This project is devoted to building a unified theory of average-case computation that offers new tools to design better algorithms, prove sharp lower bounds, and rigorously transfer insights between these specific frameworks. This investigation will build new bridges between theoretical computer science and several adjacent areas, including machine learning, statistical physics, algebraic geometry, and probability. In addition, it will further develop the burgeoning understanding of the sum-of-squares semidefinite programming hierarchy, mixture models, and the use of solution-space geometry in solving random constraint satisfaction problems.
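To make one of the examples above concrete, here is a minimal illustrative sketch (not the project's method) of sparse PCA as an average-case problem: samples are drawn from a spiked-covariance model with a planted sparse direction, and the support is recovered with a simple diagonal-thresholding heuristic. The parameter values and the estimator are assumptions chosen only for illustration.

import numpy as np

# Illustrative sketch of the spiked-covariance model for sparse PCA.
# Samples are x_i = sqrt(beta) * g_i * v + z_i, with g_i ~ N(0,1), z_i ~ N(0, I_d),
# and v a hidden k-sparse unit vector (the planted "pattern").

rng = np.random.default_rng(0)
d, n, k, beta = 500, 1000, 10, 4.0   # dimension, samples, sparsity, signal strength (illustrative values)

# Plant a k-sparse unit vector v.
support = rng.choice(d, size=k, replace=False)
v = np.zeros(d)
v[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)

# Draw samples from the spiked model.
g = rng.standard_normal(n)
Z = rng.standard_normal((n, d))
X = np.sqrt(beta) * np.outer(g, v) + Z

# Diagonal thresholding: coordinates on the support have inflated variance
# (about 1 + beta/k), so keep the k coordinates with the largest empirical variance.
variances = (X ** 2).mean(axis=0)
estimated_support = np.argsort(variances)[-k:]

recovered = len(set(estimated_support) & set(support))
print(f"recovered {recovered}/{k} support coordinates")

This simple heuristic succeeds only when the signal is strong or the sample size is large; characterizing when such average-case recovery problems become computationally hard is exactly the kind of question the unified theory described above targets.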
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH