
NSF Org: | CCF Division of Computing and Communication Foundations |
Initial Amendment Date: | August 30, 2016 |
Latest Amendment Date: | August 30, 2016 |
Award Number: | 1637585 |
Award Instrument: | Standard Grant |
Program Manager: | A. Funda Ergun, CCF Division of Computing and Communication Foundations, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 1, 2016 |
End Date: | August 31, 2022 (Estimated) |
Total Intended Award Amount: | $399,939.00 |
Total Awarded Amount to Date: | $399,939.00 |
Recipient Sponsored Research Office: | 633 Clark St, Evanston, IL 60208-0001, US, (312) 503-7955 |
Primary Place of Performance: | 2145 Sheridan Rd., Evanston, IL 60208-3118, US |
NSF Program(s): | Algorithms in the Field |
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Statistical models provide a powerful means of quantifying uncertainty, modeling prior beliefs, and describing complex dependencies in data. The process of using a model to answer specific questions, such as inferring the state of several random variables given evidence observed about others, is called probabilistic inference. Probabilistic graphical models, a type of statistical model, are often used in diverse applications such as medical diagnosis, understanding protein and gene regulatory networks, computer vision, and language understanding. On account of the central role played by probabilistic graphical models in a wide range of automated reasoning applications, designing efficient algorithms for probabilistic inference is a fundamental problem in artificial intelligence and machine learning.
Probabilistic inference in many of these applications corresponds to a complex combinatorial optimization problem that at first glance appears to be extremely difficult to solve. However, practitioners have made significant strides in designing heuristic algorithms to perform real-world inference accurately and efficiently. This project focuses on bridging the gap between theory and practice for probabilistic inference problems in large-scale machine learning systems. The PIs will identify structural properties and methods of analysis that differentiate real-world instances from worst-case instances used to show NP-hardness, and will design efficient algorithms with provable guarantees that would apply to most real-world instances. The project will also study why heuristics like linear programming and other convex relaxations are so successful on real-world instances. The efficient algorithms for probabilistic inference developed as part of this project have the potential to be transformative in machine learning, statistics, and more applied areas like computer vision, social networks and computational biology. To help disseminate the research and foster new collaborations, a series of workshops will be organized bringing together the theoretical computer science and machine learning communities. Additionally, undergraduate curricula will be developed that use machine learning to introduce students to concepts in theoretical computer science.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Probabilistic models provide a powerful means of quantifying uncertainty, modeling prior beliefs, and describing complex dependencies in data. These models are frequently used to make predictions in settings with limited training data such as in health care (medical diagnosis from symptoms), biology (inferring protein regulatory networks), computer vision (stereopsis), and natural language processing (parsing and text generation). Using these models requires performing probabilistic inference, which takes evidence about observed variables to draw conclusions about unobserved variables. Inference in these applications corresponds to a complex combinatorial optimization problem that at first glance appears to be extremely difficult to solve. However, practitioners have made significant strides in designing heuristic algorithms to perform real-world inference accurately and efficiently. This project focused on algorithms and theory for probabilistic inference problems in large-scale machine learning systems.
A major contribution of this project was bridging the gap between theory and practice for finding the most probable assignment of the unobserved variables (MAP inference) in an important special case of probabilistic models, Potts models. Heuristics based on iterative techniques (alpha-expansion) and linear programming relaxations have been shown empirically to return solutions that are either globally optimal or close to globally optimal. PIs David Sontag and Aravindan Vijayaraghavan, with PhD students Hunter Lang (MIT) and Aravind Reddy (Northwestern), showed through theoretical and empirical results published in a series of works (AISTATS 2018, AISTATS 2019, ICML 2021, AISTATS 2021) that variants of a natural notion called stability can explain the tractability of these instances in practice.
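For concreteness, here is a sketch of the optimization problem and the stability notion at play (the notation below is ours, a paraphrase of the standard definitions rather than a quotation from the papers). MAP inference in a Potts model on a graph $G = (V, E)$ with label set $L$ asks for the minimum-energy assignment

    \min_{x \in L^V} \; \sum_{u \in V} c_u(x_u) \;+\; \sum_{(u,v) \in E} w_{uv}\, \mathbf{1}[x_u \neq x_v],

where $c_u$ are node costs and $w_{uv} \ge 0$ are edge weights penalizing disagreement. Roughly, an instance is $\gamma$-perturbation-stable if its optimal assignment remains the unique optimum whenever each $w_{uv}$ is independently replaced by any $w'_{uv}$ with $w_{uv}/\gamma \le w'_{uv} \le w_{uv}$.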
Specifically, for the alpha-expansion algorithm, we proved (in Lang et al., ICML 2021) that it always returns a globally optimal assignment for Potts models, with a catch: the returned assignment is only guaranteed to be optimal for an instance within a small perturbation of the original problem instance. Intuitively, we expect that for robust models used in applications, small perturbations to their parameters should have similar optimal assignments, which would imply that alpha-expansion provides near-optimal assignments for the original models. Moreover, for a linear programming relaxation, we proved that it is close to integral when the instance is close to stable or satisfies a condition called block stability (AISTATS 2019, AISTATS 2021). We then designed algorithms that can certify whether models satisfy these conditions. Using our new methods for certifying stability, we were able to demonstrate that several computer vision models indeed satisfy these stability conditions. Taken together, these results give a cohesive explanation for the good performance of the above heuristics in practice.
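As a toy illustration (a minimal sketch under the definitions above, not the certification algorithms from the papers; all names and numbers are ours), the following Python snippet brute-forces the MAP assignment of a tiny Potts model and randomly samples edge-weight perturbations to probe whether that assignment is stable:

    import itertools
    import random

    # Toy Potts model: nodes take labels in {0, ..., k-1}; the energy is a sum
    # of node costs plus a penalty w on every edge whose endpoints disagree.
    unary = {0: [0.0, 1.0], 1: [0.5, 0.2], 2: [1.0, 0.0]}   # 3 nodes, 2 labels
    pairwise = {(0, 1): 0.6, (1, 2): 0.6}                   # chain 0 - 1 - 2

    def energy(labels, unary, pairwise):
        node_cost = sum(costs[labels[u]] for u, costs in unary.items())
        edge_cost = sum(w for (u, v), w in pairwise.items()
                        if labels[u] != labels[v])
        return node_cost + edge_cost

    def map_assignment(unary, pairwise, k=2):
        # Brute-force minimum-energy assignment; fine only for tiny instances.
        nodes = sorted(unary)
        best = min(itertools.product(range(k), repeat=len(nodes)),
                   key=lambda ls: energy(dict(zip(nodes, ls)), unary, pairwise))
        return dict(zip(nodes, best))

    def survives_perturbations(unary, pairwise, gamma=2.0, trials=200, k=2):
        # Sample perturbations that independently scale each edge weight down
        # by a factor in [1/gamma, 1] and check whether the MAP assignment
        # changes. Sampling can only refute stability, never certify it.
        base = map_assignment(unary, pairwise, k)
        for _ in range(trials):
            perturbed = {e: w * random.uniform(1.0 / gamma, 1.0)
                         for e, w in pairwise.items()}
            if map_assignment(unary, perturbed, k) != base:
                return False
        return True

    print(map_assignment(unary, pairwise))   # {0: 0, 1: 1, 2: 1} for this toy
    print(survives_perturbations(unary, pairwise))

Note that random sampling can only refute stability; actually certifying it requires reasoning about all perturbations simultaneously, which is what the certification methods described above are for.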
Our results also suggested a monotonicity principle for algorithm design with the linear programming relaxation: we showed that the relaxation performs well when the input consists of a simpler, tractable part of the model combined with remaining factors that are random but slightly biased toward the ground truth.
The project also led to theory-inspired improvements for emerging paradigms in machine learning like weak supervision, where one seeks to learn from imperfect labels provided by domain experts (e.g., that any review containing the word "great" should be labeled as positive sentiment). Under the assumption of conditional independence between the weak labels and the features, in joint work of the PIs with MIT PhD student Hunter Lang (published in NeurIPS 2022), we proved that there is a trade-off between the number of data points "covered" by the weak labels and their noise rates. We then showed that the popular "cut statistic" heuristic can rank data points by a measure of confidence in their weak labels, and trained the downstream classifier on only the most confident of these. Surprisingly, we found that this technique improved the performance of every weak supervision method previously published, often by large amounts.
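The following sketch shows the shape of this pipeline (a simplified nearest-neighbor variant of cut-statistic-style scoring; the function names, the kNN score, and the logistic-regression downstream model are our illustrative choices, not the paper's exact recipe):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import NearestNeighbors

    def confidence_scores(X, weak_labels, n_neighbors=10):
        # Score each point by the fraction of its nearest neighbors whose
        # weak label disagrees with its own ("cut" edges in the kNN graph).
        # A low score means the point sits in a smoothly-labeled region, so
        # its weak label is more trustworthy.
        nn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
        _, idx = nn.kneighbors(X)              # column 0 is the point itself
        neighbor_labels = weak_labels[idx[:, 1:]]
        return (neighbor_labels != weak_labels[:, None]).mean(axis=1)

    def train_on_confident_subset(X, weak_labels, keep_frac=0.5):
        # Keep the keep_frac lowest-scoring (most confident) points and fit
        # the downstream classifier on them alone.
        scores = confidence_scores(X, weak_labels)
        keep = np.argsort(scores)[: int(keep_frac * len(X))]
        return LogisticRegression(max_iter=1000).fit(X[keep], weak_labels[keep])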
Other research highlights from this project include new theory and algorithms for clustering Euclidean instances that are stable (Dutta et al., NeurIPS '17), new algorithms with provable guarantees for learning depth-2 neural networks (Awasthi et al., NeurIPS '21), improved algorithms using ideas from co-training for prompt-based learning with large language models and unlabeled data (Lang et al., ICML '22), as well as new algorithms and provable guarantees for human-AI interaction (Mozannar et al., AAAI '22) and self-supervised learning tasks that arise in healthcare (Agrawal et al., AISTATS '22).
The PIs also organized activities and workshops with the goal of reaching the broader scientific community interested in topics related to the project. PI Vijayaraghavan co-organized (along with Aleksander Madry and Daniel Hsu) the FOCS 2021 workshop on New Directions in Machine Learning. PI Vijayaraghavan also co-organized several day-long theory workshops at Northwestern University that drew broad attendance from theory faculty, postdocs, and students at many Midwestern universities in and near Chicago. Two of these workshops were closely related to the topic of this project. These workshops were open to the public, and videos are publicly accessible on the website.
Last Modified: 01/20/2023
Modified by: Aravindan Vijayaraghavan
Please report errors in award information by writing to: awardsearch@nsf.gov.