Award Abstract # 1453261
CAREER: Algorithmic Aspects of Machine Learning

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Initial Amendment Date: January 30, 2015
Latest Amendment Date: August 13, 2019
Award Number: 1453261
Award Instrument: Continuing Grant
Program Manager: A. Funda Ergun
CCF, Division of Computing and Communication Foundations
CSE, Directorate for Computer and Information Science and Engineering
Start Date: July 1, 2015
End Date: June 30, 2021 (Estimated)
Total Intended Award Amount: $500,000.00
Total Awarded Amount to Date: $500,000.00
Funds Obligated to Date: FY 2015 = $196,141.00
FY 2017 = $98,920.00
FY 2018 = $101,259.00
FY 2019 = $103,680.00
History of Investigator:
  • Ankur Moitra (Principal Investigator)
    moitra@MIT.EDU
Recipient Sponsored Research Office: Massachusetts Institute of Technology
77 MASSACHUSETTS AVE
CAMBRIDGE
MA  US  02139-4301
(617)253-1000
Sponsor Congressional District: 07
Primary Place of Performance: Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge
MA  US  02139-4301
Primary Place of Performance Congressional District: 07
Unique Entity Identifier (UEI): E2NYLCDML6V1
Parent UEI: E2NYLCDML6V1
NSF Program(s): Algorithmic Foundations
Primary Program Source: 01001516DB NSF RESEARCH & RELATED ACTIVIT
01001718DB NSF RESEARCH & RELATED ACTIVIT
01001819DB NSF RESEARCH & RELATED ACTIVIT
01001920DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 7926
Program Element Code(s): 779600
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Algorithms and complexity are the theoretical foundation and backbone of machine learning. Yet over the past few decades an uncomfortable truth has set in: worst-case analysis is not the right framework for studying machine learning, because every model that is interesting enough to use in practice leads to computationally hard problems. The goal of the PI's research agenda is to move beyond worst-case analysis. This involves formalizing when and why heuristics -- such as alternating minimization and Gibbs sampling -- work, as well as designing fundamentally new algorithms for some of the basic tasks in machine learning. This agenda has already had a number of successes, such as provable algorithms for nonnegative matrix factorization, topic modeling and learning mixture models.

The PI will investigate several new directions in topics such as sparse coding, inference in graphical models, inverse problems for tensors, and semi-random models. These projects will leverage a wide range of modern tools to give new provable algorithms in each of these settings. They will involve making new connections between alternating minimization and approximate gradient descent, analyzing Gibbs sampling through correlation decay and coupling, connecting tensor completion and quantum complexity, and rethinking the standard distributional models used in machine learning. These projects cut across several areas of computer science and applied mathematics and will build new bridges between them, as well as expand the reach of theory into a number of domains where there is a serious gap in our current understanding. Bridging theory and practice has significant broader impact. The PI will continue to mentor undergraduate and graduate students.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


(Showing: 1 - 10 of 31)
Allen Liu and Ankur Moitra "Settling the Robust Learnability of Mixtures of Gaussians" Symposium on Theory of Computing, 2021
Allen Liu, Ankur Moitra "Better Algorithms for Estimating Non-Parametric Models in Crowd-Sourcing and Rank Aggregation" Conference on Learning Theory, 2020
Allen Liu, Ankur Moitra "Tensor Completion Made Practical" Neural Information Processing Systems, 2020
Ankur Moitra "An Almost Optimal Algorithm for Computing Nonnegative Rank" SIAM Journal on Computing, v.45, 2016
Ankur Moitra "Approximate Counting, the Lovász Local Lemma and Inference in Graphical Models" Symposium on Theory of Computing (STOC), 2017
Ankur Moitra "Nonnegative Matrix Factorization: Algorithms, Complexity and Applications" International Symposium on Symbolic and Algebraic Computation (ISSAC), 2015
Ankur Moitra, Andrej Risteski "Fast Convergence for Langevin Diffusion with Matrix Manifold Structure" Neural Information Processing Systems, 2020
Ankur Moitra, Elchanan Mossel and Colin Sandon "Learning to Sample from Censored Markov Random Fields" Conference on Learning Theory, 2021
Ankur Moitra, Elchanan Mossel, Colin Sandon "Parallels Between Phase Transitions and Circuit Complexity?" Conference on Learning Theory, 2020
Boaz Barak, Ankur Moitra, Ryan O'Donnell, Prasad Raghavendra, Oded Regev, David Steurer, Luca Trevisan, Aravindan Vijayaraghavan, David Witmer, John Wright "Beating the Random Assignment on Constraint Satisfaction Problems of Bounded Degree" Workshop on Approximation, Randomization, and Combinatorial Optimization (APPROX), 2015
Dan Alistarh, Zeyuan Allen-Zhu, Jerry Li "Byzantine Stochastic Gradient Descent" Advances in Neural Information Processing Systems, 2018

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The goal of this project was to design new algorithms for some fundamental problems in machine learning (described below). Additionally, it contributed new perspectives in the form of (1) a new framework for analyzing non-convex optimization, (2) the use of semidefinite programming hierarchies to explore the statistical limits of computationally efficient algorithms for representation learning, (3) a toolbox for making algorithms based on the method of moments robust to adversarial corruptions, and (4) new approaches to approximate counting and inference that are not based on Markov chains.

Dictionary Learning: Many types of data turn out to be sparse in an appropriately chosen basis (e.g., representing signals in a wavelet basis). Modern approaches such as dictionary learning attempt to discover such a basis automatically from data by solving a challenging non-convex optimization problem. Despite its widespread use, dictionary learning has had few provable guarantees. The PI and collaborators gave simple new algorithms for dictionary learning that work in the practically important overcomplete setting. Their analysis was based on a new general framework for analyzing alternating minimization by viewing it as noisy gradient descent on an unknown convex function.
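
A minimal NumPy sketch of this style of alternating minimization (illustrative only, not the paper's exact procedure): it alternates a hard-thresholding decoding step with a gradient-style update of the dictionary, which is exactly the "noisy gradient descent" view described above. The function name and all parameter choices here are hypothetical.

```python
import numpy as np

def dictionary_learning(Y, m, k, iters=50, lr=0.1, seed=0):
    """Toy alternating minimization for Y ~ A @ X with k-sparse columns of X.

    A sketch under simplifying assumptions, not the paper's algorithm:
    decode by hard-thresholding A^T y, then take one gradient step on
    the squared reconstruction error ||Y - A X||_F^2.
    """
    rng = np.random.default_rng(seed)
    d, n = Y.shape
    A = rng.standard_normal((d, m))
    A /= np.linalg.norm(A, axis=0)                 # unit-norm dictionary columns
    for _ in range(iters):
        # Decode: keep the k largest correlations per sample (hard thresholding).
        C = A.T @ Y                                # m x n correlations
        X = np.zeros_like(C)
        idx = np.argsort(-np.abs(C), axis=0)[:k]
        np.put_along_axis(X, idx, np.take_along_axis(C, idx, axis=0), axis=0)
        # Update: one gradient step on ||Y - A X||_F^2, then re-normalize.
        A += lr * (Y - A @ X) @ X.T / n
        A /= np.linalg.norm(A, axis=0)
    return A
```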

Tensor Completion: A popular model for building recommendation systems is to view the data as coming from a low-rank matrix or tensor. While the problem of completing an unknown low-rank matrix from few observations has been intensively studied and tight bounds are known, much less is known about completing low-rank tensors. The PI and collaborators gave the first algorithms that asymptotically improve upon the naive algorithm, which completes the low-rank tensor as a collection of unrelated low-rank matrices. Furthermore, the PI and collaborators showed computational vs. statistical tradeoffs based on existing conjectures about the difficulty of refuting random constraint satisfaction problems.
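
For context, the per-slice building block of that naive baseline is ordinary low-rank matrix completion. Below is a generic textbook sketch of alternating least squares for matrix completion (the function name `als_complete` and all parameters are hypothetical, not code from the project); the naive tensor baseline would run it independently on each slice of the tensor, whereas the project's algorithms improve on this by sharing information across slices.

```python
import numpy as np

def als_complete(M, mask, r, iters=30, reg=1e-2, seed=0):
    """Alternating least squares for low-rank matrix completion.

    M: matrix whose entries outside the boolean array `mask` are ignored;
    r: target rank. A standard baseline, not the project's tensor algorithm.
    """
    rng = np.random.default_rng(seed)
    n1, n2 = M.shape
    U = rng.standard_normal((n1, r))
    V = rng.standard_normal((n2, r))
    for _ in range(iters):
        # Fix V and solve a small ridge regression for each row of U...
        for i in range(n1):
            obs = mask[i]                      # observed columns in row i
            Vi = V[obs]
            U[i] = np.linalg.solve(Vi.T @ Vi + reg * np.eye(r), Vi.T @ M[i, obs])
        # ...then fix U and solve for each row of V symmetrically.
        for j in range(n2):
            obs = mask[:, j]
            Uj = U[obs]
            V[j] = np.linalg.solve(Uj.T @ Uj + reg * np.eye(r), Uj.T @ M[obs, j])
    return U @ V.T
```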

Robustness: In 1960, Tukey asked: are there efficient estimators for learning the parameters of a one-dimensional Gaussian in the presence of noise? His paper launched the field of robust statistics, and there are many simple, provably robust estimators in one dimension. However, they all lead to hard optimization problems in high dimensions. In practice, this means that robust statistical methods are limited to dimension at most six. The PI and collaborators gave the first efficient algorithm for robustly learning a high-dimensional Gaussian. Later, the PI and collaborators extended this to mixtures of Gaussians. The area of algorithmic robust statistics has subsequently seen a flurry of activity, with many exciting results.
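
As a caricature of the kind of idea this area uses, here is a toy "filtering" sketch for robustly estimating the mean of a Gaussian with identity covariance: corruptions that shift the empirical mean must inflate the variance in some direction, and the top eigenvector of the empirical covariance exposes that direction. This illustrates one idea from the literature rather than the algorithm in the PI's papers, and the threshold is a made-up constant.

```python
import numpy as np

def filtered_mean(X, threshold=2.5, max_rounds=50):
    """Toy filter for robust mean estimation (rows of X are samples).

    An illustrative sketch, not the paper's algorithm: while some direction
    has suspiciously large variance, drop the most extreme points along it.
    """
    X = X.copy()
    for _ in range(max_rounds):
        if len(X) < 2:
            break
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)
        if eigvals[-1] <= threshold:           # no direction looks corrupted
            return mu
        v = eigvecs[:, -1]                     # top eigenvector of covariance
        scores = np.abs((X - mu) @ v)          # projections onto that direction
        X = X[scores < scores.max()]           # drop the most extreme point(s)
    return X.mean(axis=0)
```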

Sampling: A graphical model is a natural framework for modeling statistical dependencies in high-dimensional distributions. Most existing algorithms for sampling and inference rely on the notion of correlation decay, which stipulates that there are no long-range dependencies. In many cases, correlation decay demarcates where the problem goes from computationally easy to hard. The PI considered the problem of uniformly sampling from the set of satisfying assignments of a bounded-degree instance of k-SAT, and gave the first algorithms that work when the degree is exponentially large in k. This yields a powerful new methodology for designing algorithms that work even when the solution space is disconnected (the freezing regime), where no local Markov chain can work. Furthermore, the PI gave applications to inference in a natural class of graphical models.
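
For contrast, here is the classical baseline these results move beyond: a local Markov chain (Gibbs) sampler, shown for an Ising model on a graph. Correlation-decay analyses certify samplers of this form; the point of the project's k-SAT result is to sample even in regimes, such as the freezing regime, where no such local chain can mix. The setup below is a standard textbook example, not code from the project.

```python
import numpy as np

def ising_gibbs(adj, beta, steps, seed=0):
    """Gibbs sampler for an Ising model on a graph.

    adj: symmetric 0/1 adjacency matrix (zero diagonal);
    beta: inverse temperature. Each step resamples one spin from its
    conditional distribution given its neighbors.
    """
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    spins = rng.choice([-1, 1], size=n)
    for _ in range(steps):
        i = rng.integers(n)
        field = adj[i] @ spins                 # sum of neighboring spins
        # P(spin_i = +1 | neighbors) = 1 / (1 + exp(-2 * beta * field))
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * field))
        spins[i] = 1 if rng.random() < p_up else -1
    return spins
```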

The project also supported the training of several graduate students and postdocs. Furthermore, the PI supervised over fifty summer research projects in mathematics at the high school and undergraduate level, including many for underrepresented minorities. The PI also wrote a book, Algorithmic Aspects of Machine Learning, which has been used as a primary text in courses at other universities and was recently published by Cambridge University Press.

Last Modified: 11/27/2021
Modified by: Ankur Moitra
