NSF Award Search: Award # 1553284

Award Abstract # 1553284

CAREER: Scalable learning with combinatorial structure

NSF Org:	IIS Division of Information & Intelligent Systems
Recipient:	MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Initial Amendment Date:	January 15, 2016
Latest Amendment Date:	June 27, 2022
Award Number:	1553284
Award Instrument:	Continuing Grant
Program Manager:	Rebecca Hwa IIS Division of Information & Intelligent Systems CSE Directorate for Computer and Information Science and Engineering
Start Date:	January 15, 2016
End Date:	December 31, 2022 (Estimated)
Total Intended Award Amount:	$493,059.00
Total Awarded Amount to Date:	$493,059.00
Funds Obligated to Date:	FY 2016 = $92,361.00 FY 2017 = $95,371.00 FY 2018 = $98,494.00 FY 2019 = $101,735.00 FY 2020 = $105,098.00
History of Investigator:	Stefanie Jegelka (Principal Investigator) stefje@mit.edu
Recipient Sponsored Research Office:	Massachusetts Institute of Technology 77 MASSACHUSETTS AVE CAMBRIDGE MA US 02139-4301 (617)253-1000
Sponsor Congressional District:	07
Primary Place of Performance:	Massachusetts Institute of Technology 77 Massachusetts Avenue Cambridge MA US 02139-4301
Primary Place of Performance Congressional District:	07
Unique Entity Identifier (UEI):	E2NYLCDML6V1
Parent UEI:	E2NYLCDML6V1
NSF Program(s):	Robust Intelligence
Primary Program Source:	01001617DB NSF RESEARCH & RELATED ACTIVIT 01001718DB NSF RESEARCH & RELATED ACTIVIT 01001819DB NSF RESEARCH & RELATED ACTIVIT 01001920DB NSF RESEARCH & RELATED ACTIVIT 01002021DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):	1045, 7495
Program Element Code(s):	749500
Award Agency Code:	4900
Fund Agency Code:	4900
Assistance Listing Number(s):	47.070

ABSTRACT

Advances in science and technology increasingly rely on inference and prediction from data such as videos, molecules, networks, or sets of purchased goods. Such data consists of several elements that participate in a collective structure. Effective inference and prediction in turn rely on (i) concise and accurate representations of latent interdependence structure in data; and (ii) fast learning and optimization algorithms that can process modern large data sets. This CAREER project addresses these challenges by building new algorithmic foundations that open a wider set of mathematical tools for practical data analysis.

In particular, this project explores and exploits key structural properties and representations. For example, a wide spectrum of important dependence structures (and consequently numerous learning tasks) are well captured by the ubiquitous combinatorial concept of submodular functions on sets, characterized by the property of diminishing returns. Building on this insight and other new tools, this research develops a suite of scalable optimization procedures with theoretical guarantees, as well as new tools for probabilistic modeling and fast inference. The resulting combinatorial learning methods are deployed in novel applications addressing the development of new materials, reducing environmental impact, in video analytics, and healthcare. Thereby, the proposed methods foster progress and deliver insights beyond computer science. Parts of this project are integrated into a new advanced graduate class and a new hands-on undergraduate class on data analytics that combines statistical modeling with computation, forming a core part of a new educational program. Undergraduate students are involved in the application part of the research, and selected results will serve to motivate high school students to pursue STEM careers. For the research community, the project includes interdisciplinary workshops and tutorials, and further the confluence of discrete optimization and machine learning. Educational materials, data and code are made publicly available.

The research questions of this project include three main threads:

1) Developing a set of new, scalable optimization techniques with theoretical guarantees for combinatorial learning problems. The algorithms will combine combinatorial and continuous optimization, exploit suitable mathematical properties, relaxations, and compact representations, and will implement new ways to leverage data-dependent properties that distinguish practical cases from the worst case.

2) Extending insights from optimization to probabilistic modeling and inference. This transfer will enable new models and new, fast computational procedures for sampling and probabilistic inference that exploit similar properties as the optimization algorithms.

3) Real-world applications of the new models and algorithms. Via interdisciplinary collaborations, the third thread explores new applications of combinatorial learning methods to video analytics, instruction, healthcare, and materials design.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

(Showing: 1 - 10 of 49)

Show All

A. Gkotovos, H. Hassani, A. Krause, S. Jegelka "Discrete Sampling using Semigradient-based Product Mixtures" Conference on Uncertainty in Artificial Intelligence (UAI) , 2018

A. Gotovos, R. Burkholz, J. Quackenbush, S. Jegelka. "Scaling up Continuous-Time Markov Chains Helps Resolve Underspecification" Neural Information Processing Systems (NeurIPS) , 2021

A. Liao, H. Zhao, K. Xu, T. Jaakkola, G. Gordon, S. Jegelka, R. Salakhutdinov "Information Obfuscation of Graph Neural Networks" International Conference on Machine Learning (ICML) , 2021

A. Loukas, M. Poiitis, S. Jegelka "What training reveals about neural network complexity" Neural Information Processing Systems (NeurIPS) , 2021

B. Mirzasoleiman, S. Jegelka, A. Krause "Streaming Non-monotone Submodular Maximization: Personalized Video Summarization on the Fly" AAAI Conference on Artificial Intelligence (AAAI) , 2018

C. Bunne, D. Alvarez Melis, A. Krause, S. Jegelka "Learning Generative Models across Incomparable Spaces" International Conference on Machine Learning (ICML) , 2019

Chengtao Li and Suvrit Sra and Stefanie Jegelka "Gaussian quadrature for matrix inverse forms with applications" International Conference on Machine Learning (ICML) , 2016

Chengtao Li, Stefanie Jegelka, Suvrit Sra "Fast DPP Sampling for Nystrom with Application to Kernel Methods" International Conference on Machine Learning (ICML) , 2016

Chengtao Li, Stefanie Jegelka, Suvrit Sra "Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling" Advances in Neural Information Processing Systems (NIPS) , 2016

Chengtao Li, Stefanie Jegelka, Suvrit Sra "Fast sampling for k-Determinantal Point Processes" Int. Conference on Artificial Intelligence and Statistics (AISTATS) , 2016

Chuang, C and Jegelka, S "Tree Mover's Distance: Bridging Graph Metrics and Stability of Graph Neural Networks" Neural Information Processing Systems (NeurIPS) , 2022 Citation Details

(Showing: 1 - 10 of 49)

Show All

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Intellectual Merit:

Advances in science and technology increasingly rely on inference and predictions from structured, discrete data such as videos, molecules, networks or sets of purchased goods. Effective inference and prediction from such data relies on concisely and accurately representing latent interdependence structure and fast learning and optimization algorithms that reliably work on modern massive data sets. This project has developed algorithmic foundations that address these challenges and open a wider set of mathematical tools for practical data analysis. In particular, the project aimed to exploit combinatorial, geometric and algebraic structure for discrete machine learning problems. It uses, combines and develops concepts from discrete optimization (the property of submodularity), negative dependence in discrete probability, optimal transport, Bayesian inference and deep learning for developing new machine learning methods and algorithms, and understanding the reliability of existing algorithms.

More concretely, under this project we obtained new results on the following topics:

(1) Submodular optimization. Submodularity is a beneficial mathematical structure. We demonstrated new ways to exploit submodularity in machine learing, for instance, to allow inferring correspondences that respect broader data geometry. We also derived new algorithms for optimizing (approximately) submodular functions, and for robust and stochastic submodular optimization with theoretical guarantees.

(2) Discrete probability. We showed new sampling and approximate inference algorithms with guarantees. In particular, stricter versions of log-submodularity include concepts of strong negative dependence of discrete random variables, which lead to beneficial sampling properties. We extend these concepts and show how to exploit them for outlier detection, sampling diverse subsets, matrix approximation, experiment design and Bayesian black-box optimization.

(3) Bayesian optimization. We developed new, particularly scalable algorithms for computing uncertainty and selecting informative subsets, i.e., for optimizing black box functions. Our methods have inspired many follow-up works.

(4) Graph representation learning. We established theoretical foundations for deep learning with graph inputs, i.e., Graph Neural Networks. We described their learning and generalization properties, within-distribution and under distribution shifts. These works have already had substantial impact in machine learning and beyond.

(5) Deep learning for combinatorial optimization. Inspired by techniques from submodular optimization and graph algorithms, we analyze the capacity of deep learning models to learn combinatorial algorithms, and we derive a framework for neural-network-friendly loss functions for discrete optimization.

These results have impact in many application areas; as part of this project, we especially collaborated on problems in materials design, robotics, sensing, computer vision and cancer genetics.

As another achievement, the project resulted in a Master's thesis and a PhD thesis that received prestigious thesis awards at ETH Zurich and MIT.

Broader Impacts:

The results of this project advance fundamental machine learning techniques, which have been used in areas as diverse as materials and drug design, robotics, traffic routing, exploiting machine learning for optimization, natural language processing, and understanding cancer, among others. Subset selection, finding correspondences, and making predictions on graph data are key problems in many applications, and all of these have the potential to benefit from these results.

To build community, as part of this project, we organized 3 workshops on non-convex optimization in machine learning and on graph representation learning. In 3 tutorials given at NeurIPS and the Simons Institute, and in 3 summer school lectures, we educated the community about new ideas of exploiting mathematical structure in machine learning, including submodularity, concepts of negative dependence, and analyses of graph representation learning. In numerous talks, the results of this project were not only presented to the core machine learning community, but also to audiences in many different fields, including Operations Research, Mathematics, Machine Learning for Science, Signal Processing.

To broaden participation in machine learning and data science and to support junior researchers, the PI served as a co-organizer or faculty advisor for the annual "Women in Data Science" workshop and for the Learning Theory Alliance, and actively participated in events as a panelist, mentor or speaker. She also supervised students, postdocs and a high school student from underrepresented minorities.

Part of this project was also the development of three new classes, including one class that became the capstone course for the undergraduate Statistics and Data Science minor at MIT.

Last Modified: 05/22/2023
Modified by: Stefanie Jegelka

Please report errors in award information by writing to: awardsearch@nsf.gov.

Success

Error