NSF Award Search: Award # 1657613

Award Abstract # 1657613

CRII: RI: Inference for Probabilistic Programs: A Symbolic Approach

NSF Org:	IIS Division of Information & Intelligent Systems
Recipient:	UNIVERSITY OF CALIFORNIA, LOS ANGELES
Initial Amendment Date:	February 21, 2017
Latest Amendment Date:	February 21, 2017
Award Number:	1657613
Award Instrument:	Standard Grant
Program Manager:	Rebecca Hwa IIS Division of Information & Intelligent Systems CSE Directorate for Computer and Information Science and Engineering
Start Date:	March 1, 2017
End Date:	February 28, 2019 (Estimated)
Total Intended Award Amount:	$174,639.00
Total Awarded Amount to Date:	$174,639.00
Funds Obligated to Date:	FY 2017 = $174,639.00
History of Investigator:	Guy Van den Broeck (Principal Investigator) guyvdb@cs.ucla.edu
Recipient Sponsored Research Office:	University of California-Los Angeles 10889 WILSHIRE BLVD STE 700 LOS ANGELES CA US 90024-4200 (310)794-0102
Sponsor Congressional District:	36
Primary Place of Performance:	UCLA Computer Science Dept. 420 Westwood Plaza, BH 4531E Los Angeles CA US 90095-1596
Primary Place of Performance Congressional District:	36
Unique Entity Identifier (UEI):	RN64EPNH8JC6
Parent UEI:
NSF Program(s):	CRII CISE Research Initiation
Primary Program Source:	01001718DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):	7495, 8228
Program Element Code(s):	026Y00
Award Agency Code:	4900
Fund Agency Code:	4900
Assistance Listing Number(s):	47.070

ABSTRACT

Probabilistic machine learning and artificial intelligence have revolutionized the world and are present in most aspects of our life. However, the tools used to develop probabilistic machine learning solutions are limited in what they can express. Moreover, they require significant expert knowledge, and are not accessible to scientists in each discipline, let alone everybody else. Probabilistic programming aims to make probabilistic machine learning accessible to all, and as easy to program as a phone application. To make this dream a reality, probabilistic program execution, making probabilistic predictions from observations, has to become as highly efficient and robust as our current non-probabilistic software tools. This project develops general-purpose algorithms to execute probabilistic programs efficiently, using advanced symbolic reasoning techniques from artificial intelligence. Moreover, it does so for probabilistic programs that are significantly more complex than the ones in use today, involving a wide range of programming language features that are both discrete and continuous. This increase in scalability and expressive power will foster novel, increasingly advanced machine learning applications.

More specifically, probabilistic programs subsume classical probabilistic graphical models and are additionally able to capture complex probabilistic dependencies that include arbitrary pieces of executable code. While many expressive probabilistic programming languages have been proposed in recent years, the current bottleneck and barrier to success is the lack of general-purpose reasoning algorithms to perform inference with probabilistic programs efficiently. This research tackles two key problems in probabilistic program inference. First, current sampling-based algorithms have problems reasoning about dependencies between large numbers of discrete random variables and explaining low-probability observations. In one thrust, this project develops new inference algorithms based on knowledge compilation. This technique compiles the program into a symbolic structure that is efficient for probability computation. The algorithm does not compile the entire program, which is generally intractable, but uses importance sampling on partially compiled programs to sample efficient subprograms. This combines the best of approximate program evaluation by sampling with highly efficient compilation techniques for exact inference. Second, symbolic approaches to inference are fundamentally discrete and have problems dealing with continuous and integer variables, which frequently appear in real code. Conversely, algorithms for continuous distributions cannot efficiently handle discrete program structure. In another thrust, this project studies symbolic approaches to probabilistic reasoning in programs with both types of structure, using recent breakthroughs based on satisfiability modulo theories and hashing-based sampling. This project provides a scientific leap at a fundamental level. It also provides a context for training undergraduate and graduate students in subjects spanning machine learning, artificial intelligence, statistics, and programming languages, and targets the integration of probabilistic programming into computer science curricula.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Choi, Y. and Van den Broeck, G. "On Robust Trimming of Bayesian Network Classifiers" Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI) , 2018 Citation Details

Choi, YooJung and Darwiche, Adnan and Van den Broeck, Guy "Optimal Feature Selection for Decision Robustness in Bayesian Networks" Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI) , 2017 10.24963/ijcai.2017/215 Citation Details

Friedman, T. and Van den Broeck, G. "Approximate Knowledge Compilation by Online Collapsed Importance Sampling" Advances in Neural Information Processing Systems 31 (NeurIPS) , 2018 Citation Details

Holtzen, S. and Van den Broeck, G. and Millstein, T. "Sound Abstraction and Decomposition of Probabilistic Programs" Proceedings of the 35th International Conference on Machine Learning (ICML) , 2018 Citation Details

Holtzen, Steven and Millstein, Todd and Van den Broeck, Guy "Probabilistic Program Abstractions" Proceedings of the 33rd Conference on Uncertainty in Artificial Intelligence (UAI) , 2017 Citation Details

Kabir, Shahroze and Sala, Frederic and Van den Broeck, Guy and Dolecek, Lara "Coded machine learning: Joint informed replication and learning for linear regression" 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton) , 2017 10.1109/ALLERTON.2017.8262880 Citation Details

Khosravi, P. and Liang, Y. and Choi, Y. and Van den Broeck, G. "What to Expect of Classifiers? Reasoning about Logistic Regression with Missing Features" Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI) , 2019 Citation Details

Latour, Anna and Babaki, Behrouz and Dries, Anton and Kimmig, Angelika and Van den Broeck, Guy and Nijssen, Siegfried "Combining Stochastic Constraint Optimization and Probabilistic Programming: From Knowledge Compilation to Constraint Solving" Proceedings of the 23rd International Conference on Principles and Practice of Constraint Programming (CP) , 2017 10.1007/978-3-319-66158-2_32 Citation Details

Liang, Y. and Van den Broeck, G. "Learning Logistic Circuits" Proceedings of the 33rd Conference on Artificial Intelligence (AAAI) , 2019 Citation Details

Liang, Yitao and Bekker, Jessa and Van den Broeck, Guy "Learning the Structure of Probabilistic Sentential Decision Diagrams" Proceedings of the 33rd Conference on Uncertainty in Artificial Intelligence (UAI) , 2017 Citation Details

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Probabilistic machine learning and artificial intelligence have revolutionized the world and are present in most aspects of our life. Yet they are limited in what they can express, and are not accessible to domain experts, let alone everybody else. Probabilistic programming aims to make probabilistic machine learning accessible to all, and as easy to program as a phone application. To make this dream a reality, probabilistic program inference has to become as highly efficient and robust as our current non-probabilistic software compilers.

This project has produced several novel algorithms for probabilistic programming. We have focussed our activities on solving probabilistic reasoning tasks that are strictly harder than the typical ones. We sought to do symbolic probabilistic inference for making decisions on top of a complex probability distribution. For example, our algorithm selects medical tests to run in order so make the same diagnosis as if the results of all possible tests were observed. We have also included constraints and rewards in these decision-making objectives, and support for computing maximum expected utilities.

This project has produced new techniques to reason about probabilistic programs, specifically through the use of abstractions. In this framework, the concrete program is captured by a small number of properties, called predicates. Using these predicates, we construct an abstract probabilistic program over only these predicates, yielding a simpler probabilistic program to reason about. We prove theoretical relationships between such probabilistic predicate abstractions and concrete probabilistic programs, and illustrate how they are useful for speeding up the execution of such programs. In addition, this project has produced novel approximation algorithms that combine exact symbolic reasoning with approximate reasoning about uncertainty.

Another major outcome has been to expand the applications of discrete probabilistic inference by looking a machine learning problems. The project has shown how probabilistic world models, such as probabilistic programs, can be used to improve the performance of pure function-based classifiers that are trained for a specific prediction task. Probabilistic models capture all dependencies between a large number of features of the world, regardless of the final prediction task. We have developed algorithms that show how that large joint distribution can be useful when our goal is a specific task, such as classifying images from the pixels. Specifically, our novel algorithms were shown to be more data efficient, and able to handle missing data at prediction time in ways that outperforms the state of the art.

Development of the algorithms and systems described above have provided hands-on training for many of the PhD students involved, as well as several undergraduate and Master students who have built on this code for their individual research projects. The project was also crucial to the curriculum development at UCLA, where several graduate courses were introduced on the topic of this project. The research results have been disseminated as full research papers at top-tier conferences, and as open-source software.

Last Modified: 07/16/2019
Modified by: Guy Van Den Broeck

Please report errors in award information by writing to: awardsearch@nsf.gov.

Success

Error