
NSF Org: | DMS Division Of Mathematical Sciences |
Recipient: |
|
Initial Amendment Date: | September 12, 2016 |
Latest Amendment Date: | September 12, 2016 |
Award Number: | 1622444 |
Award Instrument: | Standard Grant |
Program Manager: | Christopher Stark, DMS Division Of Mathematical Sciences, MPS Directorate for Mathematical and Physical Sciences |
Start Date: | September 15, 2016 |
End Date: | August 31, 2021 (Estimated) |
Total Intended Award Amount: | $199,920.00 |
Total Awarded Amount to Date: | $199,920.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: | 1608 4TH ST STE 201 BERKELEY CA US 94710-1749 (510)643-3891 |
Sponsor Congressional District: |
|
Primary Place of Performance: | ESPM, 130 Mulford Hall Berkeley CA US 94720-3114 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | CDS&E-MSS, CDS&E |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.049 |
ABSTRACT
Hierarchical statistical models allow analysis of patterns in complex data while accounting for relationships such as temporal or spatial patterns or shared sampling units. Statistical researchers have developed a great variety of analysis algorithms for hierarchical models, but many of these algorithms are unavailable to practitioners such as social scientists and biologists. The NIMBLE software platform was developed to bridge this gap and make it easier for scientists to use a variety of algorithms on their specific datasets. In particular, NIMBLE provides a programming environment in which researchers can implement algorithms that others can then easily apply to their own datasets. The work under this project will extend NIMBLE to provide computational methods for a very flexible class of statistical methods known as Bayesian nonparametric methods. These methods allow researchers to summarize variables and quantify relationships between different variables in an analysis while making fewer assumptions than standard statistical approaches. While Bayesian nonparametric methods have developed substantially in the last 10-15 years, many of them are hard or time-consuming for those working with data to implement on their own. This project will implement many such methods in the NIMBLE software, thereby making them available to practitioners for their day-to-day analyses. Moreover, it will provide a foundation for ongoing development and sharing of new and improved methods of this kind.
A large amount of research aims to improve the intertwined statistical and computational methods for analysis of hierarchical statistical models. Such research is important because problem-specific hierarchical models facilitate rapid advances in many scientific fields. However, statistical researchers have lacked a flexible software platform designed for programming and disseminating the many varieties of algorithms such as Markov chain Monte Carlo, sequential Monte Carlo, and methods that build upon them. The NIMBLE system provides such a software platform. This project helps to further fill that gap by extending the NIMBLE system to enable use of Bayesian nonparametric methods, with a focus on nonparametric mixture models, of which the Dirichlet process model and related models are widely known examples. This extension will allow routine application of these nonparametric mixture models as prior distributions for parts of arbitrary hierarchical models. The project will implement a variety of techniques for fitting Bayesian nonparametric mixtures, focusing on both collapsed and blocked samplers in Markov chain Monte Carlo algorithms. Such techniques have been highly developed by specialists but are limited in their research and scientific applications by the lack of a general implementation.
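The blocked samplers mentioned above are commonly built on a truncated stick-breaking representation of the Dirichlet process, in which mixture weights are formed by repeatedly breaking off a Beta-distributed fraction of a unit-length stick. The following is a minimal Python sketch of that representation only. It is not NIMBLE code (NIMBLE is an R package) and is not taken from the project; the function name `stick_breaking_weights`, the truncation level, and the concentration value are illustrative assumptions.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Return `truncation` mixture weights from a truncated stick-breaking draw.

    alpha      -- concentration parameter (hypothetical value chosen by the caller)
    truncation -- number of mixture components kept (an assumed cutoff)
    rng        -- a random.Random instance, passed in for reproducibility
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to any component
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick to break off
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # The last component takes whatever is left, so the weights sum to one.
    weights.append(remaining)
    return weights


rng = random.Random(0)
weights = stick_breaking_weights(alpha=1.0, truncation=20, rng=rng)
print("largest weights:", sorted(weights, reverse=True)[:5])
print("sum of weights:", sum(weights))
```

Smaller values of `alpha` tend to concentrate the weight on a few components, while larger values spread it over many; in a full sampler these weights would be updated together with the component parameters and the cluster assignments.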
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
This project supported implementation of Bayesian nonparametric statistical methods within NIMBLE, a general-purpose software platform for computational statistical methods with general hierarchical models. In many real-world settings, analysis of data is complicated by the fact that the data are not all independent, and there may be many explanatory variables or experimental treatments of interest. Hierarchical statistical models are commonly used as a way to account for such complicated relationships in a data set and thereby attain valid analysis results. However, while hierarchical models can represent complicated relationships, they are limited by their reliance on commonly used probability distributions (such as the normal distribution, Poisson distribution, and gamma distribution), which may not be flexible enough to capture the underlying variation in either outcome variables or intermediate variables in a model. One can specify mixtures (i.e., weighted averages) of such distributions for greater flexibility, but this relies on choosing the number of components of the mixture in advance. Bayesian nonparametric methods allow one to flexibly specify distributions without having to pre-specify the number of components; instead, the number of components is learned from the data. In doing so, they are also useful for clustering data, which can aid in interpretation. Because these methods let analysts avoid overly rigid assumptions about the distributional forms used in their models, analysts can have more confidence in the robustness of the results of Bayesian hierarchical models.
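One common way to see how the number of mixture components can be left unspecified is the Chinese restaurant process, the clustering prior implied by a Dirichlet process: each new observation joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to a concentration parameter. The Python sketch below simulates only this prior; it does not fit a model to data and is not NIMBLE code. The function name `sample_crp` and the parameter values are illustrative assumptions.

```python
import random


def sample_crp(n, alpha, rng):
    """Draw cluster labels for n observations from a Chinese restaurant process.

    alpha -- concentration parameter (hypothetical value chosen by the caller)
    rng   -- a random.Random instance, passed in for reproducibility
    """
    labels = []
    counts = []  # counts[k] = number of observations currently in cluster k
    for i in range(n):
        # Existing cluster k has weight counts[k]; a new cluster has weight alpha.
        # The total weight is i + alpha because the counts sum to i.
        u = rng.random() * (i + alpha)
        cumulative = 0.0
        chosen = len(counts)  # default: open a new cluster
        for k, c in enumerate(counts):
            cumulative += c
            if u < cumulative:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(0)
        counts[chosen] += 1
        labels.append(chosen)
    return labels, counts


rng = random.Random(1)
for n in (10, 100, 1000):
    labels, counts = sample_crp(n, alpha=1.0, rng=rng)
    print(n, "observations ->", len(counts), "clusters")
```

The number of clusters is never passed in; it emerges from the draw and tends to grow slowly as observations are added. In a fitted model, the posterior over cluster assignments combines this prior with the likelihood of the data.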
Bayesian nonparametric (BNP) methods have been widely developed in the statistical literature over the past five decades, and there is some specialized software for fitting BNP models. However, one could not previously use BNP methods easily in existing general-purpose software for hierarchical models. This project added BNP methods to the NIMBLE software. NIMBLE is an R package that includes three major components: a new implementation of the BUGS language for writing hierarchical statistical models; a domain-specific language embedded within R for programming algorithms to use with hierarchical models; and an algorithm library including methods for Markov chain Monte Carlo, sequential Monte Carlo, and Monte Carlo Expectation Maximization, among others. NIMBLE users can now easily specify flexible BNP-based mixture distributions within their models. The project also added the ability for NIMBLE to automatically provide MCMC sampling algorithms for the BNP-related components of a model, as part of the overall MCMC algorithm used to fit the model. As an example of the power of having BNP methods within general software, an analyst can now easily compare a standard model for their data, without BNP methods, to a version of the model that includes them. The BNP version avoids assumptions that may be overly restrictive and could reduce the robustness of the scientific results of the analysis.
Last Modified: 12/29/2021
Modified by: Christopher J Paciorek