
NSF Org: | DMS Division Of Mathematical Sciences |
Recipient: |
|
Initial Amendment Date: | May 30, 2012 |
Latest Amendment Date: | July 30, 2014 |
Award Number: | 1208896 |
Award Instrument: | Continuing Grant |
Program Manager: | Gabor Szekely, DMS Division Of Mathematical Sciences, MPS Directorate for Mathematical and Physical Sciences |
Start Date: | September 1, 2012 |
End Date: | August 31, 2016 (Estimated) |
Total Intended Award Amount: | $149,991.00 |
Total Awarded Amount to Date: | $149,991.00 |
Funds Obligated to Date: | FY 2013 = $61,331.00; FY 2014 = $57,758.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: | 1 UTSA CIR SAN ANTONIO TX US 78249-1644 (210)458-4340 |
Sponsor Congressional District: |
|
Primary Place of Performance: | One UTSA Circle San Antonio TX US 78249-1644 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | STATISTICS |
Primary Program Source: | 01001314DB NSF RESEARCH & RELATED ACTIVIT; 01001415DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): | |
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.049 |
ABSTRACT
Statistical methods for the analysis of spatial discrete data are relatively underdeveloped when compared to methods for continuous data. This is a notable methodological gap since the former are routinely collected in the earth and social sciences. For instance, death counts due to different causes are collected on a regular basis by government agencies throughout the entire U.S. and classified according to different demographic variables, such as age, gender and race. This project aims to fill this gap by developing a comprehensive study of models for geostatistical discrete data. The project consists of three parts. First, a class of hierarchical spatial models is developed that seeks to ameliorate some limitations of currently used models identified by the investigator. Some of these limitations, relating to the spatial association structures representable by these models, are especially severe when the data consist mostly of small counts, precisely the case when models describing the discreteness of the data are most needed. The properties of these new models and likelihood-based methods to fit them are studied. Second, a class of non-hierarchical spatial models is developed that seeks to represent a wide range of spatial discrete data, not just counts, having spatial association structures that are complementary to those in the class of hierarchical spatial models. The models in this class are constructed by separately modeling the marginal and spatial association structures, using an approach akin to copulas. The properties of these models and likelihood-based methods to fit them are also studied. Third, a recently proposed Bayesian method to assess goodness-of-fit of statistical models is studied and its soundness for use in the aforementioned classes of models explored. The method, based on a distributional identity between pivotal quantities evaluated at different parameter values, is applicable to both hierarchical and non-hierarchical models. Developing such methods is a pressing need since formal methods to assess model adequacy of spatial models are notoriously lacking.
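The idea of modeling the marginal distributions and the association structure separately can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it uses Poisson marginals and a Gaussian copula for a single pair of locations, and the function names (poisson_quantile, simulate_copula_counts, sample_corr) are hypothetical, not taken from the project or its software.

```python
import math
import random
from statistics import NormalDist


def poisson_quantile(u, mean):
    """Smallest k with P(Y <= k) >= u for Y ~ Poisson(mean), by direct summation."""
    k = 0
    pmf = math.exp(-mean)  # P(Y = 0)
    cdf = pmf
    while cdf < u and pmf > 0.0:
        k += 1
        pmf *= mean / k  # recursion P(Y = k) = P(Y = k - 1) * mean / k
        cdf += pmf
    return k


def simulate_copula_counts(n, mean, rho, seed=0):
    """Draw n pairs of counts with Poisson(mean) marginals and a Gaussian copula.

    Assumption for illustration: the association is carried by a latent
    bivariate standard normal vector with correlation rho; the marginal
    distribution is chosen independently of it.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map latent normals to uniforms, then to counts via the Poisson quantile.
        # The clamp keeps the quantile search finite if the uniform rounds to 1.
        u1 = min(std_normal.cdf(z1), 1.0 - 1e-12)
        u2 = min(std_normal.cdf(z2), 1.0 - 1e-12)
        pairs.append((poisson_quantile(u1, mean), poisson_quantile(u2, mean)))
    return pairs


def sample_corr(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        counts = simulate_copula_counts(20000, mean=0.5, rho=rho, seed=1)
        print(f"latent rho {rho:.1f}: sample correlation of counts {sample_corr(counts):.3f}")
```

In this construction the marginal mean is set without reference to the latent correlation, so the association between counts can be varied while the marginal distribution stays fixed.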
Spatial data are nowadays routinely collected in many earth and social sciences, such as ecology, epidemiology, demography and geography, but methodology for the analysis of discrete data (say death counts) is much less developed than the corresponding methodology for the analysis of continuous data (say temperature). The investigator proposes to fill this gap by constructing new classes of models that, on the one hand, ameliorate some limitations of currently used models identified by the investigator and, on the other hand, broaden the range of data patterns the models can represent. The project will also develop methodology to assess model adequacy for the newly proposed models, a ubiquitous task in science since any model is an imperfect representation of the phenomenon under study. The statistical methodology developed in the course of this project would have immediate methodological and practical impacts on the earth and social sciences, where spatial discrete data are routinely collected but models and methods for their analysis are scarce. The proposed classes of models will substantially increase the arsenal of tools available to spatial data analysts and the possibility of representing a wide range of behaviors for spatial discrete data. Graduate students will be engaged in the project, which will contribute to their statistical training in Bayesian methods and spatial statistics, as well as to the future development of the Ph.D. program in Applied Statistics at the University of Texas at San Antonio.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The outcomes of this project increase the understanding of current models for the analysis of geostatistical count data. It was shown that the dependence structure of hierarchical Poisson models, which are arguably the current state of the art for modeling and analysis of geostatistical count data, is limited in a way that makes these models unable to describe datasets that consist mostly of small counts displaying substantial spatial correlation. On the other hand, the dependence structure of Gaussian copula models, a less widely known and used class of models, is much more flexible. So the latter class of models is a useful alternative to the more commonly used hierarchical Poisson models for the analysis of geostatistical count data. Both classes of models are useful options for the analysis of this type of data, and the choice between them should be based on the inferential goals and nature of the data.
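The limitation for small counts can be seen in a simple special case. The sketch below assumes a Poisson-lognormal hierarchy, which is a common instance of a hierarchical Poisson model but not necessarily the exact form studied in the project: counts are conditionally independent Poisson with log-means given by a constant plus a latent Gaussian field of variance sigma2 and correlation rho. Under that assumption, the correlation between two counts with marginal mean m is (exp(sigma2 * rho) - 1) / (1/m + exp(sigma2) - 1). The function name poisson_lognormal_corr is hypothetical.

```python
import math


def poisson_lognormal_corr(mean_count, sigma2, rho):
    """Correlation between two counts in an assumed Poisson-lognormal hierarchy.

    Assumed model (illustration only):
      Y_i | S ~ Poisson(exp(mu + S_i)), conditionally independent,
      (S_1, S_2) bivariate normal with zero means, variance sigma2, correlation rho.
    mean_count is the marginal mean E[Y_i] = exp(mu + sigma2 / 2).
    """
    if mean_count <= 0 or sigma2 <= 0 or not -1.0 <= rho <= 1.0:
        raise ValueError("need mean_count > 0, sigma2 > 0 and -1 <= rho <= 1")
    # Cov(Y_1, Y_2) equals the covariance of the two lognormal means.
    cov = mean_count ** 2 * (math.exp(sigma2 * rho) - 1.0)
    # Var(Y_i) = Poisson variation + variation of the lognormal mean.
    var = mean_count + mean_count ** 2 * (math.exp(sigma2) - 1.0)
    return cov / var


if __name__ == "__main__":
    # rho = 1 gives the largest count correlation attainable for this sigma2.
    for m in (0.5, 2.0, 10.0, 100.0):
        bound = poisson_lognormal_corr(m, sigma2=1.0, rho=1.0)
        print(f"mean count {m:6.1f}: upper bound on correlation {bound:.3f}")
```

For a fixed latent variance, the 1/m term in the denominator dominates when the mean count is small, so even a perfectly correlated latent field yields only weakly correlated counts in this assumed model.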
Also, the model known as Poisson kriging was revisited and its properties explored. It is simpler than the aforementioned models and, because of this simplicity and its similarity to popular geostatistical methods, offers an attractive alternative for practitioners with limited statistical training.
Two R packages were developed that implement statistical methodology for the analysis of the models studied in this project. The package 'geoCount' implements Markov chain Monte Carlo algorithms for the Bayesian analysis of hierarchical Poisson models, while the package 'gcKrig' implements Monte Carlo algorithms for the frequentist analysis of Gaussian copula models. Both were submitted to CRAN, so they are publicly available.
Finally, a Ph.D. student was trained in the course of this project.
Last Modified: 11/25/2016
Modified by: Victor Deoliveira