Award Abstract # 1811779
Exact and Asymptotic Distribution Theory for General Gaussian Processes

NSF Org: DMS
Division Of Mathematical Sciences
Recipient: WILLIAM MARSH RICE UNIVERSITY
Initial Amendment Date: May 11, 2018
Latest Amendment Date: May 11, 2018
Award Number: 1811779
Award Instrument: Standard Grant
Program Manager: Pena Edsel
DMS Division Of Mathematical Sciences
MPS Directorate for Mathematical and Physical Sciences
Start Date: July 1, 2018
End Date: June 30, 2022 (Estimated)
Total Intended Award Amount: $250,000.00
Total Awarded Amount to Date: $250,000.00
Funds Obligated to Date: FY 2018 = $250,000.00
History of Investigator:
  • Philip Ernst (Principal Investigator)
    philip.ernst@rice.edu
  • Frederi Viens (Co-Principal Investigator)
Recipient Sponsored Research Office: William Marsh Rice University
6100 MAIN ST
Houston
TX  US  77005-1827
(713)348-4820
Sponsor Congressional District: 09
Primary Place of Performance: William Marsh Rice University
6100 Main St
Houston
TX  US  77005-1827
Primary Place of Performance Congressional District: 09
Unique Entity Identifier (UEI): K51LECU1G8N3
Parent UEI:
NSF Program(s): STATISTICS
Primary Program Source: 01001819DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1269
Program Element Code(s): 126900
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.049

ABSTRACT

This project will further the development of exact and asymptotic distribution theory for Gaussian processes and their quadratic forms. While modern advances in data science owe much progress to computational methods and the rapid growth in computer technology, statistics and applied probability are rife with examples where a careful mathematical analysis allows discoveries that no amount of computational power can uncover. This project is one such example and will build on the PI's work on Yule's so-called "nonsense" correlation, a 90-year-old open problem that was solved last year using tools from mathematical analysis. This explicit calculation showed the precise scale of the apparent correlation between two independent continuous series of data, such as those encountered in economics, climate science, finance, and many other fields. This mathematical explanation of an apparent statistical paradox will enable the investigation of other important questions in mathematical statistics. The project will investigate a possible connection between some important open questions and a set of tools in probability theory whose power mathematical statisticians have only begun to explore. The project will provide fertile ground for statistics graduate student training at Rice and Michigan State Universities; students will benefit from a wide scope of opportunities, from rigorous study of mathematical tools, to their use in statistics, to applications in fields of great societal value.

This project will investigate the probability law of the Pearson correlation between two independent or dependent Gaussian processes. Analyses of distributions in the second Wiener chaos (quadratic forms of normals) are a new set of tools that will be brought to bear. Those tools are flexible enough to handle any Gaussian process via its so-called Karhunen-Loève expansion. In terms of applications, what is most striking is that any statistical estimation or test based on these projected studies would require only a single observation or a pair of observations; this is particularly useful in situations, such as in environmental statistics or in economics, where experiments cannot be designed and one has to work with the available observable data collected dynamically in time. The second emphasis in this study, on Polya frequency functions and related densities, uses some of the same mathematical tools, thanks to a realization that the densities can be represented and expanded explicitly in the second Wiener chaos. The project seeks to prove when a density is strongly log-concave (i.e., its logarithm has a second derivative that is bounded away from zero). This question, which in mathematical statistics is phrased more broadly in terms of Polya frequency functions and concerns the distribution of sums of independent and non-identically distributed exponentials, extends to the case of general second-chaos distributions. The project could have important consequences in the practice of statistics, especially in areas where comparing non-trivial time series is a challenge, and in many scientific fields informed by properties of log-concavity and strong log-concavity.
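
For orientation, the three objects named in this paragraph can be written out in standard textbook notation. The symbols below (the eigenpairs \lambda_k, e_k of the covariance operator, the coefficients \mu_k, the constant c, and the interval [0, T]) are notational assumptions, not taken from the award text.

```latex
% Karhunen-Loeve expansion of a centered Gaussian process X on [0, T],
% with (\lambda_k, e_k) the eigenpairs of its covariance operator
X_t = \sum_{k \ge 1} \sqrt{\lambda_k}\, Z_k\, e_k(t),
\qquad Z_k \ \text{i.i.d.}\ N(0,1).

% A generic element of the second Wiener chaos (quadratic form of normals)
F = \sum_{k \ge 1} \mu_k \left( Z_k^2 - 1 \right),
\qquad \sum_{k \ge 1} \mu_k^2 < \infty.

% Strong log-concavity of a density f = e^{-\varphi}
\varphi''(x) \ge c > 0 \quad \text{for all } x .
```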

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 27)
Araya, Héctor and Bahamonde, Natalia and Torres, Soledad and Viens, Frederi "Donsker type theorem for fractional Poisson process" Statistics & Probability Letters, v.150, 2019 https://doi.org/10.1016/j.spl.2019.01.036
D Ikpe, R Mawonike "Static Markowitz mean-variance portfolio selection model with long-term bonds" Numerical Algebra, Control and Optimization, v.0, 2022 https://doi.org/10.3934/naco.2022030
Douissi, Soukaina and Es-Sebaiy, Khalifa and Alshahrani, Fatimah and Viens, Frederi G. "AR(1) processes driven by second-chaos white noise: Berry-Esséen bounds for quadratic variation and parameter estimation" Stochastic Processes and their Applications, 2020 https://doi.org/10.1016/j.spa.2020.02.007
Douissi, Soukaina and Es-Sebaiy, Khalifa and Viens, Frederi G. "Berry-Esséen bounds for parameter estimation of general Gaussian processes" Latin American Journal of Probability and Mathematical Statistics, v.16, 2019 https://doi.org/10.30757/ALEA.v16-23
Douissi, Soukaina and Es-Sebaiy, Khalifa and Viens, Frederi "Asymptotics of Yule's nonsense correlation for Ornstein-Uhlenbeck paths: A Wiener chaos approach" Electronic Journal of Statistics, v.16, 2022 https://doi.org/10.1214/22-EJS2021
Ernst, P. A. and Peskir, G. and Zhou, Q. "Optimal real-time detection of a drifting Brownian coordinate" The Annals of Applied Probability, v.30, 2020 https://doi.org/10.1214/19-AAP1522
Ernst, Philip A. and Asmussen, Søren J. and Hasenbein, John "Stability and busy periods in a multiclass queue with state-dependent arrival rates" Queueing Systems, v.90, 2018 https://doi.org/10.1007/s11134-018-9587-9
Ernst, Philip A. and Franceschi, Sandro "Asymptotic behavior of the occupancy density for obliquely reflected Brownian motion in a half-plane and Martin boundary" The Annals of Applied Probability, v.31, 2021 https://doi.org/10.1214/21-AAP1681
Ernst, Philip A. and Franceschi, Sandro and Huang, Dongzhou "Escape and absorption probabilities for obliquely reflected Brownian motion in a quadrant" Stochastic Processes and their Applications, v.142, 2021 https://doi.org/10.1016/j.spa.2021.06.003
Ernst, Philip A. and Imerman, Michael B. and Shepp, Larry and Zhou, Quan "Fiscal stimulus as an optimal control problem" Stochastic Processes and their Applications, 2021 https://doi.org/10.1016/j.spa.2021.05.009
Ernst, Philip A. and Kagan, Abram M. and Rogers, L.C.G. "The least favorable noise" Electronic Communications in Probability, v.27, 2022 https://doi.org/10.1214/22-ECP467

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project has built the mathematical foundations for designing the first demonstrably correct statistical tests of independence for pairs of paths of Gaussian processes: Wiener processes, Ornstein-Uhlenbeck (OU) processes, fractional Ornstein-Uhlenbeck (fOU) processes, and fractional Brownian motion (fBm). The importance of constructing such tests is motivated by our 2017 paper on Yule's so-called "nonsense" correlation, in which we provide the precise scale of the apparent correlation between two independent continuous series of data, such as those encountered in economics, climate science, finance, and many other fields.

 

We began this project by developing theory and methodology for calculating all moments up to order 16 of Yule's "nonsense" correlation for two independent Wiener processes. This allows us to provide the first density approximation to Yule's "nonsense" correlation. The methodology we develop is broad in spirit, allowing us to work conclusively in all settings where the Gaussian process arises as the solution of a linear stochastic differential equation (SDE). We then employ these methods to explicitly calculate the moments of the empirical correlation for two correlated Brownian motions (with a given correlation coefficient), two independent Ornstein-Uhlenbeck processes, and two independent Brownian bridges. Establishing unequivocal mathematical facts that are simple to explain, such as calculations of second moments of the empirical correlation for the processes considered above, holds considerable potential for raising awareness of the risks associated with the widespread misinterpretation of Pearson correlation coefficients.
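
For readers unfamiliar with the statistic, the empirical (path-wise Pearson) correlation of two Wiener processes is usually written as follows. The time horizon [0, 1] and the symbols W_1, W_2, and \rho are notational assumptions made here for illustration; the report itself does not fix them.

```latex
\rho \;=\;
\frac{\displaystyle \int_0^1 W_1(t)\,W_2(t)\,dt \;-\; \int_0^1 W_1(t)\,dt \int_0^1 W_2(t)\,dt}
     {\sqrt{\displaystyle
        \left( \int_0^1 W_1(t)^2\,dt - \Big( \int_0^1 W_1(t)\,dt \Big)^{2} \right)
        \left( \int_0^1 W_2(t)^2\,dt - \Big( \int_0^1 W_2(t)\,dt \Big)^{2} \right)}}
```

When W_1 and W_2 are independent, replacing W_2 by -W_2 leaves the joint law unchanged but flips the sign of \rho, so all odd moments vanish and the second moment E[\rho^2] equals the variance.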

We then explore the following question: what is the distribution of the empirical correlation for two independent Gaussian random walks? This question is of interest not least because discrete stochastic process data (for example, time series data) are the kind most often encountered in practice. A test statistic for discrete processes is thus easier for practitioners to apply than one for continuous stochastic processes. Studying the discrete-data test statistic directly is also a means of minimizing the risk of misapplying the continuous statistic when the discrete-data situation is not sufficiently well approximated by a continuous-data one. In this vein, we succeed in providing an exact formula for the second moment of the empirical correlation of two independent Gaussian random walks (as well as implicit formulas for higher moments). We also provide rates of convergence of the empirical correlation of two independent Gaussian random walks to the empirical correlation of two independent Wiener processes, with explicit upper bounds in terms of the Wasserstein distance.
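
The phenomenon described above can be illustrated with a small Monte Carlo sketch. It is a simulation only and does not reproduce the exact moment formula derived in the project. The function names, path length, number of replications, and seed are all hypothetical choices made for illustration.

```python
import math
import random


def random_walk(n_steps, rng):
    """Partial sums of i.i.d. standard normal increments."""
    path, position = [], 0.0
    for _ in range(n_steps):
        position += rng.gauss(0.0, 1.0)
        path.append(position)
    return path


def empirical_correlation(x, y):
    """Pearson correlation of two equal-length sequences, computed along the path."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_moments(n_paths=1000, n_steps=200, seed=0):
    """Monte Carlo estimates of the first and second moments of the
    empirical correlation of two independent Gaussian random walks."""
    rng = random.Random(seed)
    rhos = [
        empirical_correlation(random_walk(n_steps, rng), random_walk(n_steps, rng))
        for _ in range(n_paths)
    ]
    first = sum(rhos) / n_paths
    second = sum(r * r for r in rhos) / n_paths
    return first, second


if __name__ == "__main__":
    first, second = simulate_moments()
    print(f"mean of rho: {first:.3f}, second moment of rho: {second:.3f}")
```

Although the two walks are independent, the estimated second moment stays well away from zero however long the walks are, which is why a naive reading of a large sample correlation between two such series is misleading.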

We proceed to work on statistical inference for discrete-time second-chaos processes, as well as for Gaussian (discrete-time) processes. We compute the quadratic variations of all AR(1) stationary time series in the second chaos, and estimate their speeds of convergence to normality in total variation. In addition to working with discrete-time second-chaos processes, we consider basic objects in the fourth and second Wiener chaos which are directly relevant to their statistical properties; moreover, these objects are statistics of the processes' entire paths. Understanding how these objects behave asymptotically is a necessary first step toward quantitative estimates for asymptotic normality of the Pearson correlation for stationary processes like the Ornstein-Uhlenbeck process, as well as toward determining when the graininess of the time scale affects the correlation's distribution and when it does not. In fact, the second-chaos AR(1) processes all converge, under proper scaling, to the Gaussian Ornstein-Uhlenbeck process, but we are interested in mesoscopic scales where the fluctuations are too far from normal for normality to be relied upon.
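
As a rough illustration of the objects in this paragraph, the sketch below simulates an AR(1) series driven by second-chaos noise and computes its quadratic variation. Everything specific in it is an assumption made for illustration: the noise is normalized as (Z^2 - 1)/sqrt(2) so that it has unit variance, the function names are hypothetical, and the moment-based estimator of |theta| is a simple textbook device, not necessarily the estimator analyzed in the publications listed above.

```python
import math
import random


def second_chaos_noise(rng):
    """One draw of (Z^2 - 1) / sqrt(2): mean 0, variance 1, non-Gaussian."""
    z = rng.gauss(0.0, 1.0)
    return (z * z - 1.0) / math.sqrt(2.0)


def simulate_ar1(theta, n, rng, burn_in=1000):
    """X_i = theta * X_{i-1} + eps_i; a burn-in is discarded so the
    retained path is close to the stationary regime."""
    x, path = 0.0, []
    for i in range(n + burn_in):
        x = theta * x + second_chaos_noise(rng)
        if i >= burn_in:
            path.append(x)
    return path


def quadratic_variation(path):
    """Q_n = (1/n) * sum of X_i^2."""
    return sum(v * v for v in path) / len(path)


def estimate_abs_theta(path):
    """With unit-variance noise the stationary variance is 1 / (1 - theta^2),
    so Q_n suggests the moment estimator |theta| ~ sqrt(1 - 1 / Q_n)."""
    q = quadratic_variation(path)
    return math.sqrt(max(0.0, 1.0 - 1.0 / q))


if __name__ == "__main__":
    rng = random.Random(1)
    path = simulate_ar1(theta=0.5, n=50000, rng=rng)
    print("quadratic variation:", quadratic_variation(path))
    print("estimated |theta|:", estimate_abs_theta(path))
```

The estimator only recovers the absolute value of theta, since the quadratic variation depends on theta through theta^2 alone.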

The relevant notions of tests of independence of pairs of paths of stochastic processes we have considered, and which we will continue to consider, are manifold: from short-range to long-range correlation for individual paths, to whether single pairs or finite sets of paths are statistically related, and to applied consequences when considering questions of attribution of factors for real-world phenomena, particularly relating to weather and climate. Important applied aims include an investigation of climate-related risks, such as sea-level rise and extreme weather events, particularly in the North Atlantic Ocean, how they correlate dynamically over medium and long terms, and how heavy-tailed they are. We would be remiss if we did not highlight the recent misuse of Pearson correlation in the area of late-Holocene paleoclimatology, an area whose importance for projecting our planet's climate in the next 200 or 1000 years cannot be overstated.

 

Finally, our project has also contributed significantly to both graduate and undergraduate training. The support of the NSF has enabled us to train four Ph.D. students, all of whom are now active contributors to the STEM research community. We have also engaged with undergraduate and master's students on aspects of this project relevant to climate science, to agricultural economics, and to development economics in least-developed countries.

 


Last Modified: 12/14/2022
Modified by: Philip Ernst

Please report errors in award information by writing to: awardsearch@nsf.gov.
