
NSF Org: |
DMS Division Of Mathematical Sciences |
Recipient: |
|
Initial Amendment Date: | August 24, 2010 |
Latest Amendment Date: | August 24, 2010 |
Award Number: | 1007594 |
Award Instrument: | Standard Grant |
Program Manager: |
Gabor Szekely
DMS Division Of Mathematical Sciences MPS Directorate for Mathematical and Physical Sciences |
Start Date: | September 1, 2010 |
End Date: | August 31, 2013 (Estimated) |
Total Intended Award Amount: | $159,986.00 |
Total Awarded Amount to Date: | $159,986.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
426 AUDITORIUM RD RM 2 EAST LANSING MI US 48824-2600 (517)355-5040 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
426 AUDITORIUM RD RM 2 EAST LANSING MI US 48824-2600 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | STATISTICS |
Primary Program Source: |
|
Program Reference Code(s): | |
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.049 |
ABSTRACT
This research project provides simultaneous confidence regions for various functional features in functional data analysis (FDA), with asymptotic theory and guidance for practical implementation. Specifically, asymptotically correct confidence regions will be constructed for (1) the mean function of functional data and the coefficient function in the varying coefficient longitudinal regression model; and (2) the covariance function of functional data and the regression function in the functional linear model. For the simpler functions in (1), the investigator will employ both regression spline and local polynomial methods in order to establish rigorous asymptotic theory for both sparse and dense functional data. Results on strong approximation of partial sums by Brownian motions and advanced extreme value theory for sequences of non-stationary Gaussian processes will be applied to obtain distributional properties of the maximal deviation processes. For the more complicated functions in (2), the investigator will propose two-step estimators and show that they are asymptotically as efficient as some "infeasible" analogs. Asymptotic distributions for maximal deviations are established for the "infeasible estimators" and are then inherited by the two-step estimators.
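For reference, the notion of an asymptotically correct simultaneous confidence band can be written out as follows. The notation is assumed here for illustration and does not appear in the award text: m is the unknown function on a domain [a, b], \hat m_n is its estimator from a sample of size n, \sigma_n(x) is a scaling function, and 1 - \alpha is the nominal confidence level.

```latex
% Simultaneous coverage: the whole curve lies inside the band
\lim_{n \to \infty}
P\bigl( l_n(x) \le m(x) \le u_n(x) \ \text{for all } x \in [a,b] \bigr) = 1 - \alpha ,
\qquad
l_n(x),\, u_n(x) = \hat m_n(x) \mp q_{1-\alpha}\, \sigma_n(x) .

% The critical value comes from the maximal deviation
q_{1-\alpha} \ \text{is the } (1-\alpha) \text{ quantile of the limiting distribution of }
\sup_{x \in [a,b]} \frac{\lvert \hat m_n(x) - m(x) \rvert}{\sigma_n(x)} .
```

This is why the distribution of the maximal deviation matters: the band covers the entire curve exactly when the supremum of the standardized deviation stays below the critical value. A pointwise interval, by contrast, guarantees coverage only at each fixed x separately and is generally too narrow to cover the whole curve at level 1 - \alpha.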
Functional data, also known as curve data, consist of collections of digitally recorded curves or surfaces, often with random errors. Such data abound in virtually all scientific disciplines, including but not limited to climatology, clinical studies, epidemiology, evolutionary biology and food engineering/science. The need to draw information out of a sample of curves, coupled with the unleashing of modern computing power, has made functional data analysis (FDA) one of the most active areas of contemporary statistics research. While multivariate statistics is about unknown vectors and matrices, FDA concerns unknown curves and surfaces, and inference about them is most naturally done with confidence regions. The methods developed by the investigator fill a major gap in the current FDA methodology, which lacks procedures for drawing conclusions about an entire curve with quantifiable uncertainty. Code written in common software packages such as Matlab or R will be freely distributed so that practitioners from academia and industry can analyze functional data in real time, at significance levels of their own choosing. Completing this project depends crucially on several capable Ph.D. students working under the investigator's supervision, so state-of-the-art research is integrated with the training of graduate students as future researchers, consistent with NSF's education goal.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Functional data, also figuratively called curve data, consist of random collections of digitally recorded sample curves or surfaces, often contaminated with measurement errors. Since 2000, the study of functional data has been a focal point of mainstream statistics research, as such data pour in from virtually all scientific disciplines, including but not limited to climatology, clinical studies, epidemiology, evolutionary biology and food engineering/science.
While there has been a massive amount of research in functional data analysis (FDA), only a small fraction of it addresses the critical issue of statistical inference, namely, drawing conclusions about an entire curve or surface of interest with quantifiable uncertainty. Classical mathematical statistics provides data analysts with confidence intervals for single parameters and joint confidence regions for multiple parameters, but analogous constructs in the context of FDA barely existed prior to this project.
The most natural tools for drawing sound conclusions about unknown curves or surfaces are confidence bands and envelopes, which are simply two- or three-dimensional regions enclosed by an upper confidence curve or surface and a lower one, both constructed from the data. At the completion of this project, several types of simultaneous confidence bands have been made available for the mean of functional data of both sparse and dense type (i.e., each sample curve may have been recorded at a small or a large number of points). A simultaneous confidence envelope, which is not affected by the mean function, has also been provided for the covariance surface of functional data.
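To make the idea of a band concrete, here is a minimal Python sketch for dense functional data observed on a common grid. It is not the investigator's method: the report describes spline and local polynomial estimators with critical values from extreme value theory, whereas this sketch swaps in a plain pointwise average and a Gaussian multiplier bootstrap for the critical value. The function name `simultaneous_band`, its parameters, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def simultaneous_band(curves, alpha=0.05, n_boot=2000, seed=0):
    """Illustrative simultaneous confidence band for a mean curve.

    curves : array of shape (n, N); each row is one sample curve
             recorded on the same grid of N points (dense design).
    Returns (lower, upper, crit), where crit is the bootstrap
    critical value for the maximal standardized deviation.
    """
    curves = np.asarray(curves, dtype=float)
    n = curves.shape[0]
    rng = np.random.default_rng(seed)

    mean = curves.mean(axis=0)          # pointwise mean estimate
    sd = curves.std(axis=0, ddof=1)     # pointwise standard deviation
    if np.any(sd == 0):
        raise ValueError("zero pointwise variance; cannot standardize")

    # Standardized residual curves, shape (n, N).
    resid = (curves - mean) / sd

    # Multiplier bootstrap (a swapped-in technique): reweight the
    # residual curves with i.i.d. N(0, 1) multipliers and record the
    # maximal absolute deviation over the grid for each replicate.
    xi = rng.standard_normal((n_boot, n))
    sup_dev = np.abs(xi @ resid).max(axis=1) / np.sqrt(n)

    crit = np.quantile(sup_dev, 1.0 - alpha)
    half_width = crit * sd / np.sqrt(n)
    return mean - half_width, mean + half_width, crit


if __name__ == "__main__":
    # Synthetic data (assumed): smooth mean, one random smooth
    # component per curve, and independent measurement error.
    rng = np.random.default_rng(1)
    grid = np.linspace(0.0, 1.0, 50)
    curves = (
        np.sin(2 * np.pi * grid)
        + rng.standard_normal((100, 1)) * np.cos(np.pi * grid)
        + 0.3 * rng.standard_normal((100, 50))
    )
    lower, upper, crit = simultaneous_band(curves, alpha=0.05)
    print("critical value:", crit)
    print("band width at first grid point:", upper[0] - lower[0])
```

The critical value returned here is typically larger than the pointwise normal quantile, which is what widens the band enough to cover the whole curve at once. For sparse designs, where each curve is recorded at only a few points, the pointwise average is not available and a smoother would be needed in its place.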
Code written in the popular software package R to compute confidence bands for sparse functional data will be available on the internet so that practitioners from academia and industry can use it freely to analyze functional data in real time, at significance levels of their own choosing. Three Ph.D. students worked on the project and were trained to become capable researchers in FDA, consistent with NSF's education goal.
Last Modified: 11/29/2013
Modified by: Lijian Yang