
NSF Org: |
IIS Division of Information & Intelligent Systems |
Recipient: |
|
Initial Amendment Date: | September 10, 2018 |
Latest Amendment Date: | August 5, 2019 |
Award Number: | 1838200 |
Award Instrument: | Continuing Grant |
Program Manager: |
Hector Munoz-Avila
IIS Division of Information & Intelligent Systems CSE Directorate for Computer and Information Science and Engineering |
Start Date: | October 1, 2018 |
End Date: | September 30, 2022 (Estimated) |
Total Intended Award Amount: | $950,337.00 |
Total Awarded Amount to Date: | $950,337.00 |
Funds Obligated to Date: |
FY 2019 = $147,679.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: |
201 DOWMAN DR NE ATLANTA GA US 30322-1061 (404)727-2503 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
400 Dowman Dr Atlanta GA US 30322-1005 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Big Data Science & Engineering |
Primary Program Source: |
01001920DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Spatio-temporal analyses can enable many discoveries, with applications that include reducing traffic congestion, identifying hotspot areas for deploying mobile clinics, and urban planning. Unfortunately, the data poses many computational challenges. The complex nature of spatio-temporal data violates standard assumptions in machine learning and data mining algorithms. The complicating properties include spatial and temporal correlation of observations, dynamic and abrupt changes in observations, variability in the length and frequency of measurements, and data drawn from multiple heterogeneous sources. In recognition of these challenges, various efforts have been undertaken to develop specialized spatio-temporal models. Yet, to date, these algorithms are predominantly designed to analyze small- to medium-sized datasets. The goal of this project is to develop a comprehensive computational tensor platform to perform automated, data-driven discovery from spatio-temporal data across a broad range of applications. The project also includes a set of integrated educational activities such as a Massive Open Online Course that covers cross-disciplinary topics at the confluence of computer science and geospatial applications, annual spatio-temporal data challenges and hackathons, and an annual event at the Atlanta Science Festival to create public awareness and encourage participation by women and minorities.
The project will contain algorithmic innovations that reflect appropriate assumptions of spatio-temporal data without sacrificing real-time performance, computational scalability, or cross-site learning, even under privacy constraints. The proposed platform will generalize tensor modeling to encompass the complex nature of spatio-temporal data, including time irregularity, spatio-temporal correlations, and evolving distributions. It will enable the integration of data from multiple heterogeneous sources to yield robust and cohesive learned patterns. The novel algorithms will also facilitate learning in decentralized settings while preserving privacy. The computational platform will contain interchangeable modules that can adapt to new spatio-temporal settings and incorporate additional contextual information. The accompanying suite of algorithms will enable predictive learning, pattern mining, and change detection from large-scale spatio-temporal data. The broad applicability of the project will be demonstrated on a diverse range of data including urban transportation services, real estate market transactions, and population health. The algorithmic innovations introduced can be used to scale other machine learning models.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Spatiotemporal data poses computational challenges due to the inter-dependencies amongst the observations that are not readily encapsulated in common data structures. Tensors, a generalization of vectors and matrices to multiway data, are natural representations for capturing the high-dimensional interactions across space and time. By leveraging the powerful and flexible tensor data structure, automated and data-driven discovery can be performed on spatiotemporal data. The project outcomes include a suite of algorithmic developments and theoretical advancements to scale and distribute tensor analysis, as well as validation across a variety of applications.
The intellectual merit is highlighted by the delivery of 1) new spatiotemporal tensor factorization models, 2) new methodologies to support data analysis in the streaming setting, where the data cannot all be readily stored or accessed more than once, 3) new scalable algorithms that do not require a high-performance machine and offer faster convergence, 4) a new privacy-preserving federated tensor factorization model that offers differential privacy guarantees, and 5) the first communication-efficient, decentralized tensor factorization model that works for multiple network topologies. Furthermore, the project successfully validated the broad applicability of tensor factorization in multiple domains including healthcare, urban transportation, social media, and crime prediction. The project outcomes have been disseminated through various conference venues, workshops, tutorials, and invited talks in the fields of machine learning, data mining, and medical informatics. The findings and algorithms have been incorporated into multiple Emory courses. The project supported 1 postdoctoral fellow, 11 PhD students, and 5 undergraduates. The project has also taken steps towards advancing diversity, equity, and inclusion in the sciences by supporting 7 graduate and 4 undergraduate students from underrepresented groups.
Last Modified: 01/17/2023
Modified by: Joyce C Ho