Award Abstract # 1651565
CAREER: Modeling and Inference for Large Scale Spatio-Temporal Data

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: THE LELAND STANFORD JUNIOR UNIVERSITY
Initial Amendment Date: March 8, 2017
Latest Amendment Date: January 28, 2019
Award Number: 1651565
Award Instrument: Continuing Grant
Program Manager: Kenneth Whang
kwhang@nsf.gov
 (703)292-5149
IIS
 Division of Information & Intelligent Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: March 15, 2017
End Date: February 29, 2024 (Estimated)
Total Intended Award Amount: $540,000.00
Total Awarded Amount to Date: $540,000.00
Funds Obligated to Date: FY 2017 = $102,088.00
FY 2018 = $104,946.00
FY 2019 = $332,966.00
History of Investigator:
  • Stefano Ermon (Principal Investigator)
    ermon@cs.stanford.edu
Recipient Sponsored Research Office: Stanford University
450 JANE STANFORD WAY
STANFORD
CA  US  94305-2004
(650)723-2300
Sponsor Congressional District: 16
Primary Place of Performance: Stanford University
353 Serra Mall, Room 228
Stanford
CA  US  94305-5008
Primary Place of Performance Congressional District: 16
Unique Entity Identifier (UEI): HJD6G4D6TJY5
Parent UEI:
NSF Program(s): Robust Intelligence
Primary Program Source: 01001718DB NSF RESEARCH & RELATED ACTIVIT
01001819DB NSF RESEARCH & RELATED ACTIVIT
01001920DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 7495
Program Element Code(s): 749500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Key sustainability challenges, such as poverty mitigation, climate change, and food security, involve global phenomena that are unique in scale and complexity. Our global sensing capabilities - from remote sensing to crowdsourcing - are becoming increasingly economical and accurate. These recent technological developments are creating new spatio-temporal data streams that contain a wealth of information relevant to sustainable development goals. Actionable insights, however, cannot be easily extracted because the sheer size and unstructured nature of the data preclude traditional analysis techniques. This five-year career-development plan is an integrated research, education, and outreach program focused on developing new AI techniques to extract actionable insights from large-scale spatio-temporal data. These techniques have the potential to yield accurate, inexpensive, and highly scalable models to inform research and policy.

The research goal of this project is to develop new modeling and algorithmic frameworks to help address global sustainability challenges involving spatio-temporal data. This research will develop new predictive models of complex spatio-temporal phenomena integrating in unique ways ideas from graphical models and representation learning, improving their overall performance. New approaches to learn from unlabeled data exploiting various forms of prior domain knowledge, including spatio-temporal dependencies and relationships between different data modalities, will be developed. To learn models and make predictions at scale, this project will also develop new scalable probabilistic inference methods based on the use of random projections to reduce the dimensionality of probabilistic models while preserving their key properties. The techniques developed will be made available to both academia and industry through open-source software, and will enable computationally feasible approaches for analyzing large spatio-temporal datasets and for modeling global-scale phenomena. Predictions and data products produced by this project will enable new analyses and advance sustainability disciplines. Results will be disseminated widely through scientific articles, research seminars, and conference presentations to maximize the benefits to the scientific community. Educational and outreach efforts will include the involvement of undergraduate students undertaking independent research projects, a website describing research bridging computation and sustainability, and a summer outreach program aimed at introducing under-represented high-school students to computer science and artificial intelligence.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 43)
Ayush, Kumar and Uzkent, Burak and Burke, Marshall and Lobell, David and Ermon, Stefano "Efficient Poverty Mapping from High Resolution Remote Sensing Images" Proceedings of the AAAI Conference on Artificial Intelligence, 2021
Ayush, Kumar and Uzkent, Burak and Burke, Marshall and Lobell, David and Ermon, Stefano "Generating Interpretable Poverty Maps using Object Detection in Satellite Images" International Joint Conferences on Artificial Intelligence, 2020
Ayush, Kumar and Uzkent, Burak and Meng, Chenlin and Tanmay, Kumar and Burke, Marshall and Lobell, David and Ermon, Stefano "Geography-Aware Self-Supervised Learning" IEEE International Conference on Computer Vision workshops, 2021 https://doi.org/10.1109/ICCV48922.2021.01002
Burke, Marshall and Driscoll, Anne and Lobell, David B. and Ermon, Stefano "Using satellite imagery to understand and promote sustainable development" Science, v.371, 2021 https://doi.org/10.1126/science.abe8628
Cong, Yezhen and Khanna, Samar and Meng, Chenlin and Liu, Patrick and Rozi, Erik and He, Yutong and Burke, Marshall and Lobell, David and Ermon, Stefano "SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery" Advances in neural information processing systems, 2022
Cundy, Chris and Ermon, Stefano "Flexible Approximate Inference via Stratified Normalizing Flows" Proceedings of UAI, 2020
Eissman, Stephan and Levy, Daniel and Shu, Rui and Bartzsch, Stefan and Ermon, Stefano "Bayesian Optimization and Attribute Adjustment" Proc. 34th Conference on Uncertainty in Artificial Intelligence, 2018
Grover, Aditya and Ermon, Stefano "Uncertainty Autoencoders: Learning Compressed Representations via Variational Information Maximization" Proc. 35th Conference on Uncertainty in Artificial Intelligence, 2019
Grover, Aditya and Gummadi, Ramki and Lazaro-Gredilla, Miguel and Schuurmans, Dale and Ermon, Stefano "Variational Rejection Sampling" Proc. 21st International Conference on Artificial Intelligence and Statistics, 2018
He, Yutong and Wang, Dingjie and Lai, Nicholas and Zhang, William and Meng, Chenlin and Burke, Marshall and Lobell, David and Ermon, Stefano "Spatial-Temporal Super-Resolution of Satellite Imagery via Conditional Pixel Synthesis" Advances in neural information processing systems, 2021
Jean, Neal and Wang, Sherrie and Samar, Anshul and Azzari, George and Lobell, David and Ermon, Stefano "Tile2Vec: Unsupervised Representation Learning for Spatially Distributed Data" Proceedings of the AAAI Conference on Artificial Intelligence, v.33, 2019 https://doi.org/10.1609/aaai.v33i01.33013967

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Our worldwide sensing capabilities, enabled by technologies ranging from remote sensing to crowdsourcing, are becoming more cost-effective and precise. These technological advances have led to the creation of new spatio-temporal data streams (such as frequent, high-resolution satellite images) that are rich in information pertinent to sustainable development objectives. However, deriving actionable insights is challenging due to the vast size and unstructured format of the data, which hinder conventional analysis methods. During the course of this project we developed a variety of machine learning approaches to automatically analyze spatio-temporal data streams (such as satellite images collected around the world), deriving important insights into key sustainability challenges such as poverty and climate.

As a first thrust, we developed a set of techniques to reduce the amount of training data needed to build machine learning models that use remote sensing data as an input. These techniques include unsupervised, semi-supervised, and self-supervised learning models that can take advantage of large amounts of unlabeled data. For example, we developed GeoSSL and SatMAE, the first foundation models specifically developed for remote sensing data. Both models are trained in a self-supervised way, without requiring any human feedback, to identify structure and features common in satellite images. These models can then be fine-tuned on a variety of downstream tasks (e.g., identifying objects in satellite images) using a small amount of training data, often achieving state-of-the-art results on relevant benchmark tasks.

As a second thrust, we developed a variety of techniques to enhance the scalability and reliability of probabilistic machine learning models. These include (1) approximate inference techniques that can scale to large numbers of variables in the models, (2) generative models where inference is tractable by design, (3) adaptive data acquisition approaches to reduce costs, and (4) uncertainty quantification techniques to assess confidence in probabilistic predictions obtained with machine learning models. 

We have applied these techniques and developed models that can predict a variety of important socio-economic indicators at high spatial and temporal resolution across large geographies. For example, we built (1) the first deep learning models that are able to estimate poverty directly from satellite images, achieving accuracies comparable to traditional survey-based measures, (2) deep learning models capable of predicting crop yields directly from space, which we applied both in the United States and internationally, (3) models that can track the quality of infrastructure across the world, (4) models that can predict key population health indicators, and (5) models that can identify brick kilns from space, tracking compliance with environmental regulation.

On the education side, undergraduate and graduate students were involved in all aspects of the research activities described above. Throughout the process, they received training in research and computational thinking. A new class focused on machine learning for sustainability applications was also developed at Stanford.

 


Last Modified: 04/11/2024
Modified by: Stefano Ermon

Please report errors in award information by writing to: awardsearch@nsf.gov.
