Award Abstract # 1849107
S&AS: FND: Context-Aware Active Data Gathering for Complex Outdoor Environments

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: REGENTS OF THE UNIVERSITY OF MINNESOTA
Initial Amendment Date: January 29, 2019
Latest Amendment Date: January 29, 2019
Award Number: 1849107
Award Instrument: Standard Grant
Program Manager: James Donlon
jdonlon@nsf.gov
 (703)292-8074
IIS
 Division of Information & Intelligent Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: February 1, 2019
End Date: January 31, 2024 (Estimated)
Total Intended Award Amount: $599,962.00
Total Awarded Amount to Date: $599,962.00
Funds Obligated to Date: FY 2019 = $599,962.00
History of Investigator:
  • Qi Zhao (Principal Investigator)
    qzhao@umn.edu
  • Ibrahim Isler (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Minnesota-Twin Cities
2221 UNIVERSITY AVE SE STE 100
MINNEAPOLIS
MN  US  55414-3074
(612)624-5599
Sponsor Congressional District: 05
Primary Place of Performance: University of Minnesota
200 Union Street SE
Minneapolis
MN  US  55455-0167
Primary Place of Performance Congressional District: 05
Unique Entity Identifier (UEI): KABJZBBJ4B54
Parent UEI:
NSF Program(s): S&AS - Smart & Autonomous Syst
Primary Program Source: 01001920DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 046Z
Program Element Code(s): 039Y00
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Traditional agents are programmed to acquire information by recognizing and attending to predetermined areas and targets in a given environment. Recent advances in deep learning models and miniature hardware platforms are giving artificial agents unprecedented capabilities for processing and interpreting visual data. These advances create an exciting opportunity to build intelligent machines that operate with greater autonomy and adaptability. Toward this goal, this project investigates new methods that enable multiple unmanned aerial systems to understand and explore complex outdoor environments by actively seeking, acquiring, integrating, and processing visual information across space and time. The resulting framework, with enhanced adaptability, self-awareness, and generalizability, will be applicable to autonomous systems in broad application areas such as environmental monitoring, search and rescue, self-driving cars, smart health, and manufacturing. Throughout the project, the principal investigators will make project results, including the created datasets, trained models, code, and papers, publicly available. The new integrative research combining vision, planning, and actuation will be incorporated into teaching materials, undergraduate research projects including those for underrepresented students, and K-12 outreach activities.

The project seeks to develop algorithms for context-aware active sensing that also incorporate energy constraints. This will be achieved, first, by proposing new deep learning models for holistic attention prediction across multiple aerial views; these models will leverage external knowledge to enable inference and generalization in unseen contexts. Second, by developing new view and path planning methods that are efficient and aware of the systems' energy, mobility, and sensing constraints. Third, by contributing novel online learning methods that adapt based on uncertainty, giving the system adaptiveness and awareness in changing environments. Experiments to validate the findings will take place both indoors, in the newly renovated Shepherd UAV Lab at the University of Minnesota, and in the field at the Cedar Creek Ecosystem Science Reserve. The results of this project have the potential to inspire further research into intelligent and integrative perceptual, planning, and actuation systems.
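
As a concrete illustration of the first thrust, the toy sketch below fuses per-view saliency maps from several UAVs into one attention map by confidence weighting. It is a minimal sketch only: the function names, the weighting rule, and the assumption that all views are registered to a common ground frame are our illustrative assumptions, not the project's model.

    # Minimal sketch (assumed, not the project's model): fuse per-view
    # saliency maps from multiple UAVs into one holistic attention map.
    import numpy as np

    def fuse_attention(view_maps, confidences):
        """view_maps: H x W saliency maps already registered to a common
        ground frame; confidences: per-view weights, e.g. from pose
        uncertainty (both are illustrative assumptions)."""
        w = np.asarray(confidences, dtype=float)
        w /= w.sum()                      # normalize the view weights
        fused = sum(wi * m for wi, m in zip(w, view_maps))
        return fused / fused.sum()        # renormalize to a probability map

    # Example: three simulated aerial views of a 64 x 64 ground grid.
    maps = [np.random.rand(64, 64) for _ in range(3)]
    fused = fuse_attention(maps, confidences=[0.5, 0.3, 0.2])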

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Chen, Xianyu and Jiang, Ming and Zhao, Qi. "Self-Distillation for Few-Shot Image Captioning." IEEE Winter Conference on Applications of Computer Vision (WACV), 2021.
Chen, Shi and Jiang, Ming and Yang, Jinhui and Zhao, Qi. "AiR: Attention with Reasoning Capability." European Conference on Computer Vision (ECCV), 2020. https://doi.org/10.1007/978-3-030-58452-8_6
Chen, Shi and Jiang, Ming and Yang, Jinhui and Zhao, Qi. "Attention in Reasoning: Dataset, Analysis, and Modeling." IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021. https://doi.org/10.1109/TPAMI.2021.3114582
Chen, Shi and Zhao, Qi. "Attention-Based Autism Spectrum Disorder Screening With Privileged Modality." IEEE/CVF International Conference on Computer Vision (ICCV), 2019. https://doi.org/10.1109/ICCV.2019.00127
Chen, Shi and Zhao, Qi. "Attention to Action: Leveraging Attention for Object Navigation." British Machine Vision Conference (BMVC), 2021.
Chen, Shi and Zhao, Qi. "REX: Reasoning-aware and Grounded Explanation." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
Chen, Xianyu and Jiang, Ming and Zhao, Qi. "Predicting Human Scanpaths in Visual Question Answering." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021. https://doi.org/10.1109/CVPR46437.2021.01073
Jiang, Ming and Chen, Shi and Yang, Jinhui and Zhao, Qi. "Fantastic Answers and Where to Find Them: Immersive Question-Directed Visual Attention." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020. https://doi.org/10.1109/CVPR42600.2020.00305

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The project focuses on advancing UAV capabilities to intelligently explore complex environments using recent advances in deep learning and miniature hardware. We aim to create a dynamic multi-UAV system that enhances self-awareness and generalizability by actively integrating and processing visual data across space and time. The system optimizes view and path planning for energy efficiency, with applications spanning environmental monitoring, search and rescue, autonomous vehicles, smart health, and manufacturing.

To develop advanced attention models capable of comprehensively integrating video data from multiple UAVs, we curated extensive attention and reasoning datasets. Our novel attention model considered both correct and incorrect attention patterns to refine how machine attention is learned. Additionally, we conducted the first research on predicting scanpaths during diverse tasks, capturing the spatio-temporal dynamics of human attention more accurately than existing methods. We introduced novel datasets and models to facilitate the integration of multi-modal inputs, enhancing decision-making transparency and performance across varied applications. Furthermore, we developed a trustworthiness predictor that improves prediction accuracy by distinguishing the distributions of positive and negative examples, thereby supporting decision-making under complex real-world conditions.
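
To make the trustworthiness idea concrete, here is a hedged sketch: a second, lightweight model is fit to separate confidence features of correct (positive) from incorrect (negative) base-model predictions. The feature set and choice of classifier are our assumptions for illustration, not the project's implementation.

    # Assumed sketch of a trustworthiness predictor: learn to separate the
    # distributions of positive (correct) and negative (incorrect)
    # base-model predictions using simple confidence features.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def trust_features(probs):
        """Per-example features of a softmax output: top probability,
        top-2 margin, and entropy (an illustrative feature set)."""
        top2 = np.sort(probs, axis=1)[:, -2:]
        entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
        return np.stack([top2[:, 1], top2[:, 1] - top2[:, 0], entropy], axis=1)

    # Stand-in data: softmax outputs plus 0/1 labels marking whether the
    # base model's argmax prediction was correct on held-out examples.
    probs = np.random.dirichlet(np.ones(10), size=500)
    correct = (probs.argmax(1) == np.random.randint(0, 10, size=500)).astype(int)

    trust_model = LogisticRegression().fit(trust_features(probs), correct)
    trust_scores = trust_model.predict_proba(trust_features(probs))[:, 1]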

To enhance the adaptability and performance of attention models in dynamic environments, we integrated external knowledge directly into our models. We introduced the first explicit and explainable visual reasoning method that integrated external knowledge through dynamic scene graphs and functional programs, augmented by a novel reinforcement learning method. We also defined new attention mechanisms tailored for embodied settings and developed methods that tightly integrated perception with action for embodied navigation. Experimental results demonstrated substantial improvements in navigation across diverse, previously unseen environments. 
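
To illustrate the scene-graph-plus-functional-program idea, the sketch below executes a tiny program step by step over a toy scene graph. The graph schema and the three-operation instruction set are illustrative assumptions only, not the project's reasoning method.

    # Assumed sketch of explicit reasoning: run a small functional program
    # over a toy scene graph (objects, attributes, relations).
    scene = {
        "objects": [
            {"id": 0, "name": "person", "attrs": ["standing"]},
            {"id": 1, "name": "drone", "attrs": ["flying", "red"]},
        ],
        "relations": [(0, "watches", 1)],
    }

    def run_program(program, scene):
        """Each step is (op, arg); the working set of objects flows
        between steps."""
        state = scene["objects"]
        for op, arg in program:
            if op == "filter_name":
                state = [o for o in state if o["name"] == arg]
            elif op == "relate":  # follow a relation from current objects
                ids = {o["id"] for o in state}
                targets = {t for s, r, t in scene["relations"]
                           if r == arg and s in ids}
                state = [o for o in scene["objects"] if o["id"] in targets]
            elif op == "query_attr":
                return [a for o in state for a in o["attrs"]]
        return state

    # "What is the person watching, and what are its attributes?"
    print(run_program([("filter_name", "person"), ("relate", "watches"),
                       ("query_attr", None)], scene))  # ['flying', 'red']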

We have developed view and path planning algorithms tailored for UAVs to improve the efficiency and quality of image capture. Our research focused on optimizing view planning in complex scenarios such as disaster response and urban environments, ensuring high-quality 3D geometry and texture representation while minimizing path-length deviations. By converting quality requirements into mathematical constraints, we introduced viewing planes that adapt naturally to scene geometries, significantly reducing reconstruction errors compared to traditional methods. To address scalability, we transformed the coverage problem into a cone-based variant of the Traveling Salesman Problem with Neighborhoods (cone-TSPN), achieving polynomial runtime and constant memory usage for large-scale city coverage, a significant improvement over previous techniques. Additionally, we developed algorithms that compute visual coverage trajectories optimizing total length while maintaining specified detection probabilities. Finally, we introduced a novel route-finding approach that integrates perception and travel costs, employing an entropy-based viewing score to generate diameter-bounded viewing neighborhoods.
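
As a simplified stand-in for the perception-aware route-finding idea, the sketch below greedily picks viewpoints by an entropy-based viewing score minus a travel-cost penalty. The scoring rule, the greedy strategy, and the weight alpha are our illustrations under stated assumptions, not the published algorithm.

    # Assumed sketch: trade off perception value (entropy of the predicted
    # class distribution at a viewpoint) against travel cost.
    import math

    def view_entropy(class_probs):
        """Higher entropy = more uncertainty = more worth viewing."""
        return -sum(p * math.log(p + 1e-12) for p in class_probs)

    def greedy_route(start, candidates, alpha=1.0):
        """Repeatedly pick the viewpoint with the best score.
        candidates: list of ((x, y), class_probs)."""
        pos, route = start, []
        remaining = list(candidates)
        while remaining:
            def score(c):
                (x, y), probs = c
                travel = math.dist(pos, (x, y))  # Euclidean travel cost
                return view_entropy(probs) - alpha * travel
            best = max(remaining, key=score)
            remaining.remove(best)
            route.append(best[0])
            pos = best[0]
        return route

    cands = [((0, 5), [0.5, 0.5]), ((1, 1), [0.9, 0.1]), ((4, 0), [0.6, 0.4])]
    print(greedy_route((0, 0), cands))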

The project also developed methods to enhance online learning in dynamic environments. We measured the agreement between new and prior knowledge and implemented Direction Concentration Learning (DCL) to optimize updates in deep neural networks. Additionally, our gradient-based few-shot learning methods enabled efficient knowledge transfer from large-scale datasets to domains with limited samples. We proposed a learning approach that leverages privileged information from multiple modalities during training, improving test-time performance without requiring one-to-one correspondences between modalities. Finally, our Gradient Adjustment Learning technique optimized gradients by incorporating knowledge from previous iterations, demonstrating significant improvements across diverse applications.
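
The sketch below conveys the flavor of gradient-agreement methods such as DCL under our own simplifying assumptions: it measures the cosine agreement between the fresh gradient and a running update direction and damps conflicting steps. The gating rule and constants are illustrative, not the published DCL update.

    # Assumed sketch of gradient-agreement gating (in the spirit of DCL):
    # damp parameter updates whose gradient conflicts with the running
    # direction of previous updates.
    import torch

    def dcl_step(param, grad, momentum_dir, lr=0.01, beta=0.9):
        """Scale the step by cosine agreement between the fresh gradient
        and the accumulated update direction."""
        cos = torch.nn.functional.cosine_similarity(
            grad.flatten(), momentum_dir.flatten(), dim=0)
        gate = torch.clamp((cos + 1) / 2, min=0.1)  # agreement gate in [0.1, 1]
        param = param - lr * gate * grad
        momentum_dir = beta * momentum_dir + (1 - beta) * grad
        return param, momentum_dir

    w = torch.randn(10)
    direction = torch.zeros(10)
    for _ in range(5):
        g = torch.randn(10)  # stand-in for a minibatch gradient
        w, direction = dcl_step(w, g, direction)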

The project has advanced artificial intelligence and robotics by enhancing autonomous systems' ability to navigate and adapt in complex environments. It has also established frameworks for transparent decision-making and improved generalizability across diverse applications, focusing on attention, generalization, and adaptive action planning. Public release of datasets, analyses, and code fostered transparency and community engagement, reflected in 26 publications at conferences such as CVPR, ECCV, ICCV, ICRA, and NeurIPS. The project involved seven PhD students in computer science and robotics, developing their expertise and skills in these fields.

Last Modified: 06/28/2024
Modified by: Qi Zhao
