
NSF Org: DMS Division of Mathematical Sciences
Recipient: University of Massachusetts Amherst
Initial Amendment Date: August 10, 2021
Latest Amendment Date: June 27, 2022
Award Number: 2140982
Award Instrument: Continuing Grant
Program Manager: Yong Zeng, yzeng@nsf.gov, (703) 292-7299, DMS Division of Mathematical Sciences, MPS Directorate for Mathematical and Physical Sciences
Start Date: September 1, 2021
End Date: August 31, 2024 (Estimated)
Total Intended Award Amount: $103,360.00
Total Awarded Amount to Date: $103,360.00
Funds Obligated to Date: FY 2022 = $51,686.00
History of Investigator: Wei Zhu (Principal Investigator)
Recipient Sponsored Research Office: 101 Commonwealth Ave, Amherst, MA, US 01003-9252, (413) 545-0698
Primary Place of Performance: 710 N. Pleasant Street, Amherst, MA, US 01003-9305
NSF Program(s): CDS&E-MSS
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVITIES
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.049
ABSTRACT
In this era of Big Data, deep learning has become a burgeoning domain with immense potential to advance science, technology, and human life. Despite the tremendous practical success of deep neural networks (DNNs) in various data-intensive machine learning applications, many open problems remain: (1) DNNs tend to suffer from overfitting when the available training data are scarce, which renders them less effective in the small data regime. (2) DNNs have been shown to be capable of perfectly "memorizing" random training samples, making them less trustworthy when the training data are noisy and corrupted. (3) While symmetry is ubiquitous in machine learning (e.g., in image classification, the class label of an image remains the same if the image is spatially rescaled and translated), generic DNN architectures typically destroy such symmetry in the representation, which leads to significant redundancy in the model to "memorize" such information from the data. The goal of this project is to address these challenges in deep learning by exploiting the low-dimensional geometry and symmetry within the data and their network representations, aiming to develop new theories and methodologies for deep learning regularization that can lead to tangible advances in machine learning and artificial intelligence, especially in the small/corrupted data regime. In addition, the project provides research training opportunities for postdocs.
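As a hedged illustration of challenge (2), the short PyTorch sketch below (not part of the award; the model, data sizes, and hyperparameters are all invented for exposition) trains a small network on completely random labels and watches training accuracy climb toward 100%, the "memorization" phenomenon the abstract refers to.

```python
# Minimal sketch of the "memorization" phenomenon: a modestly sized
# network can fit purely random labels, so perfect training accuracy
# alone says nothing about generalization.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 32)              # random inputs with no structure
y = torch.randint(0, 10, (512,))      # labels drawn independently of X

model = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(2000):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

with torch.no_grad():
    acc = (model(X).argmax(dim=1) == y).float().mean().item()
print(f"training accuracy on random labels: {acc:.2f}")  # approaches 1.0
```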
The overarching theme of this project is to leverage recent progress in mathematical methods from differential geometry and applied harmonic analysis to improve the stability, reliability, data efficiency, and interpretability of deep learning. This involves developing both foundational theories and efficient algorithms to achieve three objectives: (1) developing manifold-based DNN regularizations with significantly improved generalization performance by focusing on the topology and geometry of both the input data and their representations, which will unlock the potential of deep learning in the small data regime; (2) establishing and analyzing an innovative framework for imposing geometric constraints in deep learning with immense potential to limit the memorizing capacity of DNNs, where mathematical analysis of the training dynamics of such models will shed light on the fundamental difference between "memorization" and generalization in deep learning; and (3) constructing deformation-robust, symmetry-preserving DNN architectures for various symmetry transformations on different data domains. By "hardwiring" the symmetry information into deformation-robust representations, the regularized DNN models will have improved performance and interpretability with reduced redundancy and model size. In terms of applications, the project will demonstrate and deploy the proposed theories in real-world machine learning tasks, such as object recognition, localization, and segmentation. The techniques developed in this project will be widely applicable across disciplines, providing fundamental building blocks for the next generation of mathematical tools for the computational modeling of Big Data.
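To make the "hardwiring" idea concrete, here is a minimal sketch (PyTorch assumed; the ShiftInvariant wrapper and toy base network are hypothetical, not the project's actual constructions) of one classical way to impose a symmetry architecturally: averaging a base model over all cyclic shifts of a 1-D signal yields an output that is exactly invariant to cyclic translation.

```python
# Group-averaging sketch: symmetrizing a base network over all cyclic
# shifts makes its output translation-invariant by construction.
import torch
import torch.nn as nn

class ShiftInvariant(nn.Module):
    def __init__(self, base: nn.Module, length: int):
        super().__init__()
        self.base = base
        self.length = length

    def forward(self, x):  # x: (batch, length)
        outs = [self.base(torch.roll(x, s, dims=1)) for s in range(self.length)]
        return torch.stack(outs, dim=0).mean(dim=0)  # average over the group

base = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))
model = ShiftInvariant(base, length=16)

x = torch.randn(4, 16)
print(torch.allclose(model(x), model(torch.roll(x, 3, dims=1)), atol=1e-5))  # True
```

Averaging over the full group is the textbook route to exact invariance but costs one forward pass per group element; equivariant architectures obtain the same symmetry guarantee more efficiently by constraining the layers themselves.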
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
During the funding period, the PI and collaborators completed 17 academic papers, 13 of which were published in leading machine learning conferences and applied mathematics journals, including one paper in ICLR, one in ICCV, two in NeurIPS, one in JMLR, and two in ICML; the remaining four papers are currently under review.
The project met its three primary Intellectual Merit objectives:
1. Data-Dependent Regularization for Deep Neural Networks (DNNs): The team developed innovative regularization techniques grounded in data geometry. These methods significantly enhanced the generalization performance of DNN models, particularly in scenarios where training data were scarce.
2. Geometry-Based Regularization for DNNs with Corrupted Data: The team refined a regularization approach that mitigates the tendency of DNNs to overfit or "memorize" noisy or corrupted training data, enhancing the robustness of DNNs in practical applications where data quality may be compromised (a simplified sketch of this idea appears after this list).
3. Deformation-Robust, Symmetry-Preserving DNN Framework: The team established a general framework for designing DNNs that are robust to data deformations while preserving inherent symmetries. The performance improvements of these models were rigorously measured, and the framework was extended beyond predictive models to encompass structure-preserving generative models, expanding its range of applications.
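The sketch below is a hypothetical, simplified illustration of the geometry-flavored regularization idea behind outcomes 1 and 2 (PyTorch assumed; smoothness_penalty, the toy networks, and all hyperparameters are invented for exposition and are not the project's published method): penalizing how far small input perturbations move the learned representation discourages the jagged fits associated with memorizing noisy labels.

```python
# Hypothetical geometry-flavored regularizer: keep nearby inputs nearby
# in representation space while fitting the (possibly noisy) labels.
import torch
import torch.nn as nn

def smoothness_penalty(encoder: nn.Module, x: torch.Tensor, eps: float = 0.1):
    """Mean representation distortion under small random input perturbations."""
    noise = eps * torch.randn_like(x)
    return ((encoder(x + noise) - encoder(x)) ** 2).mean()

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
head = nn.Linear(16, 10)
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(128, 32), torch.randint(0, 10, (128,))
for _ in range(100):
    opt.zero_grad()
    # task loss + geometric regularizer (weight 0.5 chosen arbitrarily)
    loss = loss_fn(head(encoder(x)), y) + 0.5 * smoothness_penalty(encoder, x)
    loss.backward()
    opt.step()
```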
In terms of Broader Impact, the project provided significant educational and research opportunities. Several undergraduate students participated through summer REU programs, and graduate students and postdoctoral researchers also benefited from the project’s support. Additionally, the PI co-developed and taught a year-long graduate course, “Mathematical Foundations of Machine Learning”, at UMass Amherst. This course was designed for graduate students and advanced undergraduates interested in the theoretical foundations of machine learning. The PI also served as a key instructor for “Introduction to Foundations of Data Science”, a two-week summer course offered to high school students at UMass Amherst. This course introduced students to data science and encouraged them to pursue STEM degrees in college. The research outcomes from the project were seamlessly incorporated into these educational programs, providing students with hands-on experience in cutting-edge machine learning techniques.
All software and datasets generated from the project have been made publicly available, ensuring that the broader research community can benefit from the tools and insights developed.
Last Modified: 12/05/2024
Modified by: Wei Zhu