
NSF Org: DMS Division of Mathematical Sciences
Recipient: University of Massachusetts Amherst
Initial Amendment Date: August 10, 2021
Latest Amendment Date: June 27, 2022
Award Number: 2140982
Award Instrument: Continuing Grant
Program Manager: Yong Zeng, yzeng@nsf.gov, (703) 292-7299, DMS Division of Mathematical Sciences, MPS Directorate for Mathematical and Physical Sciences
Start Date: September 1, 2021
End Date: August 31, 2024 (Estimated)
Total Intended Award Amount: $103,360.00
Total Awarded Amount to Date: $103,360.00
Funds Obligated to Date: FY 2022 = $51,686.00
History of Investigator: Wei Zhu (Principal Investigator)
Recipient Sponsored Research Office: 101 Commonwealth Ave, Amherst, MA, US 01003-9252, (413) 545-0698
Primary Place of Performance: 710 N. Pleasant Street, Amherst, MA, US 01003-9305
NSF Program(s): CDS&E-MSS
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVITIES
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.049
ABSTRACT
In this era of Big Data, deep learning has become a burgeoning domain with immense potential to advance science, technology, and human life. Despite the tremendous practical success of deep neural networks (DNNs) in various data-intensive machine learning applications, many open problems remain: (1) DNNs tend to suffer from overfitting when the available training data are scarce, which renders them less effective in the small data regime. (2) DNNs have been shown to be capable of perfectly "memorizing" random training samples, making them less trustworthy when the training data are noisy and corrupted. (3) While symmetry is ubiquitous in machine learning (e.g., in image classification, the class label of an image remains the same if the image is spatially rescaled and translated), generic DNN architectures typically destroy such symmetry in the representation, which leads to significant redundancy in the model to "memorize" such information from the data. The goal of this project is to address these challenges in deep learning by exploiting the low-dimensional geometry and symmetry within the data and their network representations, aiming to develop new theories and methodologies for deep learning regularization that can lead to tangible advances in machine learning and artificial intelligence, especially in the small/corrupted data regime. In addition, the project provides research training opportunities for postdocs.
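As a hedged illustration of challenge (2), the short PyTorch sketch below (not part of the award; the model, data sizes, and hyperparameters are all invented for exposition) trains a small network on completely random labels and watches training accuracy climb toward 100%, the "memorization" phenomenon the abstract refers to.

```python
# Minimal sketch of the "memorization" phenomenon: a modestly sized
# network can fit purely random labels, so perfect training accuracy
# alone says nothing about generalization.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 32)              # random inputs with no structure
y = torch.randint(0, 10, (512,))      # labels drawn independently of X

model = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(2000):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

with torch.no_grad():
    acc = (model(X).argmax(dim=1) == y).float().mean().item()
print(f"training accuracy on random labels: {acc:.2f}")  # approaches 1.0
```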
The overarching theme of this project is to leverage recent progress in mathematical methods from differential geometry and applied harmonic analysis to improve the stability, reliability, data efficiency, and interpretability of deep learning. This involves developing both foundational theories and efficient algorithms to achieve three objectives: (1) developing manifold-based DNN regularizations with significantly improved generalization performance by focusing on the topology and geometry of both the input data and their representations, which will unlock the potential of deep learning in the small data regime; (2) establishing and analyzing an innovative framework for imposing geometric constraints in deep learning with immense potential to limit the memorizing capacity of DNNs, where mathematical analysis of the training dynamics of such models will shed light on the fundamental difference between "memorization" and generalization in deep learning; and (3) constructing deformation-robust, symmetry-preserving DNN architectures for various symmetry transformations on different data domains. By "hardwiring" the symmetry information into deformation-robust representations, the regularized DNN models will have improved performance and interpretability with reduced redundancy and model size. In terms of applications, the project will demonstrate and deploy the proposed theories in real-world machine learning tasks, such as object recognition, localization, and segmentation. The techniques developed in this project will be widely applicable across disciplines, providing fundamental building blocks for the next generation of mathematical tools for the computational modeling of Big Data.
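To make the "hardwiring" idea concrete, here is a minimal sketch (PyTorch assumed; the ShiftInvariant wrapper and toy base network are hypothetical, not the project's actual constructions) of one classical way to impose a symmetry architecturally: averaging a base model over all cyclic shifts of a 1-D signal yields an output that is exactly invariant to cyclic translation.

```python
# Group-averaging sketch: symmetrizing a base network over all cyclic
# shifts makes its output translation-invariant by construction.
import torch
import torch.nn as nn

class ShiftInvariant(nn.Module):
    def __init__(self, base: nn.Module, length: int):
        super().__init__()
        self.base = base
        self.length = length

    def forward(self, x):  # x: (batch, length)
        outs = [self.base(torch.roll(x, s, dims=1)) for s in range(self.length)]
        return torch.stack(outs, dim=0).mean(dim=0)  # average over the group

base = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))
model = ShiftInvariant(base, length=16)

x = torch.randn(4, 16)
print(torch.allclose(model(x), model(torch.roll(x, 3, dims=1)), atol=1e-5))  # True
```

Averaging over the full group is the textbook route to exact invariance but costs one forward pass per group element; equivariant architectures obtain the same symmetry guarantee more efficiently by constraining the layers themselves.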
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
During the funding period, the PI and collaborators completed 17 academic papers, 13 of which were published in leading machine learning conferences and applied mathematics journals, including one paper in ICLR, one in ICCV, two in NeurIPS, one in JMLR, and two in ICML; the remaining four papers are currently under review.
The project met its three primary Intellectual Merit objectives:
1. Data-Dependent Regularization for Deep Neural Networks (DNNs): The team developed innovative regularization techniques grounded in data geometry. These methods significantly enhanced the generalization performance of DNN models, particularly in scenarios where training data were scarce.
2. Geometry-Based Regularization for DNNs with Corrupted Data: The team refined a regularization approach that mitigates the tendency of DNNs to overfit or "memorize" noisy or corrupted training data, enhancing the robustness of DNNs in practical applications where data quality may be compromised (a simplified sketch of this idea appears after this list).
3. Deformation-Robust, Symmetry-Preserving DNN Framework: The team established a general framework for designing DNNs that are robust to data deformations while preserving inherent symmetries. The performance improvements of these models were rigorously measured, and the framework was extended beyond predictive models to encompass structure-preserving generative models, expanding its range of applications.
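The sketch below is a hypothetical, simplified illustration of the geometry-flavored regularization idea behind outcomes 1 and 2 (PyTorch assumed; smoothness_penalty, the toy networks, and all hyperparameters are invented for exposition and are not the project's published method): penalizing how far small input perturbations move the learned representation discourages the jagged fits associated with memorizing noisy labels.

```python
# Hypothetical geometry-flavored regularizer: keep nearby inputs nearby
# in representation space while fitting the (possibly noisy) labels.
import torch
import torch.nn as nn

def smoothness_penalty(encoder: nn.Module, x: torch.Tensor, eps: float = 0.1):
    """Mean representation distortion under small random input perturbations."""
    noise = eps * torch.randn_like(x)
    return ((encoder(x + noise) - encoder(x)) ** 2).mean()

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
head = nn.Linear(16, 10)
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(128, 32), torch.randint(0, 10, (128,))
for _ in range(100):
    opt.zero_grad()
    # task loss + geometric regularizer (weight 0.5 chosen arbitrarily)
    loss = loss_fn(head(encoder(x)), y) + 0.5 * smoothness_penalty(encoder, x)
    loss.backward()
    opt.step()
```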
In terms of Broader Impact, the project provided significant educational and research opportunities. Several undergraduate students participated through summer REU programs, and graduate students and postdoctoral researchers also benefited from the project’s support. Additionally, the PI co-developed and taught a year-long graduate course, “Mathematical Foundations of Machine Learning”, at UMass Amherst. This course was designed for graduate students and advanced undergraduates interested in the theoretical foundations of machine learning. The PI also served as a key instructor for “Introduction to Foundations of Data Science”, a two-week summer course offered to high school students at UMass Amherst. This course introduced students to data science and encouraged them to pursue STEM degrees in college. The research outcomes from the project were seamlessly incorporated into these educational programs, providing students with hands-on experience in cutting-edge machine learning techniques.
All software and datasets generated from the project have been made publicly available, ensuring that the broader research community can benefit from the tools and insights developed.
Last Modified: 12/05/2024
Modified by: Wei Zhu