Award Abstract # 1821144
CDS&E: Efficient and Robust Recurrent Neural Networks

NSF Org: DMS
Division Of Mathematical Sciences
Recipient: UNIVERSITY OF KENTUCKY RESEARCH FOUNDATION, THE
Initial Amendment Date: June 14, 2018
Latest Amendment Date: June 14, 2018
Award Number: 1821144
Award Instrument: Standard Grant
Program Manager: Christopher Stark
DMS
 Division Of Mathematical Sciences
MPS
 Directorate for Mathematical and Physical Sciences
Start Date: September 1, 2018
End Date: August 31, 2022 (Estimated)
Total Intended Award Amount: $200,000.00
Total Awarded Amount to Date: $200,000.00
Funds Obligated to Date: FY 2018 = $200,000.00
History of Investigator:
  • Qiang Ye (Principal Investigator)
    qye3@uky.edu
Recipient Sponsored Research Office: University of Kentucky Research Foundation
500 S LIMESTONE
LEXINGTON
KY  US  40526-0001
(859)257-9420
Sponsor Congressional District: 06
Primary Place of Performance: University of Kentucky Research Foundation
500 S Limestone 109 Kinkead Hall
Lexington
KY  US  40526-0001
Primary Place of Performance Congressional District: 06
Unique Entity Identifier (UEI): H1HYA8Z1NTM5
Parent UEI:
NSF Program(s): CDS&E-MSS
Primary Program Source: 01001819DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 8083, 9150, 9263
Program Element Code(s): 806900
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.049

ABSTRACT

Deep neural networks have emerged over the last decade as one of the most powerful machine learning methods. Recurrent neural networks (RNNs) are neural networks designed to efficiently model sequential data, such as speech and text, by exploiting temporal connections within a sequence and handling varying sequence lengths in a dataset. While RNNs and their variants have found success in many real-world applications, various issues make them difficult to use in practice. This project will systematically address some of these difficulties and develop an efficient and robust RNN. Computer code developed in this project will be made freely available. The research results will have applications in a variety of areas involving sequential data learning, including computer vision, speech recognition, natural language processing, financial data analysis, and bioinformatics.

As in other neural networks, training of RNNs typically involves some variant of gradient descent optimization, which is prone to the so-called vanishing and exploding gradient problems. Regularization of RNNs, which refers to techniques used to prevent the model from overfitting the training data and hence generalizing poorly to new data, is also challenging. The currently preferred RNN architectures, such as Long Short-Term Memory (LSTM) networks, have highly complex structures with numerous additional interacting elements that are not easy to understand. This project develops an RNN that extends recent orthogonal/unitary RNNs to model long- and short-term dependencies of sequential data more effectively. Through an indirect parametrization of the recurrent matrix, dropout regularization techniques will be developed. The network developed in this project will retain the simplicity and efficiency of basic RNNs while enhancing some key capabilities for robust applications. In particular, the project will include a study of applications of RNNs to some bioinformatics problems.
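The vanishing/exploding gradient problem the project targets can be illustrated numerically. The following is a minimal sketch (not code from this project): backpropagating through T steps of a vanilla RNN multiplies the gradient by roughly the recurrent matrix W at each step, so its norm scales like ||W||^T. All sizes and seeds below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 32, 100  # hidden size and sequence length (illustrative values)

# An exactly orthogonal matrix Q (all singular values equal to 1); scaling it
# makes every singular value equal to the scale factor.
Q = np.linalg.qr(rng.standard_normal((n, n)))[0]
g0 = rng.standard_normal(n)

for scale in (0.9, 1.0, 1.1):
    g = g0.copy()
    for _ in range(T):
        g = (scale * Q) @ g  # one step of backpropagation through time
    print(f"scale={scale}: gradient norm after {T} steps = {np.linalg.norm(g):.3e}")

# scale < 1 drives the norm toward zero (vanishing), scale > 1 blows it up
# (exploding); scale = 1, i.e. an orthogonal recurrent matrix, preserves the
# norm exactly -- the motivation for the orthogonal/unitary RNNs extended here.
```

With scale 0.9 the norm shrinks by a factor of 0.9^100 ≈ 3e-5, while with 1.1 it grows by about 1.4e4, showing how quickly even mild deviations from orthogonality compound over a long sequence.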

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


(Showing: 1 - 10 of 12)
Alkilayh, Maged and Reichel, Lothar and Ye, Qiang. "A method for computing a few eigenpairs of large generalized eigenvalue problems." Applied Numerical Mathematics, v.183, 2023. https://doi.org/10.1016/j.apnum.2022.08.018
Cai, D. and Ji, Y. and He, H. and Ye, Q. "AUTM Flow: Atomic Unrestricted Time Machine for Monotonic Normalizing Flows." Uncertainty in Artificial Intelligence, 2022.
Guo, Pei-Chang and Ye, Qiang. "On the regularization of convolutional kernel tensors in neural networks." Linear and Multilinear Algebra, 2020. https://doi.org/10.1080/03081087.2020.1795058
Helfrich, Kyle and Ye, Qiang. "Eigenvalue Normalized Recurrent Neural Networks for Short Term Memory." Proceedings of the AAAI Conference on Artificial Intelligence, v.34, 2020. https://doi.org/10.1609/aaai.v34i04.5831
Kosta, Sarah and Colli, Dylan and Ye, Qiang and Campbell, Kenneth S. "FiberSim: A flexible open-source model of myofilament-level contraction." Biophysical Journal, v.121, 2022. https://doi.org/10.1016/j.bpj.2021.12.021
Lange, S. and Helfrich, K. and Ye, Q. "Batch Normalization Preconditioning for Neural Network Training." Journal of Machine Learning Research, v.23, 2022.
Maduranga, Kehelwala D. and Helfrich, Kyle E. and Ye, Qiang. "Complex Unitary Recurrent Neural Networks Using Scaled Cayley Transform." Proceedings of the AAAI Conference on Artificial Intelligence, v.33, 2019. https://doi.org/10.1609/aaai.v33i01.33014528
Willmott, Devin and Murrugarra, David and Ye, Qiang. "Improving RNA secondary structure prediction via state inference with deep recurrent neural networks." Computational and Mathematical Biophysics, v.8, 2020. https://doi.org/10.1515/cmb-2020-0002
Yao, Xinghua and Li, Xiaojin and Ye, Qiang and Huang, Yan and Cheng, Qiang and Zhang, Guo-Qiang. "A robust deep learning approach for automatic classification of seizures against non-seizures." Biomedical Signal Processing and Control, v.64, 2021. https://doi.org/10.1016/j.bspc.2020.102215
Ye, Qiang. "Preconditioning for accurate solutions of ill-conditioned linear systems." Numerical Linear Algebra with Applications, v.27, 2020. https://doi.org/10.1002/nla.2315
Zadorozhnyy, Vasily and Cheng, Qiang and Ye, Qiang. "Adaptive Weighted Discriminator for Training Generative Adversarial Networks." IEEE Conference on Computer Vision and Pattern Recognition, 2021.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Intellectual Merit:

This project systematically studies several challenges arising in the training of recurrent neural networks (RNNs), a fundamental model for sequential data such as speech and text. Several new RNN models have been developed to address the vanishing and exploding gradient problems in RNN training and to effectively carry short-term and long-term memory in the networks. They include a scaled Cayley unitary recurrent neural network (scuRNN), an eigenvalue normalized recurrent neural network (ENRNN), and a gated recurrent unit made orthogonal through the Neumann-Cayley transform (NC-GRU).
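The scaled Cayley parametrization underlying these models can be sketched in its real-valued form (the unitary scuRNN uses the complex analogue). This is a minimal illustration, not the project's released code; the matrix size and seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8

# Trainable parameter: a skew-symmetric matrix A (A^T = -A), built here from
# a random matrix for illustration.
M = rng.standard_normal((n, n))
A = M - M.T

# Fixed diagonal D with +/-1 entries. The scaling lets the parametrization
# reach orthogonal matrices that have -1 as an eigenvalue, which the plain
# Cayley transform (I + A)^{-1}(I - A) cannot represent.
D = np.diag(np.where(rng.random(n) < 0.5, -1.0, 1.0))

I = np.eye(n)
W = np.linalg.solve(I + A, (I - A) @ D)  # W = (I + A)^{-1} (I - A) D

# W is exactly orthogonal for ANY skew-symmetric A, so gradient descent on
# the unconstrained parameter A keeps the recurrent matrix orthogonal and
# the recurrent map norm-preserving.
print(np.allclose(W.T @ W, I))
```

The indirect parametrization is the point: optimization runs over the unconstrained skew-symmetric A, while the recurrent matrix W stays on the orthogonal group by construction.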

The project has also investigated new optimization techniques and new model architectures to accelerate and stabilize the training of neural networks. A new class of preconditioning methods, called batch normalization preconditioning, has been developed for fully connected and convolutional neural networks and can significantly accelerate training. A conjugate gradient momentum method has been proposed to address the difficulty of selecting momentum parameters in momentum methods. An adaptive weighted discriminator loss function has been developed to stabilize the training of generative adversarial networks (GANs). A new integral-based approach has been introduced as a more versatile and efficient way to construct normalizing flows. A convolutional neural network with suitable structures has been derived to produce symmetric feature maps that can better model tasks with such structure.
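The preconditioning effect that normalization provides can be seen in a toy setting. The sketch below is not the batch normalization preconditioning method itself, only an illustration of the principle it builds on: for a linear least-squares layer, the Hessian is the Gram matrix X^T X / N, and normalizing each feature to zero mean and unit variance sharply reduces its condition number, which governs the convergence rate of gradient descent. All sizes, scales, and seeds are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N, d = 1000, 5

# Features with very different scales plus a common offset: the Hessian of a
# linear least-squares fit on X is then badly conditioned.
scales = np.array([1.0, 10.0, 100.0, 0.1, 50.0])
X = rng.standard_normal((N, d)) * scales + 5.0

def hessian_cond(X):
    """Condition number of the least-squares Hessian X^T X / N."""
    return np.linalg.cond(X.T @ X / len(X))

# Batch-normalization-style transform: per-feature zero mean and unit
# variance, computed from batch statistics.
Xn = (X - X.mean(axis=0)) / X.std(axis=0)

print(f"condition number before: {hessian_cond(X):.1e}, after: {hessian_cond(Xn):.1e}")
```

After normalization the Gram matrix is close to the identity, so its condition number drops from roughly 10^6 to near 1 in this example; the project's method achieves an analogous effect as a preconditioner inside deep networks.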

Broader Impact:

The project has implemented some of the new RNN models for three applications: state inference of RNA secondary structure, automatic detection of epileptic seizures through analysis of electroencephalography signals, and molecular property prediction in drug design. It has further developed a GAN model with an adaptive weighted discriminator and a consistency loss function for speech enhancement. New state-of-the-art results have been obtained on several datasets in the tasks of molecular property prediction and speech enhancement.

The project has resulted in seven computer codes that are freely distributed on the open-source platform GitHub. It has also produced three Ph.D. dissertations directed by the PI.


Last Modified: 12/22/2022
Modified by: Qiang Ye


