
NSF Org: DMS (Division of Mathematical Sciences)
Recipient: University of Kentucky Research Foundation
Initial Amendment Date: June 14, 2018
Latest Amendment Date: June 14, 2018
Award Number: 1821144
Award Instrument: Standard Grant
Program Manager: Christopher Stark, DMS Division of Mathematical Sciences, MPS Directorate for Mathematical and Physical Sciences
Start Date: September 1, 2018
End Date: August 31, 2022 (Estimated)
Total Intended Award Amount: $200,000.00
Total Awarded Amount to Date: $200,000.00
Recipient Sponsored Research Office: 500 S Limestone, Lexington, KY 40526-0001, US, (859) 257-9420
Primary Place of Performance: 109 Kinkead Hall, 500 S Limestone, Lexington, KY 40526-0001, US
NSF Program(s): CDS&E-MSS
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.049
ABSTRACT
Deep neural networks have emerged over the last decade as one of the most powerful machine learning methods. Recurrent neural networks (RNNs) are neural networks designed to efficiently model sequential data, such as speech and text, by exploiting temporal connections within a sequence and handling varying sequence lengths in a dataset. While RNNs and their variants have found success in many real-world applications, several issues make them difficult to use in practice. This project will systematically address some of these difficulties and develop an efficient and robust RNN. Computer code developed in this project will be made freely available. The research results will have applications in a variety of areas involving sequential data, including computer vision, speech recognition, natural language processing, financial data analysis, and bioinformatics.
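To make the model concrete: a basic RNN applies the same weights at every time step, updating a hidden state from the previous hidden state and the current input, which is why a single set of parameters can process sequences of any length. A minimal sketch in Python/numpy (all sizes and weight values below are illustrative, not taken from the project):

import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8                                   # illustrative input/hidden sizes
U = rng.normal(scale=0.3, size=(n_hid, n_in))        # input-to-hidden weights
W = rng.normal(scale=0.3, size=(n_hid, n_hid))       # recurrent (hidden-to-hidden) weights
b = np.zeros(n_hid)

def rnn_forward(xs):
    """Run a basic RNN over a sequence of input vectors of any length."""
    h = np.zeros(n_hid)                              # initial hidden state
    for x in xs:                                     # the same W, U, b are reused at every step,
        h = np.tanh(W @ h + U @ x + b)               # so varying sequence lengths are handled naturally
    return h

print(rnn_forward(rng.normal(size=(3, n_in))).shape)   # length-3 sequence
print(rnn_forward(rng.normal(size=(50, n_in))).shape)  # length-50 sequence, same model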
As with other neural networks, training an RNN typically involves a variant of gradient descent optimization, which is prone to the so-called vanishing and exploding gradient problems. Regularization of RNNs, that is, techniques used to prevent the model from overfitting the training data and hence generalizing poorly to new data, is also challenging. The currently preferred RNN architectures, such as Long Short-Term Memory (LSTM) networks, have highly complex structures with numerous interacting elements that are not easy to understand. This project develops an RNN that extends recent orthogonal/unitary RNNs to model long-term and short-term dependencies of sequential data more effectively. Through an indirect parametrization of the recurrent matrix, dropout regularization techniques will be developed. The network developed in this project will retain the simplicity and efficiency of basic RNNs while enhancing key capabilities for robust applications. In particular, the project will include a study of applications of RNNs to bioinformatics problems.
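The source of the vanishing/exploding gradient problem, and the motivation for orthogonal/unitary RNNs, can be seen in a small experiment: backpropagated gradients pass through repeated products with the recurrent matrix, so their norms behave like ||W^t v||, which an orthogonal matrix keeps constant. A toy numpy illustration (the matrices below are illustrative, not the project's models):

import numpy as np

rng = np.random.default_rng(1)
n = 8
v = rng.normal(size=n)

W_small = 0.9 * np.eye(n)                      # spectral radius < 1: gradients vanish
W_large = 1.1 * np.eye(n)                      # spectral radius > 1: gradients explode
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))   # random orthogonal matrix: norms are preserved

for name, W in [("contractive", W_small), ("expansive", W_large), ("orthogonal", Q)]:
    u = v.copy()
    for _ in range(100):                       # 100 time steps of backpropagation
        u = W @ u
    print(f"{name:>11}: norm after 100 products = {np.linalg.norm(u):.3e}")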
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Intellectual Merit:
This project systematically studied several challenges arising in the training of recurrent neural networks (RNNs), a fundamental model for sequential data such as speech and text. Several new RNN models have been developed to address the vanishing and exploding gradient problems in training RNNs and to effectively carry the short-term and long-term memory of the networks. They include the scaled Cayley unitary recurrent neural network (scuRNN), the eigenvalue normalized recurrent neural network (ENRNN), and an orthogonal gated recurrent unit based on a Neumann-Cayley transformation (NC-GRU).
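The common idea behind these models is to constrain the recurrent matrix through an indirect parametrization rather than by repeated projection. As an illustration, the sketch below shows the real, orthogonal version of the scaled Cayley transform underlying scuRNN (scuRNN itself uses the complex, unitary analogue with a skew-Hermitian matrix); sizes and values here are illustrative:

import numpy as np

rng = np.random.default_rng(2)
n = 6
I = np.eye(n)

M = rng.normal(size=(n, n))
A = M - M.T                                            # trainable skew-symmetric matrix: A^T = -A
D = np.diag(np.where(rng.random(n) < 0.5, -1.0, 1.0))  # fixed diagonal scaling with +/-1 entries

# Scaled Cayley transform: W = (I + A)^{-1} (I - A) D is orthogonal by construction,
# so optimizing the unconstrained entries of A keeps W exactly orthogonal at every step.
W = np.linalg.solve(I + A, I - A) @ D

print(np.allclose(W.T @ W, I))                         # True: W is orthogonal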
The project has also investigated new optimization techniques and new model architectures to accelerate and stabilize the training of neural networks. A new class of preconditioning methods, called batch normalization preconditioning, has been developed for fully connected and convolutional neural networks and can significantly accelerate training. A conjugate gradient momentum method has been proposed to address the difficulty of selecting momentum parameters in momentum methods. An adaptive weighted discriminator loss function has been developed to stabilize the training of generative adversarial networks (GANs). A new integral-based approach has been introduced as a more versatile and efficient way to construct normalizing flows. A convolutional neural network with suitable structure has been derived to produce symmetric feature maps, which can better model tasks with such symmetry.
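For the momentum method mentioned above, the difficulty is that the momentum parameter is normally hand-tuned; a conjugate-gradient rule computes it from successive gradients instead. The sketch below uses the classical Fletcher-Reeves formula as an assumed, illustrative stand-in for the project's specific rule, applied to a toy quadratic objective:

import numpy as np

def cg_momentum_minimize(grad, x0, lr=0.05, steps=100):
    """Momentum-style descent whose momentum parameter beta is computed by a
    conjugate-gradient formula (Fletcher-Reeves here, an illustrative choice)
    rather than hand-tuned."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                   # initial search direction
    for _ in range(steps):
        x = x + lr * d                       # parameter update along the direction
        g_new = grad(x)
        beta = (g_new @ g_new) / (g @ g)     # Fletcher-Reeves momentum parameter
        d = -g_new + beta * d                # momentum: mix new gradient with old direction
        g = g_new
    return x

# Toy quadratic f(x) = 0.5 * x @ np.diag([1, 10]) @ x, with gradient A @ x
A = np.diag([1.0, 10.0])
print(cg_momentum_minimize(lambda x: A @ x, x0=[5.0, 5.0]))  # approaches the origin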
Broader Impact:
The project has implemented the new RNN models for three applications: state inference for RNA secondary structure, automatic detection of epileptic seizures from electroencephalography signals, and molecular property prediction in drug design. It has further developed a GAN model with an adaptive weighted discriminator and a consistency loss function for speech enhancement. New state-of-the-art results have been obtained on several datasets in molecular property prediction and speech enhancement.
The project has resulted in seven computer codes that are freely distributed on the open-source platform GitHub. It has also produced three Ph.D. dissertations directed by the PI.
Last Modified: 12/22/2022
Modified by: Qiang Ye