
NSF Org: |
DMS Division Of Mathematical Sciences |
Recipient: |
|
Initial Amendment Date: | June 15, 2016 |
Latest Amendment Date: | June 19, 2018 |
Award Number: | 1620082 |
Award Instrument: | Continuing Grant |
Program Manager: |
Leland Jameson
DMS Division Of Mathematical Sciences MPS Directorate for Mathematical and Physical Sciences |
Start Date: | September 1, 2016 |
End Date: | August 31, 2020 (Estimated) |
Total Intended Award Amount: | $225,000.00 |
Total Awarded Amount to Date: | $225,000.00 |
Funds Obligated to Date: |
FY 2017 = $81,684.00 FY 2018 = $84,372.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: |
500 S LIMESTONE LEXINGTON KY US 40526-0001 (859)257-9420 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
500 S Limestone 109 Kinkead Hall Lexington KY US 40526-0001 |
Primary Place of
Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | COMPUTATIONAL MATHEMATICS |
Primary Program Source: |
01001718DB NSF RESEARCH & RELATED ACTIVIT 01001819DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.049 |
ABSTRACT
Computations of eigenvalues of large matrices arise in a wide range of scientific and engineering applications, including, for example, page ranking of the Google Search Engine. Large scale eigenvalue problems are often inherently ill-conditioned which implies that their eigenvalues differ vastly in magnitude. This poses a significant challenge to the existing eigenvalue algorithms in the sense that smaller eigenvalues computed may have a poor accuracy, caused by roundoff errors in computer arithmetic. This project will develop new algorithms to address this numerical difficulty. The research results will have applications in a variety of problems where extreme ill-conditioning arises. In particular, a notable ill-conditioning problem is the biharmonic differential operator, which has been used in modeling and design of rigid elastic structures such as beams, plates, or solids, in constructions of multivariate splines, as well as in geometric modeling and computer graphics. A discrete version of the biharmonic operator has also found applications in circuits, image processing, mesh deformation, and manifold learning. With the discretized biharmonic operators easily becoming extremely ill-conditioned, this research will resolve the numerical accuracy issues of the existing algorithms for these applications.
Computing smaller eigenvalues of large and extremely ill-conditioned matrices is an important and intellectually challenging task. Indeed, the effect of ill-conditioning on accuracy is often regarded as an unsolvable problem that is attributable to the formulation of the eigenvalue problem itself. While recent research results have shown that this may be mitigated by exploring structures of matrices, the main objective of this project is to propose an innovative use of preconditioning as a new general methodology to solve the accuracy issue caused by ill-conditioning. We will develop new methods that combine preconditioning with accurate structured inversion methods to accurately compute smaller eigenvalues of an extremely ill-conditioned matrix. As an application, we will also study various discretization schemes and derive suitable structured preconditioners for biharmonic differential operators.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
Note:
When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external
site maintained by the publisher. Some full text articles may not yet be available without a
charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from
this site.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Intellectual Merit:
Accuracy of traditional numerical algorithms for solving a linear system or computing eigenvalues of a matrix is limited by the condition number of the matrix. In this project, we have developed an accurate preconditioning algorithm together with an error analysis to solve an ill-conditioned linear algebra problem with an accuracy independent of the condition number, provided there is an accurate inversion algorithm for the preconditioner. The algorithm allows us to solve some ill-conditioned linear systems to the accuracy of machine precision regardless of the condition number. The accurate solution of linear systems are used to compute eigenvalues to high relative accuracy. Numerical tests have confirmed the stability of the method developed. As part of the analysis, we have introduced a new stability concept, called inverse-equivalent accuracy, that is fundamental in precisely characterizing the stability gained.
We have studied applications of preconditioning to neural network optimization by developing a novel preconditioning method, called batch normalization preconditioning, to accelerate convergence of neural network training. Our algorithm is rooted in a very effective neural network training method called Batch Normalization (BN), but is derived from a well-established theoretical framework of preconditioning. We have developed a theory to show the batch normalization preconditioning reduces the condition number of Hessian matrix. Our theory is also used to derive a BN method for Convolutional Neural Networks. Extensive experiments have been carried out that demonstrate that our method is highly competitive to BN and, in particular, works with small mini-batches whereas BN does not. We have also developed a new optimization algorithm with a convergence theory that combines the classical nonlinear conjugate gradient with the momentum methods, called adaptive momentum method. Our method has the convergence characteristic of the nonlinear conjugate gradient without the need for line search at each step, which is important in large scale optimization problems such as neural networks. We have applied the adaptive momentum method to convolutional neural network training for the image classification problems, which shows significant improvements over the classical momentum method as well as the Nesterov momentum method.
We have studied stable constructions of orthogonal or unitary matrices in recurrent neural networks (RNNs). We have developed two new methods to construct a recurrent matrix that is orthogonal/unitary to mitigate the exploding/vanishing gradient problem. One method is for real orthogonal matrix and the other is for complex unitary matrix. Both of our methods have outperformed the standard RNNs and their variants on a variety of benchmark testing problems. We have further extended the methods to allow construction of a recurrent matrix with a 2x2 block upper triangular matrix with the diagonal blocks consisting of an orthogonal matrix and a matrix with the spectral radius less than 1. This remedies some disadvantages of an orthogonal recurrent matrix and has led to much improved performance. Our methods have produced state-of-the-art results in several benchmark datasets.
We have proposed a new optimization algorithm for generative adversarial networks (GANs) by introducing an adaptive weighting scheme to calibrate simultaneous optimizations of two losses in GAN's discriminator. The methods have produced some very competitive results, including state-of-the-arts in inception scores for several benchmark datasets.
We have derived conditions on a convolution kernel that can generate or preserve symmetric structures in features. We have applied our method to the sequential recommendation problem and to the RNA secondary structure inference problem. Our experimental results demonstrate that our approaches can improve inference results while using equal or smaller numbers of machine parameters.
Broader Impact:
We have applied the accurate preconconditioning method to compute a few smallest eigenvalues of a finite difference discretization of the biharmonic operator with an accuracy in the order of machine precision. This was not possible without using higher precision by existing methods. The biharmonic eigenvalue problems arise in many applications.
We have collaborated with colleagues in mathematical biology and bioinformatics to study two applications of RNNs. First, we have successfully derived an LSTM method (a variation of RNN) for the state inference of RNA secondary structure that significantly outperforms traditional methods in classification accuracy. Second, we have derived an LSTM implementation for automatic detections of epileptic seizures through analysis of the electroencephalography (EEG) signals and our method has better performance compared with the current state-of-the-art methods and is more robust across patients. Finally, four computer codes derived from this project have been freely distributed at the open source platform GitHub.
Last Modified: 10/01/2020
Modified by: Qiang Ye
Please report errors in award information by writing to: awardsearch@nsf.gov.