Award Abstract # 1620082
Accurate Preconditioning for Computing Eigenvalues of Large and Extremely Ill-conditioned Matrices

NSF Org: DMS (Division Of Mathematical Sciences)
Recipient: UNIVERSITY OF KENTUCKY RESEARCH FOUNDATION, THE
Initial Amendment Date: June 15, 2016
Latest Amendment Date: June 19, 2018
Award Number: 1620082
Award Instrument: Continuing Grant
Program Manager: Leland Jameson
DMS (Division Of Mathematical Sciences)
MPS (Directorate for Mathematical and Physical Sciences)
Start Date: September 1, 2016
End Date: August 31, 2020 (Estimated)
Total Intended Award Amount: $225,000.00
Total Awarded Amount to Date: $225,000.00
Funds Obligated to Date: FY 2016 = $58,944.00
FY 2017 = $81,684.00
FY 2018 = $84,372.00
History of Investigator:
  • Qiang Ye (Principal Investigator)
    qye3@uky.edu
Recipient Sponsored Research Office: University of Kentucky Research Foundation
500 S LIMESTONE
LEXINGTON
KY  US  40526-0001
(859)257-9420
Sponsor Congressional District: 06
Primary Place of Performance: University of Kentucky Research Foundation
500 S Limestone 109 Kinkead Hall
Lexington
KY  US  40526-0001
Primary Place of Performance Congressional District: 06
Unique Entity Identifier (UEI): H1HYA8Z1NTM5
Parent UEI:
NSF Program(s): COMPUTATIONAL MATHEMATICS
Primary Program Source: 01001617DB NSF RESEARCH & RELATED ACTIVIT
01001718DB NSF RESEARCH & RELATED ACTIVIT
01001819DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 9150, 9263
Program Element Code(s): 127100
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.049

ABSTRACT

Computations of eigenvalues of large matrices arise in a wide range of scientific and engineering applications, including, for example, the page ranking of the Google search engine. Large-scale eigenvalue problems are often inherently ill-conditioned, meaning that the eigenvalues differ vastly in magnitude. This poses a significant challenge to existing eigenvalue algorithms: the smaller eigenvalues may be computed with poor accuracy owing to roundoff errors in computer arithmetic. This project will develop new algorithms to address this numerical difficulty. The research results will have applications in a variety of problems where extreme ill-conditioning arises. In particular, a notable source of ill-conditioning is the biharmonic differential operator, which has been used in the modeling and design of rigid elastic structures such as beams, plates, and solids, in the construction of multivariate splines, and in geometric modeling and computer graphics. A discrete version of the biharmonic operator has also found applications in circuits, image processing, mesh deformation, and manifold learning. Because discretized biharmonic operators easily become extremely ill-conditioned, this research will resolve the numerical accuracy issues that existing algorithms face in these applications.
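To illustrate the scale of the difficulty, the following minimal Python sketch (a standard 1D model problem assumed for illustration, not part of the award) builds the second-order finite difference discretization of the 1D biharmonic operator as the square of the discrete Laplacian and prints its condition number, which grows like O(n^4):

    import numpy as np

    for n in [50, 100, 200, 400]:
        # 1D discrete Laplacian (Dirichlet), stencil [-1, 2, -1]
        A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
        K = A @ A   # 1D discrete biharmonic, interior stencil [1, -4, 6, -4, 1]
        print(n, np.linalg.cond(K))

Even these small grids reach condition numbers near 10^10; finer grids and 2D or 3D problems quickly exhaust the roughly 16 decimal digits of double precision arithmetic.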

Computing the smaller eigenvalues of large and extremely ill-conditioned matrices is an important and intellectually challenging task. Indeed, the effect of ill-conditioning on accuracy is often regarded as an unsolvable problem attributable to the formulation of the eigenvalue problem itself. While recent research has shown that this effect may be mitigated by exploiting structures of matrices, the main objective of this project is an innovative use of preconditioning as a new general methodology to resolve the accuracy loss caused by ill-conditioning. We will develop new methods that combine preconditioning with accurate structured inversion methods to accurately compute the smaller eigenvalues of an extremely ill-conditioned matrix. As an application, we will also study various discretization schemes and derive suitable structured preconditioners for biharmonic differential operators.
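A hedged illustration of the accuracy loss, using the 1D model problem above, whose eigenvalues are known in closed form (the "structured" route here simply exploits the factorization K = A*A; the project develops preconditioners for far more general settings):

    import numpy as np

    n = 400
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    K = A @ A
    # exact smallest eigenvalue: lambda_min(A)^2, with lambda_k(A) = 2 - 2cos(k*pi/(n+1))
    exact = (2 - 2 * np.cos(np.pi / (n + 1))) ** 2
    direct = np.linalg.eigvalsh(K)[0]            # absolute error ~ eps * ||K||
    structured = np.linalg.eigvalsh(A)[0] ** 2   # computed via the factor A
    print(abs(direct - exact) / exact, abs(structured - exact) / exact)

The direct eigensolver typically loses on the order of log10(cond(K)) digits of relative accuracy in the smallest eigenvalue, while the structured route retains most of them.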

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


P. Guo and Q. Ye, "On the regularization of convolutional kernel tensors in neural networks," Linear and Multilinear Algebra, 2020. DOI: 10.1080/03081087.2020.1795058
K. Helfrich and Q. Ye, "Eigenvalue Normalized Recurrent Neural Networks for Short Term Memory," Proceedings of the AAAI Conference on Artificial Intelligence, v.34, 2020, p.4115. DOI: 10.1609/aaai.v34i04.5831
K. Helfrich, D. Willmott, and Q. Ye, "Orthogonal Recurrent Neural Networks with Scaled Cayley Transform," PMLR, v.80, 2018, p.1969.
K. D. Maduranga, K. Helfrich, and Q. Ye, "Complex Unitary Recurrent Neural Networks Using Scaled Cayley Transform," Proceedings of the AAAI Conference on Artificial Intelligence, v.33, 2019, p.4528. DOI: 10.1609/aaai.v33i01.33014528
H. Wang and Q. Ye, "Error Bounds for the Krylov Subspace Methods for Computations of Matrix Exponentials," SIAM J. Matrix Anal. Appl., v.38, 2017, p.155. DOI: 10.1137/16M1063733
D. Willmott, D. Murrugarra, and Q. Ye, "Improving RNA secondary structure prediction via state inference with deep recurrent neural networks," Computational and Mathematical Biophysics, v.8, 2020, p.36. DOI: 10.1515/cmb-2020-0002
Q. Ye, "Accurate inverses for computing eigenvalues of extremely ill-conditioned matrices and differential operators," Math. Comp., v.87, 2018, p.237. DOI: 10.1090/mcom/3223
Q. Ye, "Preconditioning for Accurate Solutions of Ill-conditioned Linear Systems," Numerical Linear Algebra with Applications, v.27, 2020, p.e2315. DOI: 10.1002/nla.2315

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Intellectual Merit:

Accuracy of traditional numerical algorithms for solving a linear system or computing eigenvalues of a matrix is limited by the condition number of the matrix. In this project, we have developed an accurate preconditioning algorithm, together with an error analysis, that solves an ill-conditioned linear algebra problem with an accuracy independent of the condition number, provided there is an accurate inversion algorithm for the preconditioner. The algorithm allows us to solve some ill-conditioned linear systems to the accuracy of machine precision regardless of the condition number. The accurate solutions of linear systems are in turn used to compute eigenvalues to high relative accuracy. Numerical tests have confirmed the stability of the method developed. As part of the analysis, we have introduced a new stability concept, called inverse-equivalent accuracy, that precisely characterizes the stability gained.
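A toy sketch of this schema on the 1D model biharmonic K = A*A (assumptions: the "preconditioner" here is the exactly known factor A, applied through two well-behaved solves; exact rational arithmetic supplies the ground truth; this illustrates the accuracy gap, not the project's actual algorithm):

    from fractions import Fraction
    import numpy as np

    n = 150
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    K = A @ A                                   # pentadiagonal, symmetric positive definite
    b = np.zeros(n); b[0] = 1.0                 # exactly representable right-hand side

    # ground truth: banded Gaussian elimination on K x = e1 in exact rational arithmetic
    M = [[Fraction(int(round(K[i, j]))) for j in range(n)] for i in range(n)]
    r = [Fraction(1 if i == 0 else 0) for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, min(k + 3, n)):   # bandwidth 2; SPD, so no pivoting needed
            m = M[i][k] / M[k][k]
            for j in range(k, min(k + 3, n)):
                M[i][j] -= m * M[k][j]
            r[i] -= m * r[k]
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = r[i] - sum(M[i][j] * x[j] for j in range(i + 1, min(i + 3, n)))
        x[i] = s / M[i][i]
    x_exact = np.array([float(v) for v in x])

    x_naive = np.linalg.solve(K, b)                       # accuracy limited by cond(K)
    x_struct = np.linalg.solve(A, np.linalg.solve(A, b))  # two solves with the factor A
    for xh in (x_naive, x_struct):
        print(np.linalg.norm(xh - x_exact) / np.linalg.norm(x_exact))

The structured route is typically several digits more accurate, mirroring in a much simpler setting the machine-precision solves obtained with accurate preconditioning.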

We have studied applications of preconditioning to neural network optimization by developing a novel preconditioning method, called batch normalization preconditioning, to accelerate convergence of neural network training. Our algorithm is rooted in a very effective neural network training method called Batch Normalization (BN) but is derived from the well-established theoretical framework of preconditioning. We have developed a theory showing that batch normalization preconditioning reduces the condition number of the Hessian matrix, and we have used this theory to derive a BN method for convolutional neural networks. Extensive experiments demonstrate that our method is highly competitive with BN and, in particular, works with small mini-batches, where BN does not. We have also developed a new optimization algorithm, called the adaptive momentum method, with a convergence theory that combines the classical nonlinear conjugate gradient method with momentum methods. Our method has the convergence characteristics of the nonlinear conjugate gradient method without requiring a line search at each step, which is important in large-scale optimization problems such as neural network training. Applying the adaptive momentum method to convolutional neural network training for image classification shows significant improvements over both the classical momentum method and the Nesterov momentum method.
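A minimal sketch of the underlying intuition on a linear model (an illustration of why normalization acts as preconditioning, not the project's algorithm): for the least-squares loss, the Hessian is X^T X / m, and normalizing each feature to zero mean and unit variance typically shrinks its condition number dramatically when the features are badly offset and scaled.

    import numpy as np

    rng = np.random.default_rng(0)
    m, d = 1000, 20
    # features with wildly different offsets and scales
    X = rng.standard_normal((m, d)) * 10.0 ** rng.uniform(-2, 2, d) + rng.uniform(-5, 5, d)
    H = X.T @ X / m                               # Hessian of the least-squares loss
    Xn = (X - X.mean(axis=0)) / X.std(axis=0)     # batch-normalized features
    Hn = Xn.T @ Xn / m                            # Hessian after normalization
    print(np.linalg.cond(H), np.linalg.cond(Hn))

The first condition number is typically orders of magnitude larger than the second, which is what makes gradient descent on the normalized problem converge much faster.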

We have studied stable constructions of orthogonal or unitary matrices in recurrent neural networks (RNNs). We have developed two new methods to construct a recurrent matrix that is orthogonal or unitary so as to mitigate the exploding/vanishing gradient problem: one constructs real orthogonal matrices and the other complex unitary matrices. Both methods have outperformed standard RNNs and their variants on a variety of benchmark problems. We have further extended the methods to construct a recurrent matrix in 2x2 block upper triangular form whose diagonal blocks consist of an orthogonal matrix and a matrix with spectral radius less than 1. This remedies some disadvantages of a purely orthogonal recurrent matrix and has led to much improved performance. Our methods have produced state-of-the-art results on several benchmark datasets.
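For the real orthogonal case, the construction in the PMLR 2018 paper listed above is the scaled Cayley transform W = (I + A)^{-1}(I - A)D, where A is skew-symmetric and D is a fixed diagonal matrix with +-1 entries; W is then exactly orthogonal for any A. A minimal numerical check (illustrative, not the training code):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 8
    T = rng.standard_normal((n, n))
    A = T - T.T                                             # skew-symmetric parameter
    D = np.diag(np.where(rng.random(n) < 0.5, -1.0, 1.0))   # fixed +-1 diagonal
    I = np.eye(n)
    W = np.linalg.solve(I + A, I - A) @ D                   # scaled Cayley transform
    print(np.linalg.norm(W.T @ W - I))                      # ~1e-15: W is orthogonal

Because A is an unconstrained skew-symmetric parameter, gradient updates on A keep W exactly on the orthogonal group without any re-orthogonalization step.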

We have proposed a new optimization algorithm for generative adversarial networks (GANs) that introduces an adaptive weighting scheme to calibrate the simultaneous optimization of two losses in the GAN discriminator. The method has produced very competitive results, including state-of-the-art Inception Scores on several benchmark datasets.

We have derived conditions on a convolution kernel under which it generates or preserves symmetric structures in features. We have applied the resulting method to the sequential recommendation problem and to the RNA secondary structure inference problem. Our experimental results demonstrate that these approaches improve inference results while using the same or fewer model parameters.
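One simple instance of such a condition (illustrative only, not necessarily the paper's exact formulation): an even-symmetric (palindromic) kernel preserves the even symmetry of its input under convolution.

    import numpy as np

    x = np.array([1.0, 3.0, -2.0, 5.0, -2.0, 3.0, 1.0])  # symmetric input: x == x[::-1]
    k = np.array([0.25, 0.5, 0.25])                      # symmetric kernel: k == k[::-1]
    y = np.convolve(x, k, mode='same')
    print(np.allclose(y, y[::-1]))                       # True: symmetry is preserved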

Broader Impact:

We have applied the accurate preconditioning method to compute a few of the smallest eigenvalues of a finite difference discretization of the biharmonic operator with an accuracy on the order of machine precision. This was not possible with existing methods without resorting to higher-precision arithmetic. Biharmonic eigenvalue problems arise in many applications.

We have collaborated with colleagues in mathematical biology and bioinformatics to study two applications of RNNs. First, we have derived an LSTM method (a variant of RNN) for state inference of RNA secondary structure that significantly outperforms traditional methods in classification accuracy. Second, we have derived an LSTM implementation for automatic detection of epileptic seizures from electroencephalography (EEG) signals; our method outperforms current state-of-the-art methods and is more robust across patients. Finally, four computer codes derived from this project have been freely distributed on the open-source platform GitHub.


Last Modified: 10/01/2020
Modified by: Qiang Ye
