Award Abstract # 1741264
BIGDATA: IA: Collaborative Research: Parsimonious Anomaly Detection in Sequencing Data

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: WAKE FOREST UNIVERSITY
Initial Amendment Date: August 3, 2017
Latest Amendment Date: August 3, 2017
Award Number: 1741264
Award Instrument: Standard Grant
Program Manager: Sylvia Spengler
sspengle@nsf.gov
(703)292-7347
IIS Division of Information & Intelligent Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2017
End Date: August 31, 2021 (Estimated)
Total Intended Award Amount: $181,060.00
Total Awarded Amount to Date: $181,060.00
Funds Obligated to Date: FY 2017 = $181,060.00
History of Investigator:
  • Jennifer Erway (Principal Investigator)
    erwayjb@wfu.edu
Recipient Sponsored Research Office: Wake Forest University
1834 WAKE FOREST RD
WINSTON SALEM
NC  US  27109-6000
(336)758-5888
Sponsor Congressional District: 05
Primary Place of Performance: Wake Forest University
1834 Wake Forest Road PNB 6222
Winston-Salem
NC  US  27109-8758
Primary Place of Performance Congressional District: 05
Unique Entity Identifier (UEI): MBU6HCLNZ431
Parent UEI:
NSF Program(s): Big Data Science & Engineering
Primary Program Source: 01001718DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7433, 8083, 9102
Program Element Code(s): 808300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Genomes contain the complete set of instructions for building an organism. Structural variants are rearrangements in the genome, such as insertions and deletions, whose discovery advances the understanding of the evolution and adaptability of species. Recent advances in high-throughput sequencing technologies have led to the collection of vast quantities of genomic data. Because of this, fast and robust algorithms are needed to identify structural variants, which are rare and are prone to noise. This research will contribute fundamentally to optimization methods for large-scale problems in computational genomics. The algorithms will be disseminated publicly for use within and outside the biology, mathematics, and computer science communities. Graduate students will be trained in scientific research and programming through this interdisciplinary research, and the participation of students from under-represented backgrounds will be highly encouraged.

The research objective of this award is to develop computational tools for large-scale data-driven problems arising in computational genomics. These problems are especially difficult to solve since they are high-dimensional and the data are noisy and inexact. This study will take advantage of known relationships in sequenced genomes to improve the accuracy of identifying genomic variants in population studies when coverage in the data is low and multiple related individuals are sequenced. Specifically, the proposed research will (i) explore statistical models for describing the presence of structural variants in genomes, (ii) develop and implement novel sparse optimization methods for genomic structural variant detection, and (iii) validate these methods on existing genomic data sets and use them to make predictions on new data.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Brust, Johannes and Burdakov, Oleg and Erway, Jennifer B. and Marcia, Roummel F. "A dense initialization for limited-memory quasi-Newton methods" Computational Optimization and Applications, v.74, 2019 https://doi.org/10.1007/s10589-019-00112-x
DeGuchy, Omar and Erway, Jennifer B. and Marcia, Roummel F. "Compact representation of the full Broyden class of quasi-Newton updates" Numerical Linear Algebra with Applications, v.25, 2018 https://doi.org/10.1002/nla.2186
Erway, Jennifer B. and Griffin, Joshua and Marcia, Roummel F. and Omheni, Riadh "Trust-region algorithms for training responses: machine learning methods using indefinite Hessian approximations" Optimization Methods and Software, v.35, 2020 https://doi.org/10.1080/10556788.2019.1624747

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Outcomes of this award include new computational tools for large-scale data-driven problems arising in computational genomics. These tools advance the state of the art in algorithms for solving optimization problems using quasi-Newton methods and multipoint symmetric secant methods. Quasi-Newton methods are ideal for large-scale optimization since they solve problems using only first-order (i.e., first-derivative) information, avoiding the prohibitive cost of computing and storing second derivatives. Specifically, intellectual merit outcomes of this research include (i) extending the so-called "compact representation" of the "convex" Broyden class of quasi-Newton matrices to the full class, (ii) incorporating quasi-Newton methods designed for nonconvex problems into a new trust-region framework, and (iii) developing a new "dense" initialization for any quasi-Newton method whose Hessian approximation admits a compact representation. Numerical results suggest that (ii) and (iii) yield algorithms that outperform other solvers, including traditional quasi-Newton methods. This work is currently being extended to develop new interior-point methods to solve optimization problems with constraints.
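
For readers unfamiliar with the term, the following is a minimal sketch of what a "compact representation" looks like. It covers only the classical BFGS member of the Broyden class with a scaled-identity initialization B_0 = gamma * I, not the full-class extension or the "dense" initialization developed under this award. The function names `bfgs_compact` and `bfgs_recursive`, the parameter `gamma`, and the synthetic test data are assumptions made for illustration and do not come from the project's software.

```python
import numpy as np


def bfgs_compact(S, Y, gamma=1.0):
    """Form the BFGS matrix from its compact representation.

    S, Y  : n-by-k arrays whose columns are the stored pairs
            s_i = x_{i+1} - x_i and y_i = grad_{i+1} - grad_i
            (illustrative inputs; each pair must satisfy s_i^T y_i > 0).
    gamma : scaling of the initial matrix B_0 = gamma * I (assumed form).

    B = B_0 - Psi M^{-1} Psi^T, with Psi = [B_0 S, Y] and
    M = [[S^T B_0 S, L], [L^T, -D]], where L and D are the strictly
    lower triangular and diagonal parts of S^T Y.
    """
    n, _ = S.shape
    StY = S.T @ Y
    L = np.tril(StY, k=-1)          # strictly lower triangular part
    D = np.diag(np.diag(StY))       # diagonal part
    Psi = np.hstack([gamma * S, Y])
    M = np.block([[gamma * (S.T @ S), L], [L.T, -D]])
    return gamma * np.eye(n) - Psi @ np.linalg.solve(M, Psi.T)


def bfgs_recursive(S, Y, gamma=1.0):
    """Reference: apply the rank-two BFGS update one pair at a time."""
    n, k = S.shape
    B = gamma * np.eye(n)
    for i in range(k):
        s, y = S[:, i], Y[:, i]
        Bs = B @ s
        B = B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)
    return B
```

Given the same pairs, the two functions should agree up to rounding error. The practical appeal of the compact form is that the matrix is described by the thin factor `Psi` and a small 2k-by-2k matrix `M`, so a limited-memory code can work with those factors instead of forming the n-by-n matrix as this sketch does for clarity.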

Multipoint symmetric secant (MSS) methods were first described in the 1970s.  These methods can be thought of as generalizations of quasi-Newton methods.  This grant work included extending the "dense" initialization for quasi-Newton methods to MSS methods, yielding improved numerical results on a standard test set for large-scale optimization.  Ongoing work explores different types of MSS methods other than those found in the current literature with the goal of using these methods to solve both unconstrained and constrained optimization problems.

Broader impacts include the training and mentoring of graduate students, the incorporation of cutting-edge research into mathematics and computer science courses, and the wide dissemination of results. Specifically, this award supported two master's-level graduate students as they gained experience in computational mathematics and programming. Subsequent to grant work, both students applied to doctoral programs. Also, select results of this research were incorporated into lectures for a nonlinear optimization class and a topics in nonlinear optimization class in Spring 2019 and Spring 2020, respectively. Finally, results from this research were disseminated via conference and seminar talks, a webpage devoted to this award, and journal articles.

Last Modified: 12/03/2021
Modified by: Jennifer B Erway

Please report errors in award information by writing to: awardsearch@nsf.gov.