
NSF Org: |
IIS Division of Information & Intelligent Systems |
Recipient: |
|
Initial Amendment Date: | August 3, 2017 |
Latest Amendment Date: | August 3, 2017 |
Award Number: | 1741264 |
Award Instrument: | Standard Grant |
Program Manager: |
Sylvia Spengler
sspengle@nsf.gov (703)292-7347 IIS Division of Information & Intelligent Systems CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 1, 2017 |
End Date: | August 31, 2021 (Estimated) |
Total Intended Award Amount: | $181,060.00 |
Total Awarded Amount to Date: | $181,060.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
1834 WAKE FOREST RD WINSTON SALEM NC US 27109-6000 (336)758-5888 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
1834 Wake Forest Road PNB 6222 Winston-Salem NC US 27109-8758 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Big Data Science & Engineering |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Genomes contain the complete set of instructions for building an organism. Structural variants are rearrangements in the genome, such as insertions and deletions, whose discovery advances the understanding of the evolution and adaptability of species. Recent advances in high-throughput sequencing technologies have led to the collection of vast quantities of genomic data. Fast and robust algorithms are therefore needed to identify structural variants, which are rare and whose detection is prone to noise. This research will contribute fundamentally to optimization methods for large-scale problems in computational genomics. The algorithms will be disseminated publicly for use within and outside the biology, mathematics, and computer science communities. Graduate students will be trained in scientific research and programming through this interdisciplinary research, and the participation of students from under-represented backgrounds will be highly encouraged.
The research objective of this award is to develop computational tools for large-scale data-driven problems arising in computational genomics. These problems are especially difficult to solve because they are high-dimensional and the data are noisy and inexact. This study will take advantage of known relationships in sequenced genomes to improve the accuracy of identifying genomic variants in population studies in which coverage is low and multiple related individuals are sequenced. Specifically, the proposed research will (i) explore statistical models for describing the presence of structural variants in genomes, (ii) develop and implement novel sparse optimization methods for genomic structural variant detection, and (iii) validate on existing genomic data sets and predict on new data.
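The abstract does not specify the statistical model or the solver, so the following is only a generic illustration of what "sparse optimization" means in item (ii), not the project's method. It is a minimal sketch assuming an l1-regularized least-squares objective solved by proximal gradient descent (ISTA). The matrix A, the vector b, the weight lam, and the synthetic data are all hypothetical; in a variant-detection setting the nonzero entries of the recovered vector would mark candidate variant locations.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(A, b, x, lam):
    """0.5 * ||Ax - b||^2 + lam * ||x||_1 (assumed model, for illustration)."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def ista(A, b, lam, iters=500):
    """Proximal gradient (ISTA) with fixed step 1/L, L = ||A||_2^2."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)          # gradient of the smooth term only
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Hypothetical synthetic data: 3 nonzero "signals" among 200 candidates,
# observed through 60 noisy measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((60, 200))
x_true = np.zeros(200)
x_true[[5, 50, 120]] = [3.0, -2.0, 4.0]
b = A @ x_true + 0.01 * rng.standard_normal(60)

lam = 1.0
x_hat = ista(A, b, lam, iters=2000)
print("objective at zero :", objective(A, b, np.zeros(200), lam))
print("objective at x_hat:", objective(A, b, x_hat, lam))
print("entries above 0.5 :", np.flatnonzero(np.abs(x_hat) > 0.5))
```

The l1 term is what drives most entries of the estimate to exactly zero, which matches the premise that structural variants are rare.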
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Outcomes of this award include new computational tools for large-scale data-driven problems arising in computational genomics. These tools advance state-of-the-art algorithms for solving optimization problems using quasi-Newton methods and multipoint symmetric secant methods. Quasi-Newton methods are well suited to large-scale optimization because they solve problems using only first-order (i.e., first-derivative) information, avoiding the prohibitive computational cost and storage needs of computing second derivatives. Specifically, intellectual merit outcomes of this research include (i) extending the so-called "compact representation" of the "convex" Broyden class of quasi-Newton matrices to the full class, (ii) incorporating quasi-Newton methods designed for nonconvex problems into a new trust-region framework, and (iii) developing a new "dense" initialization for any quasi-Newton method whose Hessian approximation admits a compact representation. Numerical results suggest that (ii) and (iii) yield algorithms that outperform other solvers, including traditional quasi-Newton methods. This work is currently being extended to develop new interior-point methods to solve optimization problems with constraints.
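To make the term "compact representation" concrete, here is a minimal sketch for the BFGS member of the Broyden class only. It assumes the standard Byrd-Nocedal-Schnabel form; the report's extension to the full class and its "dense" initialization are not reproduced. The columns of S are steps and the columns of Y are gradient differences, so only first-order information is used. The names bfgs_recursive and bfgs_compact, the test matrix A, and the initial matrix B0 are illustrative assumptions.

```python
import numpy as np

def bfgs_recursive(B0, S, Y):
    """Apply the rank-two BFGS update once per stored pair (s_i, y_i)."""
    B = B0.copy()
    for s, y in zip(S.T, Y.T):
        Bs = B @ s
        B = B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)
    return B

def bfgs_compact(B0, S, Y):
    """Same matrix written as B0 - Psi * inv(M) * Psi^T (compact form)."""
    SY = S.T @ Y
    L = np.tril(SY, k=-1)                 # strictly lower triangle of S^T Y
    D = np.diag(np.diag(SY))              # diagonal of S^T Y
    Psi = np.hstack([B0 @ S, Y])          # n x 2m
    M = np.block([[S.T @ B0 @ S, L],
                  [L.T,          -D]])    # 2m x 2m, small when m << n
    return B0 - Psi @ np.linalg.solve(M, Psi.T)

# Hypothetical test problem: a convex quadratic with Hessian A, so that
# gradient differences are Y = A S and the curvature s^T y is positive.
rng = np.random.default_rng(0)
n, m = 8, 3
A = rng.standard_normal((n, n))
A = A @ A.T + n * np.eye(n)
S = rng.standard_normal((n, m))
Y = A @ S
B0 = 2.0 * np.eye(n)

max_diff = np.max(np.abs(bfgs_recursive(B0, S, Y) - bfgs_compact(B0, S, Y)))
print("largest entrywise difference:", max_diff)
```

In a limited-memory setting B is never formed explicitly; only Psi and the small matrix M are stored, which is what makes the representation attractive for large n.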
Multipoint symmetric secant (MSS) methods were first described in the 1970s. These methods can be thought of as generalizations of quasi-Newton methods. Work under this grant extended the "dense" initialization for quasi-Newton methods to MSS methods, yielding improved numerical results on a standard test set for large-scale optimization. Ongoing work explores types of MSS methods other than those found in the current literature, with the goal of using these methods to solve both unconstrained and constrained optimization problems.
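The report gives no formulas for MSS matrices, so the following is a sketch under stated assumptions rather than the method developed under the award. It assumes one compact form of a symmetric matrix B = B0 + Psi M Psi^T that tries to satisfy several secant equations B s_i = y_i at once. S^T Y is symmetrized from its lower triangle, with columns ordered oldest to newest; that ordering and the function name mss_compact are choices made for this illustration. With this form, all secant equations hold exactly when S^T Y is symmetric (for example, on a quadratic), and the newest one holds in general.

```python
import numpy as np

def mss_compact(B0, S, Y):
    """Symmetric B = B0 + Psi M Psi^T built from all stored pairs at once."""
    m = S.shape[1]
    G = S.T @ Y
    G_sym = np.tril(G) + np.tril(G, k=-1).T   # symmetrized S^T Y (assumed choice)
    W = np.linalg.inv(S.T @ S)                # requires linearly independent steps
    Psi = np.hstack([S, Y - B0 @ S])          # n x 2m
    M = np.block([[W @ (S.T @ B0 @ S - G_sym) @ W, W],
                  [W, np.zeros((m, m))]])     # 2m x 2m and symmetric
    return B0 + Psi @ M @ Psi.T

rng = np.random.default_rng(1)
n, m = 8, 3
S = rng.standard_normal((n, m))
B0 = np.eye(n)

# Case 1: quadratic with (possibly indefinite) symmetric Hessian A, Y = A S.
A = rng.standard_normal((n, n))
A = 0.5 * (A + A.T)
Y_quad = A @ S
B_quad = mss_compact(B0, S, Y_quad)
all_secant_residual = np.max(np.abs(B_quad @ S - Y_quad))

# Case 2: arbitrary Y, so S^T Y is not symmetric; check the newest pair only.
Y_gen = rng.standard_normal((n, m))
B_gen = mss_compact(B0, S, Y_gen)
newest_residual = np.max(np.abs(B_gen @ S[:, -1] - Y_gen[:, -1]))

print("all secant equations (quadratic):", all_secant_residual)
print("newest secant equation (general):", newest_residual)
```

Unlike BFGS, nothing here requires positive curvature, which is one reason such updates are of interest for nonconvex problems.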
Broader impacts include the training and mentoring of graduate students, the incorporation of cutting-edge research into mathematics and computer science courses, and the wide dissemination of results. Specifically, this award supported two master's-level graduate students in gaining experience in computational mathematics and programming. After the grant work, both students applied to doctoral programs. In addition, select results of this research were incorporated into lectures for a nonlinear optimization class in Spring 2019 and a topics in nonlinear optimization class in Spring 2020. Finally, results from this research were disseminated via conference and seminar talks, a webpage devoted to this award, and journal articles.
Last Modified: 12/03/2021
Modified by: Jennifer B Erway