
NSF Org: |
CCF Division of Computing and Communication Foundations |
Recipient: |
|
Initial Amendment Date: | January 23, 2015 |
Latest Amendment Date: | January 23, 2015 |
Award Number: | 1464379 |
Award Instrument: | Standard Grant |
Program Manager: |
Jack S. Snoeyink
CCF Division of Computing and Communication Foundations CSE Directorate for Computer and Information Science and Engineering |
Start Date: | February 1, 2015 |
End Date: | January 31, 2018 (Estimated) |
Total Intended Award Amount: | $173,034.00 |
Total Awarded Amount to Date: | $173,034.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
438 WHITNEY RD EXTENSION UNIT 1133 STORRS CT US 06269-9018 (860)486-3622 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
371 Fairfield Way, Unit 4155 Storrs CT US 06269-4155 |
Primary Place of
Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Algorithmic Foundations |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Many data sets undergo transformations either before, after, during the collection process which are most often corrected by smoothing, de-noising, or registration. Topological Data Analysis (TDA) makes it possible to extract robust signatures from data that are invariant to these transformations. However, in TDA, even a relatively small data set can easily blow up to fill memory when considering the space needed for edges, triangles, and other simplices that represent (or estimate) the connectivity of the underlying data. There is a need for fast, parallel, and distributed algorithms that partition the input in a principled way that leads to both strong theoretical guarantees and also practical performance. This project aims to fill this need by combining nested dissection, a well-studied theory from numerical analysis (NA) with persistent homology, the main technique of TDA, with the expectation that both fields will benefit. A potential broader impact of the project is to improve communication between researchers in TDA and NA. The PI will train both undergraduate and graduate researchers and incorporate advanced concepts in combinatorial topology in undergraduate and graduate curricula.
The specific aim is to develop a theory of nested dissection on simplicial complexes that allows for fast, parallel computation of persistent homology. A second specific aim is to develop and analyze new efficient data representations for the partially reduced simplicial complexes that appear in the course of persistent homology computation. This will involve a topological generalization of the Union-Find problem, a new approach to combining persistent homology and discrete Morse theory, an extension of the theory of nested dissection to work directly over quotient vector spaces (such as homology groups over fields), and also a separator theory that applies to filtrations or other situations where the underlying graph or complex is changing in time. Preliminary examples indicate that these extensions may produce significantly better theoretical guarantees. A third specific aim is to implement this approach, compare it with, and possibly integrate it with existing open source codes.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
Note:
When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external
site maintained by the publisher. Some full text articles may not yet be available without a
charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from
this site.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The research from this project is directed at developing efficient new algorithms and data structures in the growing field of topological data analysis. The field addresses the challenges of organizing and extracting global information from complex, high-dimensional data sets. One of the main techniques in this field is called persistent homology, a way of using linear algebra to compute topological invariants (properties unchanged by continuous invertible transformations) of data sets that have several advantages; they can be used as a similarity measure between entire data sets and they represent structure at different scales.
The primary goal of this project was to integrate ideas from several different fields in order to improve the state of the art algorithms for computing persistent homology. The first major outcome was a way to use the theory of nested dissection, a divide-and-conquer technique pioneered in numerical analysis, to speed up the worst-case running time when the input data sets are geometric. This was a novel integration of the geometry and the algebra, because existing lower bounds imply it is highly unlikely that similar improvements can be made on the purely algebraic version of the problem that all previous algorithms have focused on.
Another outcome of this project was the development of new geometric sampling theories and new efficient constructions of so-called sparse filtrations. These provide the two layers of inputs to our topological algorithms, first the point sets and then the connectivity structure on top of those points in the form of edges, triangles, etc.
The project had broader impacts beyond research as several of the concepts developed in the proposal were integrated into an undergraduate course, and several undergraduate honors projects. The students produced a high-quality codebase for generating geometric complexes that is available as open source. Another undergraduate working on the project got their first experience with research and developed a software testbed for different topological algorithms. A 3D visualization of one of our algorithms that was produced as part of this project was very effective in communicating these ideas to a wider audience of researchers from disparate areas.
Last Modified: 05/01/2018
Modified by: Donald Sheehy
Please report errors in award information by writing to: awardsearch@nsf.gov.