Award Abstract # 1464379
CRII: AF: Principled Divide-and-Conquer for Topological Algorithms

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: UNIVERSITY OF CONNECTICUT
Initial Amendment Date: January 23, 2015
Latest Amendment Date: January 23, 2015
Award Number: 1464379
Award Instrument: Standard Grant
Program Manager: Jack S. Snoeyink
  CCF, Division of Computing and Communication Foundations
  CSE, Directorate for Computer and Information Science and Engineering
Start Date: February 1, 2015
End Date: January 31, 2018 (Estimated)
Total Intended Award Amount: $173,034.00
Total Awarded Amount to Date: $173,034.00
Funds Obligated to Date: FY 2015 = $173,034.00
History of Investigator:
  • Donald Sheehy (Principal Investigator)
    drsheehy@ncsu.edu
Recipient Sponsored Research Office: University of Connecticut
438 WHITNEY RD EXTENSION UNIT 1133
STORRS
CT  US  06269-9018
(860)486-3622
Sponsor Congressional District: 02
Primary Place of Performance: University of Connecticut
371 Fairfield Way, Unit 4155
Storrs
CT  US  06269-4155
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): WNTPS995QBM7
Parent UEI:
NSF Program(s): Algorithmic Foundations
Primary Program Source: 01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7929, 7933, 7934, 8552
Program Element Code(s): 779600
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Many data sets undergo transformations before, during, or after the collection process, and these are most often corrected by smoothing, de-noising, or registration. Topological Data Analysis (TDA) makes it possible to extract robust signatures from data that are invariant to these transformations. However, in TDA, even a relatively small data set can easily blow up to fill memory when considering the space needed for the edges, triangles, and other simplices that represent (or estimate) the connectivity of the underlying data. There is a need for fast, parallel, and distributed algorithms that partition the input in a principled way, leading both to strong theoretical guarantees and to practical performance. This project aims to fill this need by combining nested dissection, a well-studied theory from numerical analysis (NA), with persistent homology, the main technique of TDA, with the expectation that both fields will benefit. A potential broader impact of the project is to improve communication between researchers in TDA and NA. The PI will train both undergraduate and graduate researchers and incorporate advanced concepts in combinatorial topology into undergraduate and graduate curricula.
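To see why even small inputs can fill memory, note that in the worst case a simplicial complex on n points can contain every (k+1)-subset of the points as a k-simplex. The following sketch (illustrative only, not part of the award; the function name is invented) counts these worst-case simplices:

```python
from math import comb

def max_simplices(n_points: int, dim: int) -> int:
    """Worst-case number of dim-simplices on n_points vertices:
    every (dim + 1)-subset of the points can form a simplex."""
    return comb(n_points, dim + 1)

n = 1000
print(max_simplices(n, 1))  # 499500 possible edges
print(max_simplices(n, 2))  # 166167000 possible triangles
```

A thousand points is tiny by data-analysis standards, yet the triangles alone number in the hundreds of millions, which is why partitioning strategies matter.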

The specific aim is to develop a theory of nested dissection on simplicial complexes that allows for fast, parallel computation of persistent homology. A second specific aim is to develop and analyze new efficient data representations for the partially reduced simplicial complexes that appear in the course of persistent homology computation. This will involve a topological generalization of the Union-Find problem, a new approach to combining persistent homology and discrete Morse theory, an extension of the theory of nested dissection to work directly over quotient vector spaces (such as homology groups over fields), and a separator theory that applies to filtrations and other situations where the underlying graph or complex changes over time. Preliminary examples indicate that these extensions may produce significantly better theoretical guarantees. A third specific aim is to implement this approach and to compare it with, and possibly integrate it into, existing open-source codes.
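The Union-Find connection mentioned above is easiest to see in dimension zero, where the persistence of connected components reduces to a Kruskal-style sweep over edges in filtration order. The following is a minimal illustrative sketch of that base case, not code from the project (the function names are invented):

```python
class UnionFind:
    """Minimal Union-Find with path halving (union by rank omitted)."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # edge closes a cycle, merges nothing
        self.parent[rb] = ra
        return True

def component_death_times(n_vertices, edges):
    """edges: (filtration_value, u, v) triples.  Each merging edge kills
    one connected component; its filtration value is that death time."""
    uf = UnionFind(n_vertices)
    return [t for t, u, v in sorted(edges) if uf.union(u, v)]

print(component_death_times(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]))
# [1, 2, 4] -- the edge at time 3 closes a cycle, so no component dies
```

The project's topological generalization goes beyond this picture; the sketch only shows the classical problem being generalized.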

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


Donald R. Sheehy, "An Output-Sensitive Algorithm for Computing Weighted α-Complexes," Canadian Conference on Computational Geometry, 2015
Kirk P. Gardner and Donald R. Sheehy, "kth Nearest Neighbor Sampling in the Plane," Canadian Conference on Computational Geometry, 2016
Michael Kerber, Donald Sheehy, and Primoz Skraba, "Persistent Homology and Nested Dissection," ACM-SIAM Symposium on Discrete Algorithms, 2016
Nicholas J. Cavanna and Donald R. Sheehy, "Adaptive Metrics for Adaptive Samples," Canadian Conference on Computational Geometry, 2016
Nicholas J. Cavanna, Kirk P. Gardner, and Donald R. Sheehy, "When and Why the Topological Coverage Criterion Works," ACM-SIAM Symposium on Discrete Algorithms, 2017
Nicholas J. Cavanna, Mahmoodreza Jahanseir, and Donald R. Sheehy, "A Geometric Perspective on Sparse Filtrations," Canadian Conference on Computational Geometry, 2015
Nicholas J. Cavanna, Mahmoodreza Jahanseir, and Donald R. Sheehy, "Visualizing Sparse Filtrations," Symposium on Computational Geometry (Multimedia Session), 2015

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The research from this project is directed at developing efficient new algorithms and data structures in the growing field of topological data analysis.  The field addresses the challenges of organizing and extracting global information from complex, high-dimensional data sets.  One of the main techniques in this field is called persistent homology, a way of using linear algebra to compute topological invariants (properties unchanged by continuous invertible transformations) of data sets.  These invariants have several advantages: they can be used as a similarity measure between entire data sets, and they represent structure at multiple scales.
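The linear algebra referred to here is, in the standard algorithm, Gaussian elimination on a boundary matrix over the two-element field, where matched pivots give birth-death pairs. A minimal sketch of that textbook reduction (illustrative only, not the project's code; names are invented):

```python
def reduce_boundary_matrix(columns):
    """Standard persistence reduction over GF(2).
    columns[j] is the set of rows holding a 1 in column j (the boundary
    of simplex j, simplices indexed in filtration order).
    Returns (birth, death) index pairs."""
    cols = [set(c) for c in columns]
    pivot_of = {}  # lowest nonzero row -> column using it as pivot
    pairs = []
    for j in range(len(cols)):
        while cols[j] and max(cols[j]) in pivot_of:
            cols[j] ^= cols[pivot_of[max(cols[j])]]  # add earlier column mod 2
        if cols[j]:
            low = max(cols[j])
            pivot_of[low] = j
            pairs.append((low, j))  # simplex `low` creates, simplex j destroys
    return pairs

# A filled triangle: vertices 0,1,2; edges 3=(0,1), 4=(1,2), 5=(0,2); face 6.
boundaries = [[], [], [], [0, 1], [1, 2], [0, 2], [3, 4, 5]]
print(reduce_boundary_matrix(boundaries))  # [(1, 3), (2, 4), (5, 6)]
```

In the example, edge 5 reduces to zero (it closes a loop), and that loop is then killed by the triangle, giving the pair (5, 6).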


The primary goal of this project was to integrate ideas from several different fields in order to improve the state of the art algorithms for computing persistent homology.  The first major outcome was a way to use the theory of nested dissection, a divide-and-conquer technique pioneered in numerical analysis, to speed up the worst-case running time when the input data sets are geometric.  This was a novel integration of the geometry and the algebra, because existing lower bounds imply it is highly unlikely that similar improvements can be made on the purely algebraic version of the problem that all previous algorithms have focused on.
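Nested dissection, in its simplest form, recursively splits a graph with a small separator and eliminates the separator vertices last, so fill-in stays confined. A toy sketch on a path graph (illustrative only and not from the award; the project extends the idea to simplicial complexes):

```python
def nested_dissection_order(lo, hi):
    """Elimination order for the path graph on vertices lo..hi:
    recursively order both halves, then the middle separator vertex last."""
    if lo > hi:
        return []
    mid = (lo + hi) // 2  # a single vertex separates a path into halves
    return (nested_dissection_order(lo, mid - 1)
            + nested_dissection_order(mid + 1, hi)
            + [mid])

print(nested_dissection_order(0, 6))  # [0, 2, 1, 4, 6, 5, 3]
```

The separator vertex 3 is eliminated last, after the two independent halves, which is the divide-and-conquer structure the project carries over from numerical analysis to geometric boundary matrices.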


Another outcome of this project was the development of new geometric sampling theories and new efficient constructions of so-called sparse filtrations.  These provide the two layers of input to our topological algorithms: first the point sets, and then the connectivity structure built on top of those points in the form of edges, triangles, and higher-dimensional simplices.


The project had broader impacts beyond research, as several of the concepts developed in the proposal were integrated into an undergraduate course and into several undergraduate honors projects.  The students produced a high-quality codebase for generating geometric complexes that is available as open source.  Another undergraduate working on the project gained their first research experience and developed a software testbed for different topological algorithms.  A 3D visualization of one of our algorithms, produced as part of this project, was very effective in communicating these ideas to a wider audience of researchers from disparate areas.


Last Modified: 05/01/2018
Modified by: Donald Sheehy
