Award Abstract # 1815145
AF: Small: Approximation Algorithms for Learning Metric Spaces

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: UNIVERSITY OF ILLINOIS
Initial Amendment Date: May 24, 2018
Latest Amendment Date: May 24, 2018
Award Number: 1815145
Award Instrument: Standard Grant
Program Manager: Peter Brass
pbrass@nsf.gov
 (703)292-2182
CCF
 Division of Computing and Communication Foundations
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: June 1, 2018
End Date: May 31, 2023 (Estimated)
Total Intended Award Amount: $399,899.00
Total Awarded Amount to Date: $399,899.00
Funds Obligated to Date: FY 2018 = $399,899.00
History of Investigator:
  • Anastasios Sidiropoulos (Principal Investigator)
    Sidiropo@uic.edu
Recipient Sponsored Research Office: University of Illinois at Chicago
809 S MARSHFIELD AVE M/C 551
CHICAGO
IL  US  60612-4305
(312)996-2862
Sponsor Congressional District: 07
Primary Place of Performance: University of Illinois at Chicago
IL  US  60612-4305
Primary Place of Performance
Congressional District:
07
Unique Entity Identifier (UEI): W8XEAJDKMXH3
Parent UEI:
NSF Program(s): Algorithmic Foundations
Primary Program Source: 01001819DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7926, 7923, 7929
Program Element Code(s): 779600
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The analysis of large and complicated data sets is a task of increasing importance in several brunches of science and engineering. In many application scenarios, the data consists of a set of objects endowed with some information that quantifies similarity or dissimilarity of certain pairs. Examples of such inputs include statistical distributions, user preferences in social networks, DNA sequences, and so on. A natural way to interpret such data sets is as metric spaces. That is, each object is treated as a point, and the dissimilarity between two objects is encoded by their distance. The successful application of this data-analytic framework requires finding a distance function that faithfully encodes the ground truth. The project seeks to develop new algorithms for computing such faithful metrical representations of data sets. The underlying research problems lead to new connections between the areas of computational geometry, algorithm design, and machine learning. The project aims at transferring ideas between these scientific communities, and developing new methods for data analysis in practice. The project will be part of the investigator's curriculum development, teaching, and educational and outreach activities. The main research problems that the project seeks to study will form the basis for the training of both undergraduate and graduate students. The investigator is committed to working with minorities and underrepresented groups.


This metrical interpretation of data sets has been successful in practice and has lead to a plethora of algorithmic methods and ideas, such as clustering, dimensionality reduction, nearest-neighbor search, and so on. The area of metric learning is concerned mainly with methods for discovering an underlying metric space that agrees with a given set of observations. Specifically, the problem of learning the distance function is cast as an optimization problem, where the objective function quantifies the extend to which the solution satisfies the input constraints. A common thread in many works on metric learning is the use of methods and ideas from computational geometry and the theory of metric embeddings. Over the past few decades, these ideas have also been the subject of study within the theoretical computer science community. However, there are several well-studied problems in metric learning that have not received much attention in the algorithms and computational geometry communities. Moreover, there are many metric embedding tools and techniques, that where developed within the algorithms community, that have not yet been used in the context of metric learning. The project aims at bridging this gap by a systematic study of algorithmic metric learning problems from the point of view of computational geometry and approximation algorithms. The major goals of this effort are transferring algorithmic tools to the metric learning framework, as well as providing theoretical justification for metric learning methods that are successful in practice but currently lack provable guarantees.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Asudeh, Abolfazl and Berger-Wolf, Tanya and DasGupta, Bhaskar and Sidiropoulos, Anastasios "Maximizing Coverage While Ensuring Fairness: A Tale of Conflicting Objectives" Algorithmica , 2022 https://doi.org/10.1007/s00453-022-01072-1 Citation Details
Axelrod, Brian and Diakonikolas, Ilias and Stewart, Alistair and Sidiropoulos, Anastasios and Valiant, Gregory "A Polynomial Time Algorithm for Log-Concave Maximum Likelihood via Locally Exponential Families" Advances in Neural Information Processing Systems 32 (NIPS 2019) , 2019 Citation Details
Carpenter, Timothy and Fomin, Fedor V. and Lokshtanov, Daniel and Saurabh, Saket and Sidiropoulos, Anastasios "Algorithms for Low-Distortion Embeddings into Arbitrary 1-Dimensional Spaces" 34th International Symposium on Computational Geometry (SoCG 2018) , v.99 , 2018 978-3-95977-066-8 Citation Details
Chubarian, Karine and Khan, Abdul Rafae and Sidiropoulos, Anastasios and Xu, Jia "Grouping Words with Semantic Diversity" Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , 2021 https://doi.org/10.18653/v1/2021.naacl-main.257 Citation Details
Evan McCarty and Qi Zhao and Anastasios Sidiropoulos and Yusu Wang "NN-Baker: A Neural-network Infused Algorithmic Framework for Optimization Problems on Geometric Intersection Graphs" Advances in neural information processing systems , 2022 Citation Details
Ihara, Diego and Mohammadi, Neshat and Sidiropoulos, Anastasios "Algorithms for Metric Learning via Contrastive Embeddings" Leibniz international proceedings in informatics , v.129 , 2019 Citation Details
Kawarabayashi, Ken-Ichi and Sidiropoulos, Anastasios "Embeddings of Planar Quasimetrics into Directed 1 and Polylogarithmic Approximation for Directed Sparsest-Cut" 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS) , 2022 https://doi.org/10.1109/FOCS52979.2021.00055 Citation Details
Salmasi, Ario and Sidiropoulos, Anastasios and Sridhar, Vijay "On Constant Multi-Commodity Flow-Cut Gaps for Families of Directed Minor-Free Graphs" Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms , 2019 10.1137/1.9781611975482.34 Citation Details
Sidiropoulos, Anastasios and Badoiu, Mihai and Dhamdhere, Kedar and Gupta, Anupam and Indyk, Piotr and Rabinovich, Yuri and Racke, Harald and Ravi, R. "Approximation Algorithms for Low-Distortion Embeddings into Low-Dimensional Spaces" SIAM Journal on Discrete Mathematics , v.33 , 2019 10.1137/17M1113527 Citation Details

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

In today’s rapidly advancing world, the progress of science and engineering is increasingly dependent on our ability to efficiently process and comprehend large and complex data sets. At the heart of this challenge lies the power of geometry, which offers a unified language for making sense of such data. By transforming collections of interrelated objects—whether they be users in a social network, patient records in healthcare systems, or financial transactions—into points within a metric space, we can begin to unravel the intricate patterns and relationships hidden within.

This geometric approach is not just theoretical; it forms a cornerstone of modern machine learning and data science. The effectiveness of these methods hinges on how accurately and expressively we can map general data into geometric spaces. Enter the realm of metric learning, a field dedicated to developing methods that learn these geometric representations from examples. By doing so, it allows the representations to be tailored specifically to the problem domain, enhancing our ability to extract meaningful insights.

Our project has delved into the mathematical bedrock of computational problems in metric learning, unveiling new algorithms that learn geometric representations of data sets with nearly-optimal accuracy. These advancements pave the way for more refined methods across various domains that rely on geometric data representations, offering stronger guarantees of their performance and reliability.

Moreover, our work has catalyzed the development of novel deep learning methods that leverage the unique geometry of input data sets, enabling more nuanced and effective data analysis. Beyond this, we have pioneered new techniques for assessing the robustness of popular data analytic methods when applied to geometric data, ensuring that they can withstand the rigors of real-world application.

 


Last Modified: 02/11/2024
Modified by: Anastasios Sidiropoulos

Please report errors in award information by writing to: awardsearch@nsf.gov.

Print this page

Back to Top of page