
NSF Org: |
CCF Division of Computing and Communication Foundations |
Recipient: |
|
Initial Amendment Date: | May 24, 2018 |
Latest Amendment Date: | May 24, 2018 |
Award Number: | 1815145 |
Award Instrument: | Standard Grant |
Program Manager: |
Peter Brass
pbrass@nsf.gov (703)292-2182 CCF Division of Computing and Communication Foundations CSE Directorate for Computer and Information Science and Engineering |
Start Date: | June 1, 2018 |
End Date: | May 31, 2023 (Estimated) |
Total Intended Award Amount: | $399,899.00 |
Total Awarded Amount to Date: | $399,899.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
809 S MARSHFIELD AVE M/C 551 CHICAGO IL US 60612-4305 (312)996-2862 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
IL US 60612-4305 |
Primary Place of
Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Algorithmic Foundations |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
The analysis of large and complicated data sets is a task of increasing importance in several brunches of science and engineering. In many application scenarios, the data consists of a set of objects endowed with some information that quantifies similarity or dissimilarity of certain pairs. Examples of such inputs include statistical distributions, user preferences in social networks, DNA sequences, and so on. A natural way to interpret such data sets is as metric spaces. That is, each object is treated as a point, and the dissimilarity between two objects is encoded by their distance. The successful application of this data-analytic framework requires finding a distance function that faithfully encodes the ground truth. The project seeks to develop new algorithms for computing such faithful metrical representations of data sets. The underlying research problems lead to new connections between the areas of computational geometry, algorithm design, and machine learning. The project aims at transferring ideas between these scientific communities, and developing new methods for data analysis in practice. The project will be part of the investigator's curriculum development, teaching, and educational and outreach activities. The main research problems that the project seeks to study will form the basis for the training of both undergraduate and graduate students. The investigator is committed to working with minorities and underrepresented groups.
This metrical interpretation of data sets has been successful in practice and has lead to a plethora of algorithmic methods and ideas, such as clustering, dimensionality reduction, nearest-neighbor search, and so on. The area of metric learning is concerned mainly with methods for discovering an underlying metric space that agrees with a given set of observations. Specifically, the problem of learning the distance function is cast as an optimization problem, where the objective function quantifies the extend to which the solution satisfies the input constraints. A common thread in many works on metric learning is the use of methods and ideas from computational geometry and the theory of metric embeddings. Over the past few decades, these ideas have also been the subject of study within the theoretical computer science community. However, there are several well-studied problems in metric learning that have not received much attention in the algorithms and computational geometry communities. Moreover, there are many metric embedding tools and techniques, that where developed within the algorithms community, that have not yet been used in the context of metric learning. The project aims at bridging this gap by a systematic study of algorithmic metric learning problems from the point of view of computational geometry and approximation algorithms. The major goals of this effort are transferring algorithmic tools to the metric learning framework, as well as providing theoretical justification for metric learning methods that are successful in practice but currently lack provable guarantees.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
Note:
When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external
site maintained by the publisher. Some full text articles may not yet be available without a
charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from
this site.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
In today’s rapidly advancing world, the progress of science and engineering is increasingly dependent on our ability to efficiently process and comprehend large and complex data sets. At the heart of this challenge lies the power of geometry, which offers a unified language for making sense of such data. By transforming collections of interrelated objects—whether they be users in a social network, patient records in healthcare systems, or financial transactions—into points within a metric space, we can begin to unravel the intricate patterns and relationships hidden within.
This geometric approach is not just theoretical; it forms a cornerstone of modern machine learning and data science. The effectiveness of these methods hinges on how accurately and expressively we can map general data into geometric spaces. Enter the realm of metric learning, a field dedicated to developing methods that learn these geometric representations from examples. By doing so, it allows the representations to be tailored specifically to the problem domain, enhancing our ability to extract meaningful insights.
Our project has delved into the mathematical bedrock of computational problems in metric learning, unveiling new algorithms that learn geometric representations of data sets with nearly-optimal accuracy. These advancements pave the way for more refined methods across various domains that rely on geometric data representations, offering stronger guarantees of their performance and reliability.
Moreover, our work has catalyzed the development of novel deep learning methods that leverage the unique geometry of input data sets, enabling more nuanced and effective data analysis. Beyond this, we have pioneered new techniques for assessing the robustness of popular data analytic methods when applied to geometric data, ensuring that they can withstand the rigors of real-world application.
Last Modified: 02/11/2024
Modified by: Anastasios Sidiropoulos
Please report errors in award information by writing to: awardsearch@nsf.gov.