
NSF Org: IIS Division of Information & Intelligent Systems
Initial Amendment Date: September 3, 2015
Latest Amendment Date: July 15, 2019
Award Number: 1546500
Award Instrument: Standard Grant
Program Manager: Sylvia Spengler, sspengle@nsf.gov, (703) 292-7347, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2015
End Date: August 31, 2021 (Estimated)
Total Intended Award Amount: $394,518.00
Total Awarded Amount to Date: $402,518.00
Funds Obligated to Date: FY 2019 = $8,000.00
Recipient Sponsored Research Office: 6045 S KENWOOD AVE, CHICAGO, IL, US 60637-2803, (773) 834-0409
Primary Place of Performance: IL, US 60637-2902
NSF Program(s): Big Data Science & Engineering
Primary Program Source: 01001920DB NSF RESEARCH & RELATED ACTIVIT
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Unsupervised learning of useful features, or representations, is one of the most basic challenges of machine learning. Unsupervised representation learning techniques capitalize on unlabeled data, which is often cheap and abundant, and sometimes virtually unlimited. The goal of these ubiquitous techniques is to learn a representation that reveals intrinsic low-dimensional structure in data, disentangles underlying factors of variation by incorporating universal AI priors such as smoothness and sparsity, and is useful across multiple tasks and domains.
This project aims to develop new theory and methods for representation learning that can easily scale to large datasets. In particular, the project is concerned with methods for large-scale unsupervised feature learning, including Principal Component Analysis (PCA) and Partial Least Squares (PLS). To capitalize on massive amounts of unlabeled data, the project will develop appropriate computational approaches and study them in the "data-laden" regime. Therefore, instead of viewing representation learning as a dimensionality reduction technique and focusing on an empirical objective over finite data, these methods are studied with the goal of optimizing a population objective based on samples. This view suggests using Stochastic Approximation approaches, such as Stochastic Gradient Descent (SGD) and Stochastic Mirror Descent, that are incremental in nature and process each new sample with a computationally cheap update. Furthermore, this view enables a rigorous analysis of the benefits of stochastic approximation algorithms over traditional finite-data methods. The project aims to develop stochastic approximation approaches to PCA, PLS, and related problems and extensions, including deep and sparse variants, and to analyze these problems in the data-laden regime.
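To make the stochastic-approximation view concrete, here is a minimal sketch (our illustration, not code from the project) of Oja's algorithm, a classical SGD-style method for PCA that processes one sample per cheap update; the function name, step size, and synthetic data below are assumptions for the example.

```python
import numpy as np

def oja_pca(stream, k, dim, lr=0.01):
    """Estimate the top-k principal subspace from a stream of samples."""
    rng = np.random.default_rng(0)
    # Random orthonormal initialization of a dim x k basis.
    W, _ = np.linalg.qr(rng.standard_normal((dim, k)))
    for x in stream:
        # One cheap update per sample: x x^T is an unbiased one-sample
        # estimate of the covariance, so this is a stochastic gradient
        # step on the population subspace objective.
        W += lr * np.outer(x, x @ W)
        # Re-orthonormalize so the columns remain an orthonormal basis.
        W, _ = np.linalg.qr(W)
    return W

# Usage on synthetic low-rank-plus-noise data: W should approximately
# span the same 3-dimensional subspace as U.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((50, 3)))
stream = (U @ rng.standard_normal(3) + 0.1 * rng.standard_normal(50)
          for _ in range(20000))
W = oja_pca(stream, k=3, dim=50)
```

Note that each update touches a single sample, so the cost per step is independent of the dataset size; this is what lets such methods target the population objective directly in the data-laden regime.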
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The primary objective of the project was to develop modern, large-scale methods for finding relevant dimensions, directions, and subspaces in data.
The first part of the project concerned the development of stochastic and distributed methods for large-scale linear dimensionality reduction, both based on single-view data (as in PCA) and based on multi-view data (as in CCA and PLS). Finding the most important linear directions in the data, or the directions that are most conserved across multiple views, has been a basic building block in statistics, data analysis, and machine learning for decades. Multi-view methods in particular are important for leveraging the relationship between different modalities, such as audio and visual cues, or the relationship between multiple related tasks. We developed methods that allow doing so on a much larger scale than traditional linear-algebraic methods can reach. These methods employ recent developments in stochastic optimization, that is, working on only a sample of the data at each step instead of the entire dataset, which might be too large to handle in its entirety. The methods also allow using multiple computers in a distributed fashion, which is often necessary for handling massive datasets. Given the importance of multi-view dimensionality reduction techniques and the growth of dataset sizes over the past decade, these advances can have broad impact across scientific and engineering applications.
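As a hedged illustration of the multi-view setting (again our own sketch, not the project's released code), the following tracks the leading pair of PLS directions, i.e., the top singular pair of the cross-covariance E[x y^T], with one cheap update per paired sample; the function name and step size are assumptions, and the two views are assumed centered.

```python
import numpy as np

def stochastic_pls_top_pair(pairs, dx, dy, lr=0.01):
    """Track the leading left/right directions of the cross-covariance
    E[x y^T] from a stream of centered paired samples (x, y)."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(dx); u /= np.linalg.norm(u)
    v = rng.standard_normal(dy); v /= np.linalg.norm(v)
    for x, y in pairs:
        # Each pair contributes x y^T as an unbiased one-sample estimate
        # of the cross-covariance, so these are stochastic power-method
        # steps on that matrix.
        u_next = u + lr * x * (y @ v)
        v_next = v + lr * y * (x @ u)
        u = u_next / np.linalg.norm(u_next)
        v = v_next / np.linalg.norm(v_next)
    return u, v
```

Because each step reads one pair, the updates are easy to run over a data stream or to split across machines, in the spirit of the distributed variants described above.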
In the second part of the project, we went beyond linear models and investigated generalizations and extensions of dimensionality reduction, and more generally of the notion of "dimension" and of learning low-dimensional representations. We investigated theoretical notions of generalized dimension, deep learning approaches to learning low-dimensional representations, and other notions of finding relevant directions. In particular, we considered the important problem of generalizing from a few environments to other, very different environments (e.g., generalizing a pedestrian detection system trained on data from a few cities with certain cameras to other cities with different visual characteristics and to images captured by cameras with different optics). A possible approach to such generalization is to learn relevant dimensions in the data that are invariant across environments. Our research elucidates the form of invariants that can and cannot be captured using such methods.
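To give a flavor of what "invariant dimensions" could mean (a purely hypothetical toy criterion, not the method analyzed in the project), one could keep only the coordinates whose correlation with the label is nearly the same in every training environment:

```python
import numpy as np

def invariant_coordinates(envs, tol=0.05):
    """envs: list of (X, y) pairs, one per environment, X of shape (n, d).
    Return indices of coordinates whose label correlation varies by less
    than tol across all environments."""
    corrs = []
    for X, y in envs:
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        # Per-coordinate correlation with the label in this environment.
        c = (Xc * yc[:, None]).mean(axis=0) / (Xc.std(axis=0) * yc.std() + 1e-12)
        corrs.append(c)
    corrs = np.stack(corrs)                        # (num_envs, d)
    spread = corrs.max(axis=0) - corrs.min(axis=0)
    return np.where(spread < tol)[0]
```

A predictor restricted to such stable coordinates ignores environment-specific signals; the project's theoretical results concern which invariants criteria of this general kind can and cannot capture.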
Related to the main theme of the project, we also investigated several other directions that emerged during the course of the research: (1) learning predictors that are robust to malicious perturbations in the input; and (2) methods for obtaining valid statistical answers to arbitrarily adaptive statistical queries.
Beyond its direct research impact, the project provided training and mentorship to multiple graduate students and post-doctoral researchers, including a female post-doctoral researcher. The two post-doctoral researchers involved in the project are now working at top technology companies (Google and Microsoft), helping bring research ideas into practice.
Last Modified: 05/31/2022
Modified by: Nathan Srebro