Award Abstract # 1546500
BIGDATA: Collaborative Research: F: Stochastic Approximation for Subspace and Multiview Representation Learning

NSF Org: IIS (Division of Information & Intelligent Systems)
Recipient: TOYOTA TECHNOLOGICAL INSTITUTE AT CHICAGO
Initial Amendment Date: September 3, 2015
Latest Amendment Date: July 15, 2019
Award Number: 1546500
Award Instrument: Standard Grant
Program Manager: Sylvia Spengler
sspengle@nsf.gov
(703) 292-7347
IIS (Division of Information & Intelligent Systems)
CSE (Directorate for Computer and Information Science and Engineering)
Start Date: September 1, 2015
End Date: August 31, 2021 (Estimated)
Total Intended Award Amount: $394,518.00
Total Awarded Amount to Date: $402,518.00
Funds Obligated to Date: FY 2015 = $394,518.00
FY 2019 = $8,000.00
History of Investigator:
  • Nathan Srebro (Principal Investigator)
    nati@ttic.edu
Recipient Sponsored Research Office: Toyota Technological Institute at Chicago
6045 S KENWOOD AVE
CHICAGO
IL  US  60637-2803
(773)834-0409
Sponsor Congressional District: 01
Primary Place of Performance: Toyota Technological Institute at Chicago
IL  US  60637-2902
Primary Place of Performance Congressional District: 01
Unique Entity Identifier (UEI): ERBJF4DMW6G4
Parent UEI: ERBJF4DMW6G4
NSF Program(s): Big Data Science & Engineering
Primary Program Source: 01001516DB NSF RESEARCH & RELATED ACTIVIT
01001920DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7433, 8083, 9251
Program Element Code(s): 808300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Unsupervised learning of useful features, or representations, is one of the most basic challenges of machine learning. Unsupervised representation learning techniques capitalize on unlabeled data which is often cheap and abundant and sometimes virtually unlimited. The goal of these ubiquitous techniques is to learn a representation that reveals intrinsic low-dimensional structure in data, disentangles underlying factors of variation by incorporating universal AI priors such as smoothness and sparsity, and is useful across multiple tasks and domains.

This project aims to develop new theory and methods for representation learning that can easily scale to large datasets. In particular, this project is concerned with methods for large-scale unsupervised feature learning, including Principal Component Analysis (PCA) and Partial Least Squares (PLS). To capitalize on massive amounts of unlabeled data, this project will develop appropriate computational approaches and study them in the "data-laden" regime. Therefore, instead of viewing representation learning as a dimensionality reduction technique and focusing on an empirical objective over finite data, these methods are studied with the goal of optimizing a population objective based on samples. This view suggests using Stochastic Approximation approaches, such as Stochastic Gradient Descent (SGD) and Stochastic Mirror Descent, that are incremental in nature and process each new sample with a computationally cheap update. Furthermore, this view enables a rigorous analysis of the benefits of stochastic approximation algorithms over traditional finite-data methods. The project aims to develop stochastic approximation approaches to PCA, PLS, and related problems and extensions, including deep and sparse variants, and to analyze these problems in the data-laden regime.
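
To make the "computationally cheap update" idea concrete, here is a minimal sketch of one classical stochastic-approximation method for PCA, an Oja-style update that estimates the top principal component one sample at a time. It illustrates the general flavor of the approach rather than the specific algorithms developed under this award; the data stream, dimension, and 1/sqrt(t) step-size schedule are assumptions chosen for the demo.

import numpy as np

def oja_top_component(sample_stream, dim, lr=0.5):
    """Estimate the top principal component from a stream of samples.

    Each sample triggers one cheap rank-one update (Oja's rule),
    so no pass over the full dataset is ever needed.
    """
    rng = np.random.default_rng(0)
    w = rng.normal(size=dim)
    w /= np.linalg.norm(w)
    for t, x in enumerate(sample_stream, start=1):
        step = lr / np.sqrt(t)        # decaying step size (assumed schedule)
        w = w + step * x * (x @ w)    # stochastic gradient step on the Rayleigh quotient
        w /= np.linalg.norm(w)        # renormalize back onto the unit sphere
    return w

# Usage: 10,000 samples from a Gaussian whose top principal direction is e_1.
rng = np.random.default_rng(1)
cov = np.diag([5.0, 1.0, 0.5, 0.1])
stream = (rng.multivariate_normal(np.zeros(4), cov) for _ in range(10_000))
print(np.round(oja_top_component(stream, dim=4), 3))  # close to ±[1, 0, 0, 0]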

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 44)
Arora, Raman; Bartlett, Peter; Mianjy, Poorya; Srebro, Nathan. "Dropout: Explicit Forms and Capacity Control." ICML, 2021.
Arora, Raman; Marinov, Teodor Vanislavov; Mianjy, Poorya; Srebro, Nathan. "Stochastic Approximation for Canonical Correlation Analysis." Advances in Neural Information Processing Systems 30, 2017.
Woodworth, Blake E.; Feldman, Vitaly; Rosset, Saharon; Srebro, Nati. "The Everlasting Database: Statistical Validity at a Fair Price." Advances in Neural Information Processing Systems, 2018, p. 6531.
Woodworth, Blake; Srebro, Nathan. "Tight Complexity Bounds for Optimizing Composite Objectives." Neural Information Processing Systems (NIPS) 29, 2016. http://arxiv.org/abs/1605.08003
Gao, Chao; Garber, Dan; Srebro, Nathan; Wang, Jialei; Wang, Weiran. "Stochastic Canonical Correlation Analysis." Journal of Machine Learning Research, v.20, 2019, p. 1.
Garber, Dan; Shamir, Ohad; Srebro, Nathan. "Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis." Proceedings of the 34th International Conference on Machine Learning (ICML), v.70, 2017. https://arxiv.org/abs/1702.08169
Foster, Dylan; Sekhari, Ayush; Shamir, Ohad; Srebro, Nathan; Sridharan, Karthik; Woodworth, Blake. "The Complexity of Making the Gradient Small in Stochastic Convex Optimization." Conference on Learning Theory (COLT), 2019.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.


The primary objective of the project was developing modern, large-scale methods for finding relevant dimensions, directions, and subspaces in data.

The first part of the project concerned the development of stochastic and distributed methods for large-scale linear dimensionality reduction, both for single-view data (as in PCA) and for multi-view data (as in CCA and PLS).  Finding the most important linear directions in the data, or the directions that are most conserved across multiple views, has been a basic building block in statistics, data analysis, and machine learning for decades.  Multi-view methods in particular are important for leveraging the relationship between different modalities, such as audio and visual cues, or the relationship between multiple related tasks.  We developed methods that allow doing so at a much larger scale than traditional linear-algebraic methods can reach.  These methods employ recent developments in stochastic optimization, that is, working at each step on only a sample from the data rather than the entire dataset, which might be too large to handle in its entirety.  The methods also allow using multiple computers in a distributed fashion, which is often necessary for handling massive data sets.  Given the importance of multi-view dimensionality reduction techniques and the growth of data set sizes over the past decade, these advances can have broad impact across scientific and engineering applications.
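
For the multi-view case, a similar single-pass sketch can find the top PLS direction pair, the pair of directions (one per view) with maximal covariance, via a stochastic power-method update on the cross-covariance. Again, this is a generic textbook-style illustration rather than the project's own algorithms, and the synthetic paired views in the usage lines are assumed for the example.

import numpy as np

def stochastic_pls_top_pair(pair_stream, dx, dy, lr=0.5):
    """Track the top singular-vector pair of the cross-covariance E[x y^T]
    from a stream of paired samples: a stochastic power-method update
    that touches each (x, y) pair exactly once.
    """
    rng = np.random.default_rng(0)
    u = rng.normal(size=dx); u /= np.linalg.norm(u)
    v = rng.normal(size=dy); v /= np.linalg.norm(v)
    for t, (x, y) in enumerate(pair_stream, start=1):
        step = lr / np.sqrt(t)            # assumed decaying step size
        u_new = u + step * x * (y @ v)    # pull u toward x when the views co-vary
        v_new = v + step * y * (x @ u)
        u = u_new / np.linalg.norm(u_new)
        v = v_new / np.linalg.norm(v_new)
    return u, v

# Usage: two 3-d views driven by one shared latent factor z (synthetic demo data).
rng = np.random.default_rng(1)
a, b = np.array([1.0, 2.0, 0.0]), np.array([0.0, 1.0, -1.0])
def pairs(n):
    for _ in range(n):
        z = rng.normal()
        yield a * z + 0.1 * rng.normal(size=3), b * z + 0.1 * rng.normal(size=3)

u, v = stochastic_pls_top_pair(pairs(20_000), dx=3, dy=3)
print(np.round(u, 2), np.round(v, 2))  # align (up to sign) with a/||a|| and b/||b||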

In the second part of the project we went beyond linear models and investigated generalizations and extensions of dimensionality reduction, and more generally of the notion of “dimension” and of learning low-dimensional representations.   We investigated theoretical notions of generalized dimension, deep learning approaches to learning low-dimensional representations, and other notions of finding relevant directions.  In particular, we considered the important problem of generalizing from a few environments to other, very different environments (e.g., generalizing a pedestrian detection system trained on data from a few cities with certain cameras to other cities with different visual characteristics and to images captured by cameras with different optics).  A possible approach to such generalization is to learn relevant dimensions in the data that are invariant across environments.  Our research elucidates the form of invariants that can and cannot be captured using such methods.
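
As a toy illustration of that last idea, the sketch below screens candidate directions and keeps only those whose projected statistics agree across environments. It is a deliberately crude stand-in for the invariance methods discussed above, not the project's method; the function name, threshold, and synthetic environments are all illustrative assumptions.

import numpy as np

def stable_directions(env_datasets, candidates, tol=0.1):
    """Keep candidate unit directions whose projected mean and variance
    agree (within tol) across every environment: a crude proxy for
    'dimensions in the data that are invariant across environments'.
    """
    kept = []
    for w in candidates:
        # per-environment (mean, variance) of the 1-d projection X @ w
        stats = np.array([[(X @ w).mean(), (X @ w).var()] for X in env_datasets])
        if np.all(np.ptp(stats, axis=0) < tol):  # spread across environments
            kept.append(w)
    return kept

# Usage: two environments share an invariant first coordinate but differ
# in the scale of the second coordinate (synthetic demo data).
rng = np.random.default_rng(0)
env1 = rng.normal(size=(5000, 2)) * np.array([1.0, 1.0])
env2 = rng.normal(size=(5000, 2)) * np.array([1.0, 3.0])
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(len(stable_directions([env1, env2], [e1, e2])))  # 1: only e1 survives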

Related to the main theme of the project, we also investigated several other directions that emerged during the course of the research: (1) learning predictors that are robust to malicious perturbations in the input; and (2) methods for obtaining valid statistical answers to arbitrarily adaptive statistical queries.

Beyond its direct impact through research, multiple graduate students and post-doctoral researchers received training and mentorship through the project, including a female post-doctoral researcher.  The two post-doctoral researchers involved in the project are now working at top technology companies (Google and Microsoft), helping bring research ideas into practice.

Last Modified: 05/31/2022
Modified by: Nathan Srebro
