Award Abstract # 1808591
Collaborative Research: CDS&E: Theoretical Foundations and Algorithms for L1-Norm-Based Reliable Multi-Modal Data Analysis

NSF Org: OAC
Office of Advanced Cyberinfrastructure (OAC)
Recipient: REGENTS OF THE UNIVERSITY OF CALIFORNIA AT RIVERSIDE
Initial Amendment Date: August 27, 2018
Latest Amendment Date: August 27, 2018
Award Number: 1808591
Award Instrument: Standard Grant
Program Manager: Tevfik Kosar
OAC
 Office of Advanced Cyberinfrastructure (OAC)
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2018
End Date: August 31, 2021 (Estimated)
Total Intended Award Amount: $175,263.00
Total Awarded Amount to Date: $175,263.00
Funds Obligated to Date: FY 2018 = $175,263.00
History of Investigator:
  • Evangelos Papalexakis (Principal Investigator)
    epapalex@cs.ucr.edu
Recipient Sponsored Research Office: University of California-Riverside
200 UNIVERSITY OFC BUILDING
RIVERSIDE
CA  US  92521-0001
(951)827-5535
Sponsor Congressional District: 39
Primary Place of Performance: University of California-Riverside
CA  US  92521-0001
Primary Place of Performance Congressional District: 39
Unique Entity Identifier (UEI): MR5QC5FCAVH5
Parent UEI:
NSF Program(s): CDS&E-MSS, CDS&E
Primary Program Source: 01001819DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 026Z, 8084, 9263
Program Element Code(s): 806900, 808400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

In modern applications of science and engineering, large volumes of data are collected from diverse sensor modalities, commonly stored in the form of high-order arrays (tensors), and jointly analyzed to extract information about underlying phenomena. This joint tensor analysis can exploit inherent dependencies across data modalities and allow for markedly enhanced inference. Standard methods for tensor analysis rely on formulations that are sensitive to heavily corrupted points among the processed data (outliers). To counteract the destructive impact of outliers on modern data analysis, and on the applications that rely on it, this project will investigate new theory and robust algorithmic methods. The performance benefits of the developed tools will be evaluated in applications from the fields of data analytics, machine learning, and computer vision. Thus, this research aspires to significantly increase the reliability of data-enabled research across science and engineering. By combining theoretical explorations with practical algorithmic solutions for data analysis and experimental evaluations, this project has the potential to build significant future capacity not only for U.S. academic institutions but also for the U.S. government and industry. Apart from promoting the progress of science, this project could therefore contribute to advances in national prosperity and welfare. In addition, research activities under this project will be integrated with education. Participating students, at both graduate and undergraduate levels, will gain important experience in optimization theory, machine learning, computer vision, and data mining, among other areas. Moreover, the project plan includes multiple STEM outreach activities and supports diversity in STEM by involving students from underrepresented groups.

In this project, the theoretical underpinnings of L1-norm tensor analysis will be investigated, with a focus on its computational hardness and exact solution. Then, based on these new foundations, efficient and practical algorithms for L1-norm tensor analysis will be explored, together with scalable and distributed software implementations. These theoretical and algorithmic investigations are expected to significantly advance knowledge in the currently under-explored area of L1-norm tensor analysis and deliver highly impactful methodologies for outlier-resistant multimodal data processing. Next, the PIs will employ the newly developed algorithmic tools in key problems from the fields of data analytics, machine learning, and computer vision.
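For readers unfamiliar with the term, the contrast between standard and L1-norm tensor analysis can be sketched with a Tucker-type formulation. This is one formulation studied in the L1-norm literature, shown here as an illustrative assumption; the abstract does not state which formulations the project adopts. The symbols (an N-way data tensor X of size D_1 x ... x D_N, orthonormal bases U_n with d_n columns, and the mode-n product) are introduced only for this illustration.

```latex
% Standard (L2-norm) Tucker-type objective: maximize the energy of the
% compressed core, measured by the squared Frobenius norm.
\max_{\{U_n \in \mathbb{R}^{D_n \times d_n},\; U_n^\top U_n = I_{d_n}\}_{n=1}^{N}}
  \left\| \mathcal{X} \times_1 U_1^\top \times_2 U_2^\top \cdots \times_N U_N^\top \right\|_F^2

% L1-norm counterpart: the squared Frobenius norm is replaced by the
% sum of absolute values of the core entries.
\max_{\{U_n \in \mathbb{R}^{D_n \times d_n},\; U_n^\top U_n = I_{d_n}\}_{n=1}^{N}}
  \left\| \mathcal{X} \times_1 U_1^\top \times_2 U_2^\top \cdots \times_N U_N^\top \right\|_1
```

Because the Frobenius norm squares every entry, a single grossly corrupted measurement can dominate the first objective, whereas in the second its contribution grows only linearly with its magnitude.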

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 16)
Abdali, Sara and Gurav, Rutuja and Menon, Siddharth and Fonseca, Daniel and Entezari, Negin and Shah, Neil and Papalexakis, Evangelos E. "Identifying Misinformation from Website Screenshots" International AAAI Conference on Web and Social Media (ICWSM) 2021, 2021
Abdali, Sara and Vasilescu, M. Alex and Papalexakis, Evangelos E. "Deepfake Representation with Multilinear Regression" MIS2-KDD 2021: The Second International MIS2 Workshop: Misinformation and Misbehavior Mining on the Web-2021, 2021
Chachlakis, Dimitris G. and Tsitsikas, Yorgos and Papalexakis, Evangelos E. and Markopoulos, Panos P. "Robust Multi-Relational Learning With Absolute Projection Rescal" IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2019 https://doi.org/10.1109/GlobalSIP45357.2019.8969097
Entezari, Negin and Al-Sayouri, Saba A. and Darvishzadeh, Amirali and Papalexakis, Evangelos E. "All You Need Is Low (Rank): Defending Against Adversarial Attacks on Graphs" Proceedings of the 13th International Conference on Web Search and Data Mining, 2020 https://doi.org/10.1145/3336191.3371789
Entezari, Negin and Papalexakis, Evangelos E. and Wang, Haixun and Rao, Sharath and Prasad, Shishir Kumar "Tensor-based Complementary Product Recommendation" 2021 IEEE International Conference on Big Data (IEEE BigData 2021), 2021
Gujral, Ekta and Pasricha, Ravdeep and Yang, Tianxiong and Papalexakis, Evangelos E. "OCTEN: Online Compression-Based Tensor Decomposition" 2019 IEEE 8th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2019 https://doi.org/10.1109/CAMSAP45676.2019.9022641
Gurav, Rutuja and Barish, Barry and Papalexakis, Evangelos E. "Multilinear Factorized Representations for LIGO Glitches in Label-scarce Settings" ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2019 Workshop: Fragile Earth: Theory Guided Data Science to Enhance Scientific Discovery, 2019
Gurav, Rutuja and Barish, Barry and Vajente, Gabriele and Papalexakis, Evangelos E. "Unsupervised matrix and tensor factorization for LIGO glitch identification using auxiliary channels" AAAI 2020 Fall Symposium on Physics-Guided AI to Accelerate Scientific Discovery, 2020
Izbicki, Mike and Papalexakis, Vagelis and Tsotras, Vassilis "Geolocating Tweets in any Language at any Location" Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019 https://doi.org/10.1145/3357384.3357926
Izbicki, M. and Papalexakis, E. E. "Exploiting the Earth's Spherical Geometry to Geolocate Images" Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019, 2020 https://doi.org/10.1007/978-3-030-46147-8_1
Pasricha, Ravdeep S. and Devineni, Pravallika and Papalexakis, Evangelos E. and Kannan, Ramakrishnan "Tensorized Feature Spaces for Feature Explosion" International Conference on Pattern Recognition (ICPR) 2020, 2021 https://doi.org/10.1109/ICPR48806.2021.9412320

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Tensor methods are powerful data-analytic tools from multilinear algebra that are widely used in machine learning and data science. Tensors can represent heterogeneous, multi-modal datasets that come from a wide variety of real-world applications, including social network analysis, computer vision, recommendation systems, and scientific applications. The main focus of the project was to develop and apply robust tensor methods for settings where the data being analyzed are corrupted, noisy, or incomplete, which is very frequently the case in real-world scenarios. The research outcomes of this project fall into two thrusts: (1) Fundamental algorithm development for robust tensor methods, and (2) Real-world applications.

The project advanced the state of the art in robust tensor methods while also innovating in key high-impact applications, where it demonstrated the power and generality of tensor methods in obtaining state-of-the-art results. As part of this project, graduate students were trained in the PI’s lab, and the results of the research were broadly disseminated at top scientific venues. Furthermore, as part of fostering a vibrant research community, the PI was actively involved in the organization of symposia and workshops in the area of tensor methods for data science and machine learning.

Highlights of the research outcomes are outlined below.

Fundamental algorithm development for robust tensor methods:

The first main thrust of the project was the design and development of fundamental algorithms for tensor analysis, with specific emphasis on robustness. Within that framework, the following two threads were explored:

(a) Robust tensor decomposition and compression: Within this thread, the project developed novel formulations for computing tensor decompositions for data analysis and compression in a manner that is less sensitive to outliers and noise, thus offering higher quality in terms of data representation, reconstruction, and compression. Furthermore, a compression-based streaming tensor decomposition was developed.
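
As a toy illustration of what "less sensitive to outliers" can mean, the sketch below contrasts a least-squares (L2) principal direction with an L1-norm one in the simplest matrix case (2-D points), not in the tensor setting of the project. It assumes a well-known fixed-point iteration for rank-1 L1-norm PCA; the function names, the synthetic data, and the starting vector are hypothetical choices made for this example and are not taken from the project's software.

```python
import math


def l2_direction(points):
    """Dominant least-squares direction of 2-D points (no re-centering),
    from the closed-form principal-axis angle of the 2x2 scatter matrix."""
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


def l1_direction(points, start=(1.0, 1.0), max_iter=100):
    """Fixed-point iteration q <- normalize(sum_i sign(x_i . q) * x_i).
    It converges to a local maximizer of sum_i |x_i . q|.
    Assumes the signed sum never vanishes (true for the data below)."""
    norm = math.hypot(*start)
    q = (start[0] / norm, start[1] / norm)
    for _ in range(max_iter):
        sx = sy = 0.0
        for x, y in points:
            s = 1.0 if x * q[0] + y * q[1] >= 0 else -1.0
            sx += s * x
            sy += s * y
        norm = math.hypot(sx, sy)
        new_q = (sx / norm, sy / norm)
        if new_q == q:  # signs stopped changing: fixed point reached
            break
        q = new_q
    return q


def angle_from_x_axis(q):
    """Angle in degrees between direction q (sign ignored) and the x-axis."""
    return math.degrees(math.acos(min(1.0, abs(q[0]))))


# Hypothetical data: eleven clean points on the x-axis plus one gross outlier.
inliers = [(float(t), 0.0) for t in range(-5, 6)]
outlier = [(0.0, 20.0)]
data = inliers + outlier

angle_l2 = angle_from_x_axis(l2_direction(data))
angle_l1 = angle_from_x_axis(l1_direction(data))
print(f"L2 direction is {angle_l2:.1f} degrees off the inlier axis")
print(f"L1 direction is {angle_l1:.1f} degrees off the inlier axis")
```

On this synthetic data the single outlier rotates the L2 direction a full 90.0 degrees away from the line on which all clean points lie, while the L1 direction is tilted by about 33.7 degrees. The L1 estimate is still affected by the outlier, only less so, which is the sense of "less sensitive" used above.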

(b) Tensor model order selection and its interplay with compression: Model order selection in tensor analysis refers to selecting the number of hidden patterns in the data. It is of utmost importance, since it dictates the number of insights that can be extracted from the data, and it is an extremely hard problem. Different methods for model order selection were developed, with specific emphasis on operating on compressed data, which allows for scalable computations.

Real-world Applications:

The second main thrust of the work was the adaptation and application of tensor methods to high-impact real-world problems, including social media and network analysis, recommendation systems, computer vision, and scientific data analysis.

(a) Social media and network analysis:

Under the umbrella of social media and network analysis, the project focused on a number of high-impact and timely problems: misinformation detection (studying different aspects of it, including analyzing article content, analyzing visual cues from website screenshots, and detecting DeepFake images); geolocation of social media posts from text and images; humor recognition, especially in cases where limited human annotations are available; and the development of methods for shielding graph neural networks against adversarial attacks, where corrupted network interactions are meant to mislead the classification of certain entities in the network.

(b) Recommendation systems:

One of the main tensor models studied in the first thrust (RESCAL) was successfully applied to the challenging problem of complementary item recommendation in online grocery shopping.

(c) Computer vision:

As part of shielding neural networks that specialize in image classification, the project focused on tensor compression for mitigating the effect of adversarially corrupted images, whose perturbations are meant to fool the classifier while remaining imperceptible to humans. Furthermore, the project developed tensor-based methods for computing representations of hyperspectral images which rival neural network methods in pixel classification.

(d) Scientific data analysis:

Scientific data are inherently multi-dimensional and multi-modal. In this project, specific emphasis was placed on the analysis of data from the Laser Interferometer Gravitational-Wave Observatory (LIGO) with tensor methods, toward the detection of so-called “glitches”: noise transients that appear similar to meaningful patterns in the data, thus corrupting the scientific data and hindering the accurate detection of gravitational waves.

Last Modified: 12/22/2021
Modified by: Evangelos Papalexakis

Please report errors in award information by writing to: awardsearch@nsf.gov.
