Award Abstract # 1117965
III: Small: Collaborative Research: A Large-Scale Data Mining Framework for Genome-Wide Mapping of Multi-Modal Phenotypic Biomarkers and Outcome Prediction

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: UNIVERSITY OF TEXAS AT ARLINGTON
Initial Amendment Date: July 6, 2011
Latest Amendment Date: April 5, 2012
Award Number: 1117965
Award Instrument: Standard Grant
Program Manager: Sylvia Spengler
sspengle@nsf.gov
 (703)292-7347
IIS
 Division of Information & Intelligent Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: August 1, 2011
End Date: July 31, 2016 (Estimated)
Total Intended Award Amount: $299,904.00
Total Awarded Amount to Date: $315,904.00
Funds Obligated to Date: FY 2011 = $299,904.00
FY 2012 = $16,000.00
History of Investigator:
  • Heng Huang (Principal Investigator)
    heng@umd.edu
  • Fillia Makedon (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Texas at Arlington
701 S NEDDERMAN DR
ARLINGTON
TX  US  76019-9800
(817)272-2105
Sponsor Congressional District: 25
Primary Place of Performance: University of Texas at Arlington
701 S NEDDERMAN DR
ARLINGTON
TX  US  76019-9800
Primary Place of Performance Congressional District: 25
Unique Entity Identifier (UEI): LMLUKUPJJ9N3
Parent UEI:
NSF Program(s): Info Integration & Informatics
Primary Program Source: 01001112DB NSF RESEARCH & RELATED ACTIVIT
01001213DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7923, 9251
Program Element Code(s): 736400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Today's massive generation of digital data is greatly outpacing the development of computational methods and tools and presents critical challenges for achieving the full transformative potential of these data. For example, recent advances in acquiring multi-modal brain imaging and genome-wide array data provide exciting new opportunities to study the influence of genetic variation on brain structure and function. Major computational challenges are, however, bottlenecks for comprehensive joint analysis of these data due to their unprecedented scale and complexity. This project will employ the new capabilities of large-scale data mining techniques in multi-view learning, multi-task learning, and robust classification to address critical challenges in systematically analyzing massive multi-modal genetic, imaging, and other biomarker data. Specifically, this project will: (1) develop new multi-view learning methods to detect task-relevant phenotypic biomarkers from large scale heterogeneous imaging and other biomarker data, (2) implement new sparse multi-task regression models to reveal the genetic basis of phenotypic biomarkers at multiple levels (e.g., SNP, haplotype, gene and/or pathway), (3) design novel robust classification methods via structural sparsity for outcome prediction using integrated genotypic and phenotypic data, and (4) package these new methods into a data mining toolkit and release it to the public.
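As an illustration of the structural sparsity mentioned in aims (2) and (3): one common way to make a multi-task regression select the same genetic markers across all phenotypes is an L2,1-norm penalty on the coefficient matrix, whose proximal operator shrinks each row as a group and sets weak rows exactly to zero. The abstract does not state which formulation the project uses, so the sketch below is an assumption for illustration only; the function names, the toy matrix, and the threshold value are hypothetical.

```python
import math


def row_l2_norm(row):
    """Euclidean norm of one row of the coefficient matrix."""
    return math.sqrt(sum(v * v for v in row))


def prox_l21(W, threshold):
    """Row-wise group soft-thresholding.

    This is the proximal operator of threshold * sum_i ||W[i]||_2
    (the L2,1 norm). Rows whose norm is at most `threshold` become
    all-zero; the remaining rows are shrunk toward zero.
    """
    out = []
    for row in W:
        norm = row_l2_norm(row)
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        out.append([scale * v for v in row])
    return out


# Hypothetical coefficients: 3 SNPs (rows) x 2 imaging phenotypes (columns).
W = [
    [3.0, 4.0],  # strong association with both phenotypes (row norm 5)
    [0.3, 0.4],  # weak association (row norm 0.5)
    [0.0, 0.0],  # no association
]

W_sparse = prox_l21(W, threshold=1.0)

# Indices of SNPs that keep a nonzero row, i.e. are selected for all tasks.
selected = [i for i, row in enumerate(W_sparse) if row_l2_norm(row) > 0]
print(selected)
```

In a full solver this step would alternate with a gradient step on the regression loss; only the shrinkage step is shown here because it is the part that produces sparsity shared across tasks.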

The intellectual merits of this project derive not only from the development of novel data mining methods, but also from their application to imaging genetic studies. These methods are designed to take into account interrelated structures among multiple data modalities and offer systematic strategies to reveal structural imaging genetic associations. The proposed methods and tools are expected to impact neurological and psychological research and enable investigators to effectively test imaging genetics hypotheses and advance biomedical science and technology. In addition, the proposed data mining framework addresses generic critical needs of large-scale data analysis and integration and, therefore, will impact a large number of research areas where high-value knowledge and complex patterns can potentially be discovered from massive high-dimensional and heterogeneous data sets. This project will facilitate the development of novel educational tools to enhance several current courses at UT Arlington and IUPUI. Both universities are minority-serving institutions, and the PIs will engage minority students and under-served populations in research activities to give them better exposure to cutting-edge scientific research.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 35)
Deguang Kong, Chris Ding, Heng Huang, Haifeng Zhao "Multi-Label ReliefF and F-Statistic Feature Selections for Image Annotation" The 25th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012), 2012, p.2353-2359
De Wang, Feiping Nie, Heng Huang "Fast Robust Non-negative Matrix Factorization for Large-Scale Data Clustering" 25th International Joint Conference on Artificial Intelligence (IJCAI 2016), 2016
Feiping Nie, Heng Huang "Subspace Clustering via New Discrete Group Structure Constrained Low-Rank Model" 25th International Joint Conference on Artificial Intelligence (IJCAI 2016), 2016
Feiping Nie, Hua Wang, Cheng Deng, Xinbo Gao, Xuelong Li, Heng Huang "New L1-Norm Relaxations and Optimizations for Graph Clustering" Thirtieth AAAI Conference on Artificial Intelligence (AAAI 2016), 2016
Feiping Nie, Xiaoqian Wang, Michael I. Jordan, Heng Huang "The Constrained Laplacian Rank Algorithm for Graph-Based Clustering" Thirtieth AAAI Conference on Artificial Intelligence (AAAI 2016), 2016
Hua Wang, Cheng Deng, Hao Zhang, Xinbo Gao, Heng Huang "Learning Biological Relevance of Drosophila Embryos for Drosophila Gene Expression Pattern Annotations" Thirtieth AAAI Conference on Artificial Intelligence (AAAI 2016), 2016
Hua Wang, Feiping Nie, Heng Huang "Large-Scale Cross-Language Web Page Classification via Dual Knowledge Transfer Using Fast Nonnegative Matrix Tri-Factorization" ACM Transactions on Knowledge Discovery from Data (TKDD), v.10, 2015, p.1
Hua Wang, Feiping Nie, Heng Huang "Robust and Discriminative Distance for Multi-Instance Learning" The 25th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012), 2012, p.2919-2924
Hua Wang, Heng Huang, Chris Ding "Correlated Protein Function Prediction via Maximization of Data Knowledge Consistency" Journal of Computational Biology (JCB), v.22, 2015, p.546
Hua Wang, Heng Huang, Chris Ding "Function-Function Correlated Multi-Label Protein Function Prediction over Interaction Networks" Journal of Computational Biology, v.n/a, 2013, p.n/a
Hua Wang, Heng Huang, Chris Ding, Feiping Nie "Predicting Protein-Protein Interactions from Multimodal Biological Data Sources via Nonnegative Matrix Factorization" Journal of Computational Biology, v.n/a, 2013, p.n/a

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project produced several important outcomes.

1. We developed multiple sparse multi-view learning algorithms for identifying imaging and fluid biomarkers related to cognitive and diagnostic outcomes.

2. We developed several sparse multi-task regression and correlation algorithms for identifying genetic variants related to imaging and other phenotypic outcomes.

3. We developed a sparse multimodal multitask learning algorithm for outcome prediction via integrating imaging and genetics data.

4. We released a sparse learning software tool.

We published over 20 full-length papers related to this project in peer-reviewed conference proceedings and journals.

This project supported three Ph.D. students (one of them female) at the University of Texas at Arlington. Two have graduated, and one of them has become a tenure-track assistant professor at the Colorado School of Mines. The third (female) is currently a fourth-year Ph.D. student in the Computer Science and Engineering department and expects to graduate next year, after which she plans to seek an academic position.

This project also supported two male undergraduate REU students.

The research materials produced in this project are used in teaching several graduate courses at the University of Texas at Arlington.

We (both UTA and IU sites) co-organized several workshops and one special session in related fields: (1) two MICCAI Workshops on Multimodal Brain Image Analysis (MBIA 2012 and MBIA 2013), (2) one MICCAI Workshop on Imaging Genetics (MICGen 2015), and (3) one Special Session on Neuroimaging Data Analysis and Applications at the International Conference on Brain Informatics & Health (BIH 2015).


Last Modified: 11/30/2016
Modified by: Heng Huang

Please report errors in award information by writing to: awardsearch@nsf.gov.