Award Abstract # 1942394
CAREER: Computational strategies for incompleteness and heterogeneity in multi-omic data

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: TRUSTEES OF INDIANA UNIVERSITY
Initial Amendment Date: March 17, 2020
Latest Amendment Date: May 9, 2023
Award Number: 1942394
Award Instrument: Continuing Grant
Program Manager: Sylvia Spengler
sspengle@nsf.gov
(703)292-7347
IIS Division of Information & Intelligent Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: June 1, 2020
End Date: May 31, 2026 (Estimated)
Total Intended Award Amount: $549,909.00
Total Awarded Amount to Date: $549,909.00
Funds Obligated to Date: FY 2020 = $107,089.00
FY 2021 = $108,492.00
FY 2022 = $109,938.00
FY 2023 = $224,390.00
History of Investigator:
  • Jingwen Yan (Principal Investigator)
    jingyan@iupui.edu
Recipient Sponsored Research Office: Indiana University
107 S INDIANA AVE
BLOOMINGTON
IN  US  47405-7000
(317)278-3473
Sponsor Congressional District: 09
Primary Place of Performance: Indiana University
545 W Michigan St
Indianapolis
IN  US  46202-3103
Primary Place of Performance Congressional District: 07
Unique Entity Identifier (UEI): YH86RTW2YVJ4
Parent UEI:
NSF Program(s): Info Integration & Informatics
Primary Program Source: 01002324DB NSF RESEARCH & RELATED ACTIVIT
01002021DB NSF RESEARCH & RELATED ACTIVIT
01002122DB NSF RESEARCH & RELATED ACTIVIT
01002223DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 7364
Program Element Code(s): 736400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Multi-omics refers to the integrative analysis of multiple types of -omics data (e.g., genotype, gene expression, and protein expression). The growing volume of multi-omic data provides opportunities to discover disease biomarkers across multiple molecular scales and can therefore further our understanding of underlying disease mechanisms. Despite this potential, existing multi-omic data collections are mostly incomplete and of heterogeneous types (e.g., continuous and categorical values). Integrating these data for joint analysis typically requires excluding the many subjects with missing values; as a consequence, a large portion of the data remains unused. This project provides novel perspectives on handling the incompleteness and heterogeneity problems in multi-omic data and thereby allows biomedical researchers to gain more insight from rapidly growing yet imperfect biomedical data. In addition, the growth of multi-omic data has led to a massive transformation in biomedical research and has created an unprecedented need for information management, decision support, and advanced analytics. In this project, a series of educational activities will be conducted to engage students at early stages of their education and to increase their awareness of educational opportunities and career paths in biomedical informatics.

This project aims to develop new classes of computational methods that enable the joint mining of incomplete and heterogeneous multi-omic data by leveraging various biological networks to discover functionally connected biomarkers. To this end, two tasks will be performed: 1) identify multi-omic subnetworks as biomarkers via a multi-task joint network module detection and feature selection model, and 2) select associated features between heterogeneous -omics layers via a novel multi-task sparse association model. The first task addresses the incomplete data problem: the new model not only handles incomplete data collected from one large-scale project, but also allows the joint analysis of -omics data from multiple small-scale projects whose subjects do not overlap. The second task addresses the heterogeneity problem with a novel two-step strategy for associating different -omics layers. Building upon these research efforts, three educational outreach activities will be conducted: 1) develop a project-based curriculum for high school students, 2) host an annual summer workshop on multi-omics for high school students, and 3) provide advanced research opportunities to undergraduates from biomedical informatics and related disciplines. This research effort will lead to the discovery of more reliable biomarkers for further validation and to a better understanding of their relationships with disease traits than is currently possible.
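
The abstract does not specify how the multi-task sparse models are formulated. As a rough illustration only, one common way to select features jointly across several related tasks is a row-wise group (L2,1) penalty, which keeps or drops each feature for all tasks at once. The sketch below assumes that generic formulation, solved by proximal gradient descent on synthetic data; the function names, the penalty, and all parameter values are hypothetical and are not taken from the project.

```python
import numpy as np


def row_soft_threshold(W, tau):
    """Proximal operator of tau * sum_j ||W[j, :]||_2 (shrinks whole rows)."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return W * scale


def multitask_sparse_fit(X, Y, lam=0.2, n_iter=500):
    """Minimize (1/2n)||XW - Y||_F^2 + lam * sum_j ||W[j, :]||_2.

    Hypothetical helper: a generic L2,1-regularized regression,
    not the model developed under this award.
    """
    n, p = X.shape
    W = np.zeros((p, Y.shape[1]))
    # Step size = 1 / Lipschitz constant of the smooth term's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y) / n
        W = row_soft_threshold(W - step * grad, step * lam)
    return W


def demo(seed=0):
    """Synthetic example: 3 tasks share the same 5 informative features."""
    rng = np.random.default_rng(seed)
    n, p, k = 200, 30, 3
    X = rng.standard_normal((n, p))
    W_true = np.zeros((p, k))
    W_true[:5] = rng.uniform(1.0, 2.0, size=(5, k))
    Y = X @ W_true + 0.1 * rng.standard_normal((n, k))
    W = multitask_sparse_fit(X, Y)
    # A feature counts as selected if its coefficient row is nonzero.
    return np.flatnonzero(np.linalg.norm(W, axis=1) > 1e-8)


if __name__ == "__main__":
    print("selected feature indices:", demo())
```

Because the penalty acts on whole rows of the coefficient matrix, a feature is either retained for every task or removed from all of them, which is the sense in which selection is shared across tasks in this sketch.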

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

He, Bing and Gorijala, Priyanka and Xie, Linhui and Cao, Sha and Yan, Jingwen "Gene co-expression changes underlying the functional connectomic alterations in Alzheimer's disease" BMC Medical Genomics, v.15, 2022 https://doi.org/10.1186/s12920-022-01244-6
He, Bing and Wu, Ruiming and Sangani, Neel and Pugalenthi, Pradeep Varathan and Patania, Alice and Risacher, Shannon L. and Nho, Kwangsik and Apostolova, Liana G. and Shen, Li and Saykin, Andrew J. and Yan, Jingwen "Integrating amyloid imaging and genetics for early risk stratification of Alzheimer's disease" Alzheimer's & Dementia, v.20, 2024 https://doi.org/10.1002/alz.14244
He, Bing and Xie, Linhui and Varathan, Pradeep and Nho, Kwangsik and Risacher, Shannon L. and Saykin, Andrew J. and Yan, Jingwen "Fused multi-modal similarity network as prior in guiding brain imaging genetic association" Frontiers in Big Data, v.6, 2023 https://doi.org/10.3389/fdata.2023.1151893
Varathan, Pradeep and Gorijala, Priyanka and Jacobson, Tanner and Chasioti, Danai and Nho, Kwangsik and Risacher, Shannon L. and Saykin, Andrew J. and Yan, Jingwen "Integrative analysis of eQTL and GWAS summary statistics reveals transcriptomic alteration in Alzheimer brains" BMC Medical Genomics, v.15, 2022 https://doi.org/10.1186/s12920-022-01245-5
Xie, Linhui and He, Bing and Varathan, Pradeep and Nho, Kwangsik and Risacher, Shannon L. and Saykin, Andrew J. and Salama, Paul and Yan, Jingwen "Integrative-omics for discovery of network-level disease biomarkers: a case study in Alzheimer's disease" Briefings in Bioinformatics, v.22, 2021 https://doi.org/10.1093/bib/bbab121
Xie, Linhui and Raj, Yash and Varathan, Pradeep and He, Bing and Yu, Meichen and Nho, Kwangsik and Salama, Paul and Saykin, Andrew J. and Yan, Jingwen "Deep Trans-Omic Network Fusion for Molecular Mechanism of Alzheimer's Disease" Journal of Alzheimer's Disease, v.99, 2024 https://doi.org/10.3233/JAD-240098
