Award Abstract # 1661375
Collaborative Research: ABI Innovation: A Scalable Framework for Visual Exploration and Hypotheses Extraction of Phenomics Data using Topological Analytics

NSF Org: DBI
Division of Biological Infrastructure
Recipient: UNIVERSITY OF UTAH
Initial Amendment Date: July 6, 2017
Latest Amendment Date: July 6, 2017
Award Number: 1661375
Award Instrument: Standard Grant
Program Manager: Peter McCartney
DBI Division of Biological Infrastructure
BIO Directorate for Biological Sciences
Start Date: August 1, 2017
End Date: July 31, 2021 (Estimated)
Total Intended Award Amount: $288,131.00
Total Awarded Amount to Date: $288,131.00
Funds Obligated to Date: FY 2017 = $288,131.00
History of Investigator:
  • Bei Phillips (Principal Investigator)
    beiwang@sci.utah.edu
Recipient Sponsored Research Office: University of Utah
201 PRESIDENTS CIR
SALT LAKE CITY
UT  US  84112-9049
(801)581-6903
Sponsor Congressional District: 01
Primary Place of Performance: University of Utah
75 South 2000 East
Salt Lake City
UT  US  84112-8930
Primary Place of Performance Congressional District: 01
Unique Entity Identifier (UEI): LL8GLEVH6MG3
Parent UEI:
NSF Program(s): ADVANCES IN BIO INFORMATICS
Primary Program Source: 01001718DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1329
Program Element Code(s): 116500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.074

ABSTRACT

Understanding how gene-by-environment interactions result in specific phenotypes is a core goal of modern biology and has real-world impacts on areas such as crop management. Developing and managing successful crop practices is a goal that is fundamentally tied to our national food security. By applying novel computational visual analytical methods, this project seeks to identify and unravel the complex web of interactions linking genotypes, environments and phenotypes. These methods will first need to be designed and developed into usable software applications that can handle large volumes of crop phenomics data. High-throughput sensing technologies collect large volumes of field data for many plant traits, such as flowering time, related to crop development and production. The maize cultivars used here come from multiple genotypes that have been grown under a variety of environmental conditions, to give the widest range of conditions for understanding the interactions. The resulting data sets are growing quickly, both in size and complexity, but the analytical tools needed to extract knowledge and catalyze scientific discoveries have significantly lagged behind. The methodologies to be developed in this project represent a systematic attempt at bridging this rapidly widening divide. The project is inherently interdisciplinary, involving close research partnerships among computer scientists, plant scientists, and mathematicians.
The research outcomes will be tightly integrated with education using a multipronged approach that includes, among others, postdoctoral and student training (graduates and undergraduates), curriculum development for a new campus-wide interdisciplinary undergraduate degree in Data Analytics, conference tutorials for training phenomics data practitioners, and contribution to the recruitment and retention of underrepresented minorities (particularly women) in STEM fields through the Pacific Northwest Louis Stokes Alliance for Minority Participation.


This project will lead to the design and development of a new, scalable, visual analytics platform suitable for hypothesis extraction and refinement from complex phenomics data sets. A focus on hypothesis extraction is critical in the context of phenomics data sets because much of the high-throughput sensing data collected in crop fields is generated in the absence of specifically formulated hypotheses. Extracting plausible hypotheses from the data represents an important but tedious task. To this end, this project will apply and develop new capabilities using emerging advanced algorithmic principles, particularly from the branch of mathematics called algebraic topology, which studies the shape and structure of complex data. The research objectives are three-fold. First, the project will employ and extend emerging algorithmic techniques from algebraic topology to decode the structure of large, complex phenomics data. Second, an interactive visual analytic platform will be developed to facilitate knowledge discovery using the extracted topological structures. Lastly, the quality and validity of the new visual analytic platform will be tested using real-world maize data sets as well as simulated inputs as testbeds. The developed framework will encode functions for scientists to delineate hypotheses of three kinds: i) genetic characterization of single complex traits; ii) genetic characterization of multiple traits that share potentially pleiotropic effects; and iii) decoding and detailed characterization of genotype-by-environment interactions, in particular, through a collaborative pilot study of maize flowering and growth traits. The expected significance of the proposed work is that biologists will be able to extract different types of testable hypotheses from plant phenomics data sets by employing a new class of visual analytic tools, and thus obtain a deeper understanding of the interactions among genotypes, environments and phenotypes.
The project is potentially transformative in two ways: i) it will introduce advanced mathematical and computational principles into mainstream phenomic data analysis; and ii) it will usher in a new era where biologists spearhead data-driven hypothesis extraction and discovery with the aid of interactive, informative, and intuitive tools. The project will have a direct impact on the state of software in phenomics for fundamental data-driven discovery. To facilitate broader community adoption, the project will integrate the tools into the CyVerse Institute and into a community phenomics software outlet. It will also lead to the development of automated scientific workflows. Project website: http://tdaphenomics.eecs.wsu.edu/

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 20)
Adamaszek, Michal and Adams, Henry and Gasparovic, Ellen and Gommel, Maria and Purvine, Emilie and Sazdanovic, Radmila and Wang, Bei and Wang, Yusu and Ziegelmeier, Lori "Vietoris-Rips and Cech Complexes of Metric Gluings" 34th International Symposium on Computational Geometry (SoCG 2018), 2018 https://doi.org/10.4230/LIPIcs.SoCG.2018.3
Adamaszek, Michal and Adams, Henry and Gasparovic, Ellen and Gommel, Maria and Purvine, Emilie and Sazdanovic, Radmila and Wang, Bei and Wang, Yusu and Ziegelmeier, Lori "On homotopy types of Vietoris-Rips complexes of metric gluings" Journal of Applied and Computational Topology, v.4, 2020 https://doi.org/10.1007/s41468-020-00054-y
Athawale, Tushar and Maljovec, Dan and Yan, Lin and Johnson, Christopher and Pascucci, Valerio and Wang, Bei "Uncertainty Visualization of 2D Morse Complex Ensembles Using Statistical Summary Maps" IEEE Transactions on Visualization and Computer Graphics, 2020 https://doi.org/10.1109/TVCG.2020.3022359
Berg, Jordan A and Zhou, Youjia and Ouyang, Yeyun and Cluntun, Ahmad A and Waller, T Cameron and Conway, Megan E and Nowinski, Sara M and Van Ry, Tyler and George, Ian and Cox, James E and Wang, Bei and Rutter, Jared "Metaboverse enables automated discovery and visualization of diverse metabolic regulatory patterns" Nature Cell Biology, v.25, 2023 https://doi.org/10.1038/s41556-023-01117-9
Brown, Adam and Bobrowski, Omer and Munch, Elizabeth and Wang, Bei "Probabilistic convergence and stability of random mapper graphs" Journal of Applied and Computational Topology, v.5, 2021 https://doi.org/10.1007/s41468-020-00063-x
Brown, Adam and Wang, Bei "Sheaf-Theoretic Stratification Learning" 34th International Symposium on Computational Geometry (SoCG 2018), 2018 https://doi.org/10.4230/LIPIcs.SoCG.2018.14
Brown, Adam and Wang, Bei "Sheaf-Theoretic Stratification Learning from Geometric and Topological Perspectives" Discrete & Computational Geometry, 2020 https://doi.org/10.1007/s00454-020-00206-y
Bujack, Roxana and Yan, Lin and Hotz, Ingrid and Garth, Christoph and Wang, Bei "State of the Art in Time-Dependent Flow Topology: Interpreting Physical Meaningfulness Through Mathematical Properties" Computer Graphics Forum, v.39, 2020 https://doi.org/10.1111/cgf.14037
Chalapathi, Nithin and Zhou, Youjia and Wang, Bei "Adaptive Covers for Mapper Graphs Using Information Criteria" IEEE International Conference on Big Data, 2021 https://doi.org/10.1109/BigData52589.2021.9671324
Corbet, René and Fugacci, Ulderico and Kerber, Michael and Landi, Claudia and Wang, Bei "A kernel for multi-parameter persistent homology" Computers & Graphics: X, v.2, 2019 https://doi.org/10.1016/j.cagx.2019.100005
Gasparovic, Ellen and Gommel, Maria and Purvine, Emilie and Sazdanovic, Radmila and Wang, Bei and Wang, Yusu and Ziegelmeier, Lori "The relationship between the intrinsic Cech and persistence distortion distances for metric graphs" Journal of Computational Geometry, v.10, 2019 https://doi.org/10.20382/jocg.v10i1a16

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The project represents the first concrete demonstration of topological data analysis for plant phenomics data. Topological data analysis is one of the emerging mathematical principles with a wide range of real-world applications. The project successfully demonstrated how topological data analysis can be used to analyze plant phenomics data sets and help in extracting different types of hypotheses relating to how different genotypes (crop varieties) interact with various environmental variables (e.g., temperature, humidity) to affect certain key phenotypic traits (e.g., plant height, growth rate). The project also demonstrated, through application on real-world data sets, that these interactions are not all the same, and that there is tremendous diversity in the way different genotypes interact with different environmental variables.

From a computational standpoint, the project's developments contributed to the mathematical and algorithmic foundations in topological data analysis, including in data modeling, feature extraction, hypothesis formulation, and interactive visualization. It created multiple open source software toolkits for complex multi-dimensional data sets that have become a feature in multiple data-driven domains. In particular, this project led to the design and development of a scalable, visual analytics platform suitable for the interactive exploration of phenomics data sets.

The project led to the training of multiple graduate students on various interdisciplinary topics at the intersection of computer science, mathematics, biology and life sciences. 

 


Last Modified: 10/29/2021
Modified by: Bei W Phillips

Please report errors in award information by writing to: awardsearch@nsf.gov.
