Award Abstract # 1933521
CAREER: Evolutionary Genomics of Enzymes for Complex Carbohydrate Metabolism

NSF Org: DBI
Division of Biological Infrastructure
Recipient: BOARD OF REGENTS OF THE UNIVERSITY OF NEBRASKA
Initial Amendment Date: May 9, 2019
Latest Amendment Date: June 25, 2021
Award Number: 1933521
Award Instrument: Continuing Grant
Program Manager: Reed Beaman
DBI Division of Biological Infrastructure
BIO Directorate for Biological Sciences
Start Date: April 1, 2019
End Date: June 30, 2023 (Estimated)
Total Intended Award Amount: $656,429.00
Total Awarded Amount to Date: $656,429.00
Funds Obligated to Date: FY 2018 = $176,711.00
FY 2019 = $176,468.00
FY 2020 = $149,944.00
FY 2021 = $153,306.00
History of Investigator:
  • Yanbin Yin (Principal Investigator)
    yyin@unl.edu
Recipient Sponsored Research Office: University of Nebraska-Lincoln
2200 VINE ST # 830861
LINCOLN
NE  US  68503-2427
(402)472-3171
Sponsor Congressional District: 01
Primary Place of Performance: University of Nebraska-Lincoln
2200 Vine Street
Lincoln
NE  US  68583-0861
Primary Place of Performance Congressional District: 01
Unique Entity Identifier (UEI): HTQ6K6NJFHA6
Parent UEI:
NSF Program(s): ADVANCES IN BIO INFORMATICS
Primary Program Source: 01001819DB NSF RESEARCH & RELATED ACTIVIT
01001920DB NSF RESEARCH & RELATED ACTIVIT
01002021DB NSF RESEARCH & RELATED ACTIVIT
01002122DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 1165
Program Element Code(s): 116500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.074

ABSTRACT

This project intends to study the core enzymes that drive the production and breakdown of carbohydrates. These enzymes, called carbohydrate-active enzymes (CAZymes), are found in all living organisms and particularly in plants and plant-associated microbes. The complex carbohydrates found in plant cell walls are the most abundant, and renewable, organic material on Earth. If we had efficient systems to convert them to biomaterials and biofuels, they would be attractive targets for bio-manufacturing projects. Important effects in the natural world are (i) the CAZymes produced by plant microbial pathogens cause plant cell wall breakdown leading to devastating crop loss ($5 billion in the United States and Canada each year) and (ii) bacteria in animal guts produce hundreds of CAZymes that digest the carbohydrates in the diet, some of which may have positive, and others toxic, consequences to the host. The research approach combines genomics and bioinformatics: the genome of a green alga will be sequenced and then bioinformatics tools will be used to carry out data analysis. This green alga is the common ancestor of all land plants; comparing its genome to those of plants will show how evolution has modified core carbohydrate chemistry to meet changing environmental challenges. Bioengineering of these enzymes may well contribute to the development of a more sustainable and secure bioeconomy (e.g., bioenergy and agricultural industries) in the US, as part of the global genomics market, whose value is expected to reach $20 billion by 2020. Students trained in the course of this project will be poised to become the next generation of scientists, able to exploit their understanding of comparative genome sequence analysis to create new understanding and novel applications.
The educational and outreach objectives of this project are to engage students as active participants in the research activities, including data analysis, and to train undergraduate students and K-12 Science teachers to understand the basics of genome sequencing and comparison methods, including hands-on skills.


In the first Aim, new bioinformatics programs will be developed to allow in-depth CAZyme annotation with predicted biochemical activities. In the second Aim, the genomic context of CAZymes will be studied in microbial genomes and metagenomes of various ecological environments. Overall four computational tools will be developed, integrated, and delivered as a CAZyme bioinformatics web portal named dbCAN2. These free online tools will facilitate CAZyme research in various research fields such as genomics, carbohydrate, bioenergy, plant disease, food security, human gut microbiome, evolution and ecology. In the third Aim, this project will sequence and mine the genomes and transcriptomes of algae and early plants for CAZymes. This includes sequencing the genome and transcriptome of a green alga Zygnema circumcarinatum, the immediate ancestor of all land plants that is extremely critical for understanding the early evolution of carbohydrate-rich cell walls. The specific education activities include: (i) working with the Office of Student Engagement and Experiential Learning (OSEEL) of Northern Illinois University to bring undergraduate students, particularly under-represented minority students, into CAZyme bioinformatics research; (ii) collaborating with the Center for Secondary Science Teacher Education of NIU to integrate DNA sequencing and data analysis topics into the curriculum of the Teacher Licensure Program as well as the professional development programs for K-12 Science teachers; and (iii) incorporating Zygnema genome annotation as new lab components into BIOS308 (Genetics) and BIOS441 (Practical Bioinformatics). Research products of this project will be disseminated at: http://cys.bios.niu.edu/dbCAN2/.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 19)
Ausland, Catherine and Zheng, Jinfang and Yi, Haidong and Yang, Bowen and Li, Tang and Feng, Xuehuan and Zheng, Bo and Yin, Yanbin "dbCAN-PUL: a database of experimentally characterized CAZyme gene clusters and their substrates" Nucleic Acids Research, v.49, 2020 https://doi.org/10.1093/nar/gkaa742
Becker, Burkhard and Feng, Xuehuan and Yin, Yanbin and Holzinger, Andreas and Buschmann, Henrik "Desiccation tolerance in streptophyte algae and the algae to land plant transition: evolution of LEA and MIP protein families within the Viridiplantae" Journal of Experimental Botany, v.71, 2020 https://doi.org/10.1093/jxb/eraa105
Cao, Huansheng and Shimura, Yohei and Steffen, Morgan M. and Yang, Zhou and Lu, Jingrang and Joel, Allen and Jenkins, Landon and Kawachi, Masanobu and Yin, Yanbin and Garcia-Pichel, Ferran and Giovannoni, Stephen J. "The Trait Repertoire Enabling Cyanobacteria to Bloom Assessed through Comparative Genomic Complexity and Metatranscriptomics" mBio, v.11, 2020 https://doi.org/10.1128/mBio.01155-20
Cheng, Xin and Yang, Bowen and Zheng, Jinfang and Wei, Hongyu and Feng, Xuehuan and Yin, Yanbin "Cadmium stress triggers significant metabolic reprogramming in Enterococcus faecium CX 26" Computational and Structural Biotechnology Journal, v.19, 2021 https://doi.org/10.1016/j.csbj.2021.10.021
Feng, Xuehuan and Holzinger, Andreas and Permann, Charlotte and Anderson, Dirk and Yin, Yanbin "Characterization of Two Zygnema Strains (Zygnema circumcarinatum SAG 698-1a and SAG 698-1b) and a Rapid Method to Estimate Nuclear Genome Size of Zygnematophycean Green Algae" Frontiers in Plant Science, v.12, 2021 https://doi.org/10.3389/fpls.2021.610381
Fitzek E, Balazic R "Bioinformatics Analysis of Plant Cell Wall Evolution" Methods in Molecular Biology, 2020 https://doi.org/10.1007/978-1-0716-0621-6_27
Fitzek, Elisabeth and Orton, Lauren and Entwistle, Sarah and Grayburn, W. Scott and Ausland, Catherine and Duvall, Melvin R. and Yin, Yanbin "Cell Wall Enzymes in Zygnema circumcarinatum UTEX 1559 Respond to Osmotic Stress in a Plant-Like Fashion" Frontiers in Plant Science, v.10, 2019 https://doi.org/10.3389/fpls.2019.00732
Huang, Le and Yang, Bowen and Yi, Haidong and Asif, Amina and Wang, Jiawei and Lithgow, Trevor and Zhang, Han and Minhas, Fayyaz ul Amir Afsar and Yin, Yanbin "AcrDB: a database of anti-CRISPR operons in prokaryotes and viruses" Nucleic Acids Research, v.49, 2020 https://doi.org/10.1093/nar/gkaa857
Li, Tang and Yin, Yanbin "Critical assessment of pan-genomic analysis of metagenome-assembled genomes" Briefings in Bioinformatics, v.23, 2022 https://doi.org/10.1093/bib/bbac413
Orton, Lauren M and Fitzek, Elisabeth and Feng, Xuehuan and Grayburn, W Scott and Mower, Jeffrey P and Liu, Kan and Zhang, Chi and Duvall, Melvin R and Yin, Yanbin and Sharwood, Robert "Zygnema circumcarinatum UTEX 1559 chloroplast and mitochondrial genomes provide insight into land plant evolution" Journal of Experimental Botany, v.71, 2020 https://doi.org/10.1093/jxb/eraa149
Peterson, Daniel and Li, Tang and Calvo, Ana M. and Yin, Yanbin "Categorization of Orthologous Gene Clusters in 92 Ascomycota Genomes Reveals Functions Important for Phytopathogenicity" Journal of Fungi, v.7, 2021 https://doi.org/10.3390/jof7050337

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Funded between 2017 and 2023, this NSF CAREER project is a combination of bioinformatics tool development and green algal genome sequencing. The main target of this project was the enzymes that are involved in the synthesis and degradation of complex carbohydrates in plants and microbes. Examples of these complex carbohydrates include celluloses, hemicelluloses, and pectins, which are the main chemical components of plant and algal cell walls. Complex carbohydrates in cell walls are the feedstock for biofuel production, the food for humans (e.g., dietary fibers in vegetables and fruits) and animals (grains and forages), the natural barriers that protect plants from environmental stresses and pathogens, and the main carbon sinks in oceans (algal cell walls), forests (plant cell walls), and soils (fungal and bacterial cell walls).

Therefore, this project had two main research objectives to address the intellectual merit. The first one was to develop innovative bioinformatics tools to automate the discovery of enzymes for complex carbohydrate metabolism. These enzymes are called carbohydrate-active enzymes or CAZymes. The second one was to generate the draft genome of a green alga called Zygnema circumcarinatum, which is the closest relative of all land plants. This draft genome could help other scientists to understand how plants have evolved the extremely strong but delicate cell walls. These cell walls contain various complex carbohydrates that are tangled together and recalcitrant to enzymatic degradation, which is why cell walls can protect plant cells from stresses and pathogens and, at the same time, why lignocellulosic biofuel production is so expensive.

For the first main objective, the key outcome was a family of bioinformatics tools (7 computer software systems) that benefit tens of thousands of users on a daily basis from over 150 countries across all six major continents. The core tool is called dbCAN, which is provided as a web server and a standalone software package. It is the most popular automated CAZyme discovery server in the world (300,000+ jobs processed for users in 10 years, 8,000+ email addresses, and the paper cited over 3,000 times).

For the second main objective, the key outcome was ~2 terabytes of DNA and RNA sequencing data from four Zygnema green algal genomes. These data have been deposited in the GenBank database of the United States National Library of Medicine and the Joint Genome Institute's algal genome database of the United States Department of Energy.

Lastly, we also had an education and outreach objective to address the broader impacts. The key outcome for this objective was that we provided research training for over 50 trainees directly or indirectly involved in this NSF project. These include 4 postdocs, 22 graduate students, 20 undergrad students, and 4 visiting scholars/students. Among these students, 22 are female, and 7 are under-represented minority students. One example is that, working with Northern Illinois University's Center for Secondary Science and Mathematics Education and the Teacher Licensure for Biology program, we had a very successful "Introduction to Genomic Data Sciences" summit in 2018. In total, eight high school teachers (representing six schools) from the greater Chicago area attended the half-day summit; together, these teachers reach thousands of high school students.


Last Modified: 09/06/2023
Modified by: Yanbin Yin
