
NSF Org: |
DBI Division of Biological Infrastructure |
Recipient: |
|
Initial Amendment Date: | July 8, 2015 |
Latest Amendment Date: | July 8, 2015 |
Award Number: | 1458422 |
Award Instrument: | Standard Grant |
Program Manager: |
Peter McCartney
DBI Division of Biological Infrastructure BIO Directorate for Biological Sciences |
Start Date: | July 15, 2015 |
End Date: | June 30, 2019 (Estimated) |
Total Intended Award Amount: | $609,564.00 |
Total Awarded Amount to Date: | $609,564.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
2385 IRVING HILL RD LAWRENCE KS US 66045-7563 (785)864-3441 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
KS US 66045-7568 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | ADVANCES IN BIO INFORMATICS |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.074 |
ABSTRACT
Extracting biological knowledge from complex datasets such as those now being compiled requires integration of powerful computational tools. Recent developments in computational biology as well as rich new data sources provide novel opportunities for integrating massive amounts of biological data. This perfect storm of new data and advanced data acquisition, management, and integration affords the unique opportunity to drive the discovery of new, complex patterns in biology. The project will leverage NSF's considerable investment in biodiversity tools provided by Open Tree of Life (the framework for the project), Lifemapper (which handles geospatial data), iDigBio (data from ~1 billion museum specimens that carry locality data and their ecological information), and Arbor (computer tools that permit new analyses from the sources noted). It will create much-needed computational connections among these tools. It will then build upon these new linkages and tools, enabling novel research in biodiversity. These linkages will provide researchers the opportunity to rapidly synthesize datasets and use them to address diverse evolutionary questions. The tools and infrastructure the project will build will connect species relationships with species distribution models, climate projections, genes and traits. The project will transform future studies of biodiversity; it will provide a global integration of powerful tools that will permit new data-driven discovery in "next generation" biodiversity science. It will provide interdisciplinary post-doc and graduate student training in bioinformatics, use of digitized specimen data, and complex analyses (e.g., ecological analyses), preparing the biodiversity scientists of the future. The project will recruit underrepresented students and women and develop an undergrad course that will help train students in the integrative skills (field biology to computational biology) needed in the workforce.
We will further develop this module for wider classroom use. We will introduce an annual week-long course at University of Florida (UF) for students and post-docs on the use of the resources developed. With education specialists at UF, the project will produce video materials and a coordinated display for general audiences on the importance of digitized specimen data, and their utility for studies of climate change.
The project will develop a computational framework linking diverse data (trees of species relationships, morphology, ecology, fossils, geography, and climate) across research tools used by the biological community, including Open Tree of Life, which will serve as the framework to which all other biological data - traits, genes, genomes, and especially specimens - will be linked, as well as Lifemapper, iDigBio, and Arbor. The large, hyper-diverse plant group Saxifragales will provide precisely what is needed to drive the development of these tools: a comprehensive dataset covering morphology, ecology, geography, fossils, and climate that serves as a test case for refining the tools the project will develop and their integration. The project will:
1. Facilitate new synergistic research of broad utility at the interface of phylogenetics, ecology, evolutionary biology, biogeography and biodiversity science, enabling scientists to address novel questions relating phenotypic and ecological biodiversity, spatial and temporal variation, community assembly, and diversification across landscapes and through time.
2. Increase visibility and accessibility of iDigBio, Open Tree of Life, Arbor, and Lifemapper resources by linking them together and making them available through multiple access points (e.g., pre-existing tools associated with Arbor and Lifemapper) in a variety of appropriate formats.
3. Develop a complete, multifaceted species-level dataset for a large clade (Saxifragales), which will not only fill in this branch on the ToL, but will produce a resource of great utility for the scientific community to explore.
4. Demonstrate the utility of iDigBio, Open Tree, Lifemapper, and Arbor resources with a comprehensive analysis using near complete sampling of Saxifragales, for which we will add the following data layers: DNA sequences, morphology, fossils, ontologies, geospatial and environmental data, digitized voucher specimens, and links to the Encyclopedia of Life (EOL).
The project will: 1) provide interdisciplinary post-doc and graduate student training in bioinformatics, large-scale phylogeny reconstruction, use of digitized specimen data, and complex post-tree analyses (e.g., niche modeling, niche diversification), preparing the integrative biodiversity scientists of the future; 2) recruit underrepresented students and women; 3) develop an undergrad course that uses field collection, herbarium specimens, digitized data (iDigBio), and niche modeling (with climate change); 4) introduce an annual week-long course (UF) for students and post-docs on the use of the resources produced; 5) produce video materials and a coordinated display for general audiences on the importance of digitized specimen data, their utility for modeling niche evolution through time, and implications for climate change. The project will provide a platform that will enable other researchers to take the same integrated approach in other groups. It will also establish web links to EOL and 1) build species pages; 2) place morphological and other trait data on TraitBank, making these widely available; 3) work with EOL and iNaturalist to engage citizen scientists.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Biologists and computer scientists at the Universities of Florida, Kansas, and Michigan collaborated to develop BiotaPhy (“Biota-fi”), a software system for modelling and analyzing the geographical distributions of animal and plant species. BiotaPhy's predictive species distribution modelling software focusses on the potential impacts of climate change on species ranges and on the diversity of native plant and animal communities. Its species distribution computations and model outputs describe how individual species are distributed in the current-day climate, and predict how they may respond to future climates, particularly changes in temperature and precipitation. The principal science disciplines of the project are biogeography, biodiversity research, and phylogenetics. The data and computing infrastructure these research communities have developed for over 50 years largely consists of independent components--information systems, standards, and software methods. The BiotaPhy Project bridged and connected the computing infrastructure of those research communities to integrate data and methods from the three fields for cross-disciplinary or "convergent" research workflows.
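To make the idea of a species distribution model concrete, the following is a minimal sketch that assumes a simple rectangular climate envelope: a species is predicted present wherever temperature and precipitation fall within the range observed at its known occurrences. This is an illustrative stand-in, not the algorithm BiotaPhy or Lifemapper actually uses; the function names, the two climate variables, and all numeric values are hypothetical.

```python
from typing import Dict, List, Tuple

# (mean annual temperature in degrees C, annual precipitation in mm)
Climate = Tuple[float, float]
Envelope = Tuple[Climate, Climate]  # (lower bounds, upper bounds)


def fit_envelope(occurrence_climates: List[Climate]) -> Envelope:
    """Fit a rectangular envelope: min and max of each climate variable
    observed at the species' known occurrence points."""
    temps = [t for t, _ in occurrence_climates]
    precs = [p for _, p in occurrence_climates]
    return (min(temps), min(precs)), (max(temps), max(precs))


def predict_presence(envelope: Envelope, cells: Dict[int, Climate]) -> Dict[int, bool]:
    """Predict presence in each grid cell whose climate lies inside the envelope."""
    (t_lo, p_lo), (t_hi, p_hi) = envelope
    return {
        cell_id: (t_lo <= t <= t_hi) and (p_lo <= p <= p_hi)
        for cell_id, (t, p) in cells.items()
    }


# Hypothetical climate values at three specimen localities.
occurrences = [(10.0, 800.0), (14.0, 1200.0), (12.0, 1000.0)]
envelope = fit_envelope(occurrences)

# Hypothetical grid cells under current climate and under a uniform 3 C warming.
current_cells = {0: (11.0, 900.0), 1: (15.0, 900.0), 2: (13.0, 1100.0)}
future_cells = {cid: (t + 3.0, p) for cid, (t, p) in current_cells.items()}

current_prediction = predict_presence(envelope, current_cells)
future_prediction = predict_presence(envelope, future_cells)
```

In this toy case the species is predicted in cells 0 and 2 today but only in cell 0 after warming, which is the kind of range shift the report describes.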
The BiotaPhy workflow software we implemented automates the assembly of data from millions of computerized, online records of species occurrences by extracting them from internet databases of information derived from biological specimens in natural history museums around the world. BiotaPhy software merges what is known about species distributions based on wild-collected specimens with climate data, to produce distribution models of where the species would be predicted to live or range in current, and in modeled future or past climates. In addition, BiotaPhy software takes the data and distribution models from individual species and assembles them into a data matrix that summarizes the geographical distribution of all the species under study for a large geographical area (e.g., plant species of North America) and puts that data into a single data structure known as an incidence or presence-absence matrix ("PAM"). PAMs are standard logical structures for data and are used widely for computation in biogeographic or macroecological research to address questions such as: What are the regional or continental patterns of species distributions? What levels of species diversity do we see in different geographical areas? Which natural processes might be responsible for the generation and maintenance of regional or continental patterns of biological diversity? How might the species composition of natural communities change under different climate conditions? The study of patterns of biological diversity of plant and animal species also informs conservation efforts by identifying unusual or unique species assemblages (natural communities) so that they might be prioritized for genetic diversity protection.
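As an illustration of the presence-absence matrix described above, here is a minimal sketch that assumes each species' predicted distribution has already been reduced to a set of grid-cell indices. The species names, cell indices, and function names are hypothetical and are not taken from the BiotaPhy code base.

```python
from typing import Dict, List, Set


def build_pam(predicted_cells: Dict[str, Set[int]], n_cells: int) -> Dict[str, List[int]]:
    """Build a PAM with one row per species and one 0/1 column per grid cell."""
    return {
        species: [1 if cell in cells else 0 for cell in range(n_cells)]
        for species, cells in sorted(predicted_cells.items())
    }


def richness_per_cell(pam: Dict[str, List[int]]) -> List[int]:
    """Column sums: how many species are predicted present in each cell."""
    if not pam:
        return []
    n_cells = len(next(iter(pam.values())))
    return [sum(row[cell] for row in pam.values()) for cell in range(n_cells)]


def range_size_per_species(pam: Dict[str, List[int]]) -> Dict[str, int]:
    """Row sums: how many cells each species is predicted to occupy."""
    return {species: sum(row) for species, row in pam.items()}


# Hypothetical predicted distributions over a five-cell study area.
predicted = {
    "species_a": {0, 1, 2},
    "species_b": {2, 3},
    "species_c": {2},
}

pam = build_pam(predicted, n_cells=5)
richness = richness_per_cell(pam)
ranges = range_size_per_species(pam)
```

Column sums give species richness per area and row sums give range size per species, the two basic summaries behind the diversity-pattern questions listed above.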
Leveraging 300 years of investment in Earth biodiversity (species) inventory and the corresponding investment biological museums continue to make with curation of specimens and species identification, the BiotaPhy platform represents a third leg of the biodiversity research stool: a computerization "gateway" for biodiversity and phylogeographic analysis and synthesis. Community gateways (to research resources) like the BiotaPhy platform extend the monumental investments made in biological field exploration and with data acquisition on species, as a computational foundation layer upon which research innovation will be built. The study of Earth's biological diversity will continue to advance along its multi-decadal progression from cabinets of preserved specimens, to online digital data resources, to integrative computational gateways. We envision the BiotaPhy platform as an enduring solution for deploying and democratizing the biodiversity research community’s capacity for transformative, data-intensive research. As an online, open-access research platform, BiotaPhy gives scientists and students facilitated access to significant global data and computational resources for research discovery on the evolution, ecology, and biological diversity of the nation's and the world's natural biological communities.
Last Modified: 10/16/2019
Modified by: James H Beach