Award Abstract # 1350041
CAREER: Algorithms for single molecule sequence analysis

NSF Org: DBI
Division of Biological Infrastructure
Recipient: COLD SPRING HARBOR LABORATORY
Initial Amendment Date: June 7, 2014
Latest Amendment Date: May 26, 2015
Award Number: 1350041
Award Instrument: Continuing Grant
Program Manager: Jen Weller
DBI Division of Biological Infrastructure
BIO Directorate for Biological Sciences
Start Date: June 1, 2014
End Date: April 30, 2016 (Estimated)
Total Intended Award Amount: $1,534,349.00
Total Awarded Amount to Date: $599,600.00
Funds Obligated to Date: FY 2014 = $293,574.00
FY 2015 = $49,920.00
History of Investigator:
  • Michael Schatz (Principal Investigator)
    michael.schatz@gmail.com
Recipient Sponsored Research Office: Cold Spring Harbor Laboratory
1 BUNGTOWN RD
COLD SPG HBR
NY  US  11724-2202
(516)367-8307
Sponsor Congressional District: 03
Primary Place of Performance: Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor
NY  US  11724-2209
Primary Place of Performance Congressional District: 03
Unique Entity Identifier (UEI): GV31TMFLPY88
Parent UEI:
NSF Program(s): ADVANCES IN BIO INFORMATICS
Primary Program Source: 01001415DB NSF RESEARCH & RELATED ACTIVIT
01001516DB NSF RESEARCH & RELATED ACTIVIT
01001617DB NSF RESEARCH & RELATED ACTIVIT
01001718DB NSF RESEARCH & RELATED ACTIVIT
01001819DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 9179, 9251
Program Element Code(s): 116500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.074

ABSTRACT

The Cold Spring Harbor Laboratory is awarded a CAREER grant for the PI Michael Schatz to develop new computational methods for processing DNA sequencing data from the latest high-throughput sequencing technologies. DNA sequencing costs and throughput have improved by orders of magnitude over the last three decades, although many questions remain unanswered, especially because of the short sequence lengths currently available. Emerging "third generation" sequencing technologies from Pacific Biosciences, Moleculo, Oxford Nanopore, and other companies are poised to revolutionize genomics by enabling the sequencing of long, individual molecules of DNA and RNA. The sequence lengths with these technologies can reach tens of thousands of nucleotides; however, few or no analysis packages are capable of dealing with these types of genetic sequence data. This project will overcome these limitations by developing several novel analysis algorithms specifically for long-read single-molecule sequencing data and its associated complex error models. The outcomes will help answer biological questions of profound significance to all of society, such as: What were the genetic implications of the domestication of rice? What genes and regulatory elements give rise to the incredible regenerative properties of the flatworm? What can be understood from assembling reference genomes of sugarcane and pineapple towards breeding more robust plant crops and biofuels?

Specific objectives of the research include working towards assembling entire plant and animal chromosomes into complete, haplotype-phased sequences; identifying fusion genes and complex alternative splicing patterns responsible for diseases or adaptability; and searching for structural variations associated with improved crop yield or human diseases such as cancer or autism. Even if some future technology is capable of directly reading entire transcripts or entire genomes, this research will remain necessary to examine the higher-level relationships across populations of genomes and to measure the dynamics of gene expression and splicing.

This project will tightly integrate research and education, promoting opportunities from the high school through postdoctoral levels with the development of new course materials, hands-on research opportunities, and one-on-one mentoring experiences. This effort will specifically target the intersection of computer science and biology, promoting interdisciplinary education and ensuring the next generation of scientists is ready for the complexities of quantitative and digital biology. To engage the widest possible audience, Dr. Schatz will also develop novel online teaching materials made available through a yearly bioinformatics contest. The first round of the contest reached nearly 1000 students around the world and at all levels of education, engaging students far beyond our physical limits. The products of the research will be made available as open-source software and installed into the graphical iPlant Discovery Environment, making them easily accessible to the large community of plant researchers around the world.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 32)
Aganezov, Sergey and Goodwin, Sara and Sherman, Rachel M. and Sedlazeck, Fritz J. and Arun, Gayatri and Bhatia, Sonam and Lee, Isac and Kirsche, Melanie and Wappel, Robert and Kramer, Melissa and Kostroff, Karen and Spector, David L. and Timp, Winston and "Comprehensive analysis of structural variants in breast cancer genomes using single-molecule sequencing" Genome Research, v.30, 2020 https://doi.org/10.1101/gr.260497.119
Aganezov, Sergey and Yan, Stephanie M. and Soto, Daniela C. and Kirsche, Melanie and Zarate, Samantha and Avdeyev, Pavel and Taylor, Dylan J. and Shafin, Kishwar and Shumate, Alaina and Xiao, Chunlin and Wagner, Justin and McDaniel, Jennifer and Olson, Na "A complete reference genome improves analysis of human genetic variation" Science, v.376, 2022 https://doi.org/10.1126/science.abl3533
Aganezov, Sergey and Zban, Ilya and Aksenov, Vitaly and Alexeev, Nikita and Schatz, Michael C. "Recovering rearranged cancer chromosomes from karyotype graphs" BMC Bioinformatics, v.20, 2019 https://doi.org/10.1186/s12859-019-3208-4
Ahmed, Omar and Rossi, Massimiliano and Kovaka, Sam and Schatz, Michael C. and Gagie, Travis and Boucher, Christina and Langmead, Ben "Pan-genomic matching statistics for targeted nanopore sequencing" iScience, v.24, 2021 https://doi.org/10.1016/j.isci.2021.102696
Alonge, M and Lebeigle, L and Kirsche, M and Aganezov, S and Wang, X and Lippman, ZB and Schatz, MC and Soyk, S "Automated assembly scaffolding elevates a new tomato system for high-throughput genome editing" bioRxiv, 2021 https://doi.org/10.1101/2021.11.18.469135
Alonge, Michael and Soyk, Sebastian and Ramakrishnan, Srividya and Wang, Xingang and Goodwin, Sara and Sedlazeck, Fritz J. and Lippman, Zachary B. and Schatz, Michael C. "RaGOO: fast and accurate reference-guided scaffolding of draft genomes" Genome Biology, v.20, 2019 https://doi.org/10.1186/s13059-019-1829-6
Alonge, Michael and Wang, Xingang and Benoit, Matthias and Soyk, Sebastian and Pereira, Lara and Zhang, Lei and Suresh, Hamsini and Ramakrishnan, Srividya and Maumus, Florian and Ciren, Danielle and Levy, Yuval and Harel, Tom Hai and Shalev-Schlosser, Gil "Major Impacts of Widespread Structural Variation on Gene Expression and Crop Improvement in Tomato" Cell, 2020 https://doi.org/10.1016/j.cell.2020.05.021
Altemose, Nicolas and Logsdon, Glennis A. and Bzikadze, Andrey V. and Sidhwani, Pragya and Langley, Sasha A. and Caldas, Gina V. and Hoyt, Savannah J. and Uralsky, Lev and Ryabov, Fedor D. and Shew, Colin J. and Sauria, Michael E. and Borchers, Matthew an "Complete genomic and epigenetic maps of human centromeres" Science, v.376, 2022 https://doi.org/10.1126/science.abl4178
Chen, Li-Yu Man and VanBuren, Robert Young and Paris, Margot C. and Zhou, Hongye C. and Zhang, Xingtan E. and Wai, Ching M. and Yan, Hansong D. and Chen, Shuai C. and Alonge, Michael L. and Ramakrishnan, Srividya and Liao, Zhenyang and Liu, Juan and Lin, "The bracteatus pineapple genome and domestication of clonally propagated crops" Nature Genetics, v.51, 2019 https://doi.org/10.1038/s41588-019-0506-8
Chen, Sai and Krusche, Peter and Dolzhenko, Egor and Sherman, Rachel M. and Petrovski, Roman and Schlesinger, Felix and Kirsche, Melanie and Bentley, David R. and Schatz, Michael C. and Sedlazeck, Fritz J. and Eberle, Michael A. "Paragraph: a graph-based structural variant genotyper for short-read sequence data" Genome Biology, v.20, 2019 https://doi.org/10.1186/s13059-019-1909-7
Chou, Hsiang-Chen and Bhalla, Kuhulika and Demerdesh, Osama EL and Klingbeil, Olaf and Hanington, Kaarina and Aganezov, Sergey and Andrews, Peter and Alsudani, Habeeb and Chang, Kenneth and Vakoc, Christopher R and Schatz, Michael C and McCombie, W Richar "The human origin recognition complex is essential for pre-RC assembly, mitosis, and maintenance of nuclear structure" eLife, v.10, 2021 https://doi.org/10.7554/eLife.61797