Award Abstract # 1250264
EAGER: High Performance Algorithms and Implementations for Genome Alignment

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: UNIVERSITY OF ILLINOIS
Initial Amendment Date: August 25, 2012
Latest Amendment Date: August 25, 2012
Award Number: 1250264
Award Instrument: Standard Grant
Program Manager: Marilyn McClure
mmcclure@nsf.gov
(703)292-5197
CNS Division Of Computer and Network Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2012
End Date: July 31, 2014 (Estimated)
Total Intended Award Amount: $200,000.00
Total Awarded Amount to Date: $200,000.00
Funds Obligated to Date: FY 2012 = $54,336.00
History of Investigator:
  • Ashfaq Khokhar (Principal Investigator)
    ashfaq@iastate.edu
  • Fahad Saeed (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Illinois at Chicago
809 S MARSHFIELD AVE M/C 551
CHICAGO
IL  US  60612-4305
(312)996-2862
Sponsor Congressional District: 07
Primary Place of Performance: University of Illinois at Chicago
IL  US  60607-7053
Primary Place of Performance Congressional District: 07
Unique Entity Identifier (UEI): W8XEAJDKMXH3
Parent UEI:
NSF Program(s): CSR-Computer Systems Research
Primary Program Source: 01001213DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7916
Program Element Code(s): 735400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Analysis of biological sequences, including multiple sequence alignment, motif finding, and genome alignment, is a fundamental problem in computational biology because of its critical significance in wide-ranging applications, including haplotype reconstruction, sequence homology, phylogenetic analysis, and prediction of evolutionary origins. Most sequence analysis problem formulations, particularly those related to alignment, are considered NP-hard. Existing solutions to the sequence alignment problem, both sequential and parallel, are extremely limited in their applicability and perform poorly on large data sets. Moreover, most of these solutions were designed for aligning short sequences. The genome alignment problem, which involves very long sequences, is significantly harder; very few existing solutions are capable of constructing genomes from short reads, and those that can require a significant amount of execution time. This project deals with the design and development of high-performance algorithms and implementations for aligning genomes using innovative sampling and domain decomposition strategies. This approach has not previously been pursued for genome alignment. The proposed algorithms are implemented on hybrid computing platforms consisting of multicore clusters and GPUs.

This project brings together tools and applications from multiple disciplines, such as bioinformatics, computational biology, statistics, and high-performance computing. The findings will therefore introduce new tools for biology and biomedical applications, and will facilitate rapid reconstruction of genomes and mapping of short reads to the corresponding haplotypes.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Boyang Zhao, Trairak Pisitkun, Jason D. Hoffert, Mark A. Knepper, and Fahad Saeed, "CPhos: a program to calculate and visualize evolutionarily conserved functional phosphorylation sites," PROTEOMICS, v.12, 2012, p.3299-3303

Please report errors in award information by writing to: awardsearch@nsf.gov.