
NSF Org: | CNS Division Of Computer and Network Systems |
Recipient: |
|
Initial Amendment Date: | February 16, 2017 |
Latest Amendment Date: | July 28, 2017 |
Award Number: | 1657296 |
Award Instrument: | Standard Grant |
Program Manager: | Marilyn McClure, mmcclure@nsf.gov, (703)292-5197, CNS Division Of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | February 15, 2017 |
End Date: | January 31, 2020 (Estimated) |
Total Intended Award Amount: | $175,000.00 |
Total Awarded Amount to Date: | $191,000.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: | 2301 S 3RD ST LOUISVILLE KY US 40208-1838 (502)852-3788 |
Sponsor Congressional District: |
|
Primary Place of Performance: | 2301 South Third Street Louisville KY US 40292-0001 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | CRII CISE Research Initiation, Special Projects - CNS |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Today's critical applications, including genome analysis, climate simulations, drug discovery, space observation, and numerical simulations in computational chemistry and high-energy physics, are all data-intensive in nature. Storage performance bottlenecks are a major threat to the performance and scalability of data-intensive applications. The goal of this project is to develop a general framework for self-optimizing parallel storage systems that can alleviate storage performance bottlenecks, and thus can have a considerable impact on society by accelerating the innovation process in a multitude of scientific domains. The results will also advance the state of knowledge in storage systems by benefiting a wide range of parallel storage systems, including disk arrays, key-value stores, and parallel/distributed file systems. Broader impacts include mentoring and training K-12 students through summer camps and promoting the involvement of underrepresented students in science and engineering.
This research develops novel, theoretically grounded, and experimentally validated methods for online detection and automatic elimination of disk I/O bottlenecks. Specific goals include the development of: (i) new online methods for continuously monitoring disk I/O requests and analyzing them efficiently, guided by data stream mining and social network analysis theory, and (ii) automatically triggered self-optimization techniques guided by bin packing, graph coloring, and network flow theory, which can carefully plan an adaptive data layout to improve disk I/O performance. Experimentation and validation, using software simulation and prototype implementation, analyze what is theoretically possible and what can be achieved in practice, and aim to close the gap between the two.
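To illustrate how graph coloring can guide a data layout, the following is a minimal sketch and not the project's actual method. It assumes that blocks requested together benefit from sitting on different disks so they can be fetched in parallel. The function names (`co_access_graph`, `assign_disks`), the `min_support` threshold, and the sample trace are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_access_graph(requests, min_support=2):
    """Count how often two blocks appear in the same request (assumed input:
    an iterable of block-id lists). Keep pairs seen at least min_support times."""
    pair_counts = Counter()
    for window in requests:
        for a, b in combinations(sorted(set(window)), 2):
            pair_counts[(a, b)] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


def assign_disks(edges, num_disks):
    """Greedy weighted coloring: each block goes to the disk (color) where it
    shares the least co-access weight with blocks already placed there."""
    neighbors = {}
    for (a, b), weight in edges.items():
        neighbors.setdefault(a, {})[b] = weight
        neighbors.setdefault(b, {})[a] = weight

    # Place the most heavily co-accessed blocks first; ties broken by block id.
    order = sorted(neighbors, key=lambda blk: (-sum(neighbors[blk].values()), blk))

    layout = {}
    for blk in order:
        conflict = [0] * num_disks
        for other, weight in neighbors[blk].items():
            if other in layout:
                conflict[layout[other]] += weight
        layout[blk] = min(range(num_disks), key=lambda d: (conflict[d], d))
    return layout


if __name__ == "__main__":
    # Hypothetical trace: each inner list is one multi-block request.
    trace = [["a", "b", "c"], ["a", "b", "c"], ["c", "d"], ["c", "d"]]
    graph = co_access_graph(trace)
    print(assign_disks(graph, num_disks=3))
```

A greedy heuristic is used here because optimal graph coloring is NP-hard; when there are fewer disks than the graph needs colors, the sketch falls back to the disk with the smallest accumulated conflict weight rather than failing.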
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The goal of this project is to develop a general framework for self-optimizing data storage systems that can dynamically alleviate storage performance bottlenecks of applications. As part of its intellectual merits, this project developed new real-time disk I/O monitoring, online disk I/O analysis, and automatic storage performance optimization methods guided by association rule mining, graph coloring, bin packing, and network flow analysis. The results, shared through publications, theses, and open source software, advance the state of knowledge in data storage systems by dynamically alleviating storage performance bottlenecks of data-intensive applications and by benefiting a wide range of parallel storage scenarios such as disk arrays, key-value stores, parallel and distributed file systems, and new-generation solid-state storage devices with rich internal parallelism. In addition to automatic performance optimizations, power consumption optimizations were also investigated in the proposed designs for future energy-efficient storage systems.
As part of its broader impacts, this grant, together with its REU supplement, partially supported three graduate students and three undergraduate students. Specific broader impact efforts included mentoring and training K-12 teachers through summer training programs, improving the K-12 curriculum for computer science education, and promoting the involvement of women in science and engineering.
The project resulted in five software modules specific to the developed disk I/O tracing, benchmarking, analysis, and optimization methods, available through our lab's GitHub page; tutorials related to the developed software modules, available through our lab's website; and educational materials for K-12 teacher training and curriculum improvement, available online. All developed materials are released under a free and open source license.
Last Modified: 05/31/2020
Modified by: Nihat Altiparmak