Award Abstract # 1657296
CRII: CSR: Online Analysis of Disk I/O for Automatic Storage System Optimization

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: UNIVERSITY OF LOUISVILLE
Initial Amendment Date: February 16, 2017
Latest Amendment Date: July 28, 2017
Award Number: 1657296
Award Instrument: Standard Grant
Program Manager: Marilyn McClure
mmcclure@nsf.gov
 (703)292-5197
Division: CNS (Division Of Computer and Network Systems)
Directorate: CSE (Directorate for Computer and Information Science and Engineering)
Start Date: February 15, 2017
End Date: January 31, 2020 (Estimated)
Total Intended Award Amount: $175,000.00
Total Awarded Amount to Date: $191,000.00
Funds Obligated to Date: FY 2017 = $191,000.00
History of Investigator:
  • Nihat Altiparmak (Principal Investigator)
    nihat.altiparmak@louisville.edu
Recipient Sponsored Research Office: University of Louisville Research Foundation Inc
2301 S 3RD ST
LOUISVILLE
KY  US  40208-1838
(502)852-3788
Sponsor Congressional District: 03
Primary Place of Performance: University of Louisville
2301 South Third Street
Louisville
KY  US  40292-0001
Primary Place of Performance Congressional District: 03
Unique Entity Identifier (UEI): E1KJM4T54MK6
Parent UEI:
NSF Program(s): CRII CISE Research Initiation, Special Projects - CNS
Primary Program Source: 01001718DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7354, 8228, 9150, 9251
Program Element Code(s): 026Y00, 171400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Today's critical applications, including genome analysis, climate simulations, drug discovery, space observation, and numerical simulations in computational chemistry and high-energy physics, are all data intensive. Storage performance bottlenecks limit the performance and scalability of these data-intensive applications. The goal of this project is to develop a general framework for self-optimizing parallel storage systems that alleviates storage performance bottlenecks, which can have a considerable impact on society by accelerating innovation across many domains of science. The results will also advance the state of knowledge in storage systems by benefiting a wide range of parallel storage systems, including disk arrays, key-value stores, and parallel/distributed file systems. Broader impacts include mentoring and training K-12 students through summer camps and promoting the involvement of underrepresented students in science and engineering.

This research develops novel, theoretically grounded, and experimentally validated methods for online detection and automatic elimination of disk I/O bottlenecks. Specific goals include the development of: (i) new online methods for continuously monitoring and efficiently analyzing disk I/O requests, guided by data stream mining and social network analysis theory, and (ii) automatically triggered self-optimization techniques, guided by bin packing, graph coloring, and network flow theory, that plan an adaptive data layout to improve disk I/O performance. Experimentation and validation use software simulation and a prototype implementation to analyze what is theoretically possible, determine what can be achieved in practice, and close the gap between the two.
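
As a rough illustration of how online I/O analysis and graph coloring might fit together, the sketch below counts how often two blocks are requested close together in a stream, then spreads frequently co-accessed blocks across different disks with a greedy coloring heuristic. It is not the project's method: the function names, the sliding-window rule, the `min_support` threshold, and the tie-breaking order are all assumptions made for this example.

```python
from collections import Counter, deque


def co_access_counts(requests, window=3):
    """Count how often two distinct blocks appear within `window` requests.

    `requests` is an iterable of block IDs; the window rule is an
    illustrative assumption, not taken from the award abstract.
    """
    recent = deque(maxlen=window)
    counts = Counter()
    for block in requests:
        for other in set(recent):
            if other != block:
                counts[frozenset((block, other))] += 1
        recent.append(block)
    return counts


def plan_layout(counts, num_disks, min_support=2):
    """Greedy graph coloring: map correlated blocks to different disks."""
    # Build a weighted graph from pairs seen at least `min_support` times.
    graph = {}
    for pair, n in counts.items():
        if n >= min_support:
            a, b = tuple(pair)
            graph.setdefault(a, {})[b] = n
            graph.setdefault(b, {})[a] = n

    # Place the most heavily correlated blocks first.
    order = sorted(graph, key=lambda v: (-sum(graph[v].values()), v))
    layout = {}
    load = [0] * num_disks
    for v in order:
        # Conflict = co-access weight to neighbors already on each disk.
        conflict = [0] * num_disks
        for u, w in graph[v].items():
            if u in layout:
                conflict[layout[u]] += w
        # Prefer the least conflicting disk, then the least loaded one.
        disk = min(range(num_disks), key=lambda d: (conflict[d], load[d], d))
        layout[v] = disk
        load[disk] += 1
    return layout


if __name__ == "__main__":
    trace = [1, 2, 3, 1, 2, 3, 1, 2, 3, 7, 8]  # made-up block trace
    print(plan_layout(co_access_counts(trace), num_disks=3))
```

With three disks, blocks 1, 2, and 3 in the made-up trace end up on three different disks, so a request touching all of them could be served in parallel; blocks 7 and 8 fall below the support threshold and are left where they are.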

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Hall, Logan and Harris, Bryan and Tomes, Erica and Altiparmak, Nihat "Big Data Aware Virtual Machine Placement in Cloud Data Centers" Proceedings of the Fourth IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, 2017. doi:10.1145/3148055.3148057
Harris, Bryan and Altiparmak, Nihat "Monte Carlo Based Server Consolidation for Energy Efficient Cloud Data Centers" 2019 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), 2019. doi:10.1109/CloudCom.2019.00046
Harris, Bryan and Altiparmak, Nihat "Ultra-Low Latency SSDs' Impact on Overall Energy Efficiency" USENIX HotStorage '20, 2020.
Harris, Bryan and Marzullo, Michael and Altiparmak, Nihat "Real-Time Characterization of Data Access Correlations" 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2021. doi:10.1109/ISPASS51385.2021.00031
Tomes, Erica and Altiparmak, Nihat "A Comparative Study of HDD and SSD RAIDs' Impact on Server Energy Consumption" 2017 IEEE International Conference on Cluster Computing (CLUSTER), 2017. doi:10.1109/CLUSTER.2017.103
Tomes, Erica and Rush, Everett Neil and Altiparmak, Nihat "Towards Adaptive Parallel Storage Systems" IEEE Transactions on Computers, v.67, 2018. doi:10.1109/TC.2018.2836426

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The goal of this project is to develop a general framework for self-optimizing data storage systems that can dynamically alleviate storage performance bottlenecks of applications. As part of its intellectual merits, this project developed new real-time disk I/O monitoring, online disk I/O analysis, and automatic storage performance optimization methods guided by association rule mining, graph coloring, bin packing, and network flow analysis. The results shared through publications, theses, and open source software advance the state of knowledge in data storage systems by dynamically alleviating storage performance bottlenecks of data intensive applications, and benefiting a wide range of parallel storage scenarios such as disk arrays, key-value stores, parallel and distributed file systems, and new generation solid-state storage devices with rich internal parallelism. In addition to automatic performance optimizations, power consumption optimizations are also investigated in the proposed designs for future energy-efficient storage systems. As part of its broader impacts, this grant together with its REU supplement partially supported three graduate students and three undergraduate students. Specific broader impact efforts included mentoring and training K-12 teachers through summer training programs, improving K-12 curriculum for computer science education, and promoting the involvement of women in science and engineering. The project resulted in five software modules specific to the developed disk I/O tracing, benchmarking, analysis, and optimization methods available through our lab's GitHub page, tutorials related to the developed software modules available through our lab's website, and educational materials for K-12 teacher training and curriculum improvement available online. All developed materials are released under free and open source license.

Last Modified: 05/31/2020
Modified by: Nihat Altiparmak
