Award Abstract # 1626364
MRI: Acquisition of the Kentucky Research Informatics Cloud (KyRIC)

NSF Org: OAC
Office of Advanced Cyberinfrastructure (OAC)
Recipient: UNIVERSITY OF KENTUCKY RESEARCH FOUNDATION, THE
Initial Amendment Date: August 12, 2016
Latest Amendment Date: February 14, 2019
Award Number: 1626364
Award Instrument: Standard Grant
Program Manager: Alejandro Suarez
alsuarez@nsf.gov
 (703)292-7092
OAC
 Office of Advanced Cyberinfrastructure (OAC)
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: August 1, 2016
End Date: July 31, 2020 (Estimated)
Total Intended Award Amount: $2,240,000.00
Total Awarded Amount to Date: $2,240,000.00
Funds Obligated to Date: FY 2016 = $2,240,000.00
History of Investigator:
  • James Griffioen (Principal Investigator)
    griff@netlab.uky.edu
  • Hunter Moseley (Co-Principal Investigator)
  • GQ Zhang (Former Principal Investigator)
  • James Griffioen (Former Co-Principal Investigator)
  • Vincent Kellen (Former Co-Principal Investigator)
  • Christina Payne (Former Co-Principal Investigator)
Recipient Sponsored Research Office: University of Kentucky Research Foundation
500 S LIMESTONE
LEXINGTON
KY  US  40526-0001
(859)257-9420
Sponsor Congressional District: 06
Primary Place of Performance: University of Kentucky Research Foundation
725 Rose Street, MDL Room 238
Lexington
KY  US  40536-0082
Primary Place of Performance Congressional District: 06
Unique Entity Identifier (UEI): H1HYA8Z1NTM5
Parent UEI:
NSF Program(s): Major Research Instrumentation, CYBERINFRASTRUCTURE
Primary Program Source: 01001617DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1189, 9150
Program Element Code(s): 118900, 723100
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

This project will create a big data cloud infrastructure, the Kentucky Research Informatics Cloud (KyRIC), to accelerate data-driven discovery and computational research education across multiple disciplines. Scientific discovery today is being enabled through computational and data intensive research that exploits enormous amounts of available data. KyRIC will advance a number of exciting research programs across many disciplines, such as Bioinformatics and System Biology Algorithms, Large Graph and Evolutionary Network Analysis, Image Processing, and Computational Modeling and Simulation. Breakthroughs in KyRIC-enabled research will have important societal benefits in a number of areas, such as increasing agricultural yields, improving economic competitiveness, and creating new products and markets.

KyRIC will use a hybrid architecture to support massively parallel applications that will address exciting and challenging new data- and memory-intensive research in big data science. The KyRIC hybrid system will consist of two subsystems: a 50-node cluster, each node with four 10-core processors, 3 TB of RAM, and an 8 TB SSD array; and a petascale storage system providing 2 PB of object-based storage. KyRIC will employ leading-edge cloud management software that will allow nodes to be reconfigured, scheduled, and loaded with problem-specific application software based on the current mix of big data jobs being executed by users. As a result, the project will enable and support a wide range of new research activities, each with its own unique characteristics that are beyond the capacity of our existing infrastructure. KyRIC will be readily accessed by researchers across the state utilizing our latest high-performance network, with multiple 100GB/s links from Lexington to Louisville and Cincinnati. KyRIC will also join XSEDE to better integrate the University of Kentucky (UK) with national multi-petascale capabilities.
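As a rough illustration of the scale these per-node figures imply, the short sketch below multiplies them out to cluster-wide totals. It uses only the numbers quoted in the abstract; the constant and function names are hypothetical, and the totals are simple products that ignore any capacity reserved for system overhead.

```python
# Per-node and storage figures as quoted in the abstract.
# All names here are illustrative, not part of any KyRIC software.
NODES = 50
SOCKETS_PER_NODE = 4
CORES_PER_SOCKET = 10
RAM_TB_PER_NODE = 3
SSD_TB_PER_NODE = 8
OBJECT_STORAGE_PB = 2


def aggregate_capacity(nodes=NODES):
    """Return cluster-wide totals as simple products of the per-node figures."""
    cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
    return {
        "cores_per_node": cores_per_node,
        "total_cores": nodes * cores_per_node,
        "total_ram_tb": nodes * RAM_TB_PER_NODE,
        "total_ssd_tb": nodes * SSD_TB_PER_NODE,
        "object_storage_pb": OBJECT_STORAGE_PB,
    }


if __name__ == "__main__":
    for name, value in aggregate_capacity().items():
        print(f"{name}: {value}")
```

Under these assumptions the cluster totals 2,000 cores, 150 TB of RAM, and 400 TB of local SSD, and the 40 cores per node agrees with the figure given in the Project Outcomes Report.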

KyRIC will provide intuitive access, rapid infrastructure customizations, and higher bandwidth and lower latency between the desktop and resources like XSEDE to facilitate improved algorithm design, software development, and interactive data analysis. KyRIC will be used by over 1000 UK researchers (faculty, staff, and students) and by computational research collaborators across the state of Kentucky, notably University of Louisville (UL), Northern Kentucky University (NKU), and Kentucky State University (KSU). The resource will make exciting data-intensive projects possible, enhance computational research education for graduate and undergraduate students, help attract and retain talented younger faculty, and promote big data science and technology, thus impacting Kentucky's and the nation's economic development.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 11)
Cui L, Zhu W, Tao S, Case JT, Bodenreider O, Zhang GQ "Mining non-lattice subgraphs for detecting missing hierarchical relations and concepts in SNOMED CT" J Am Med Inform Assoc , v.24 , 2017 , p.788 10.1093/jamia/ocw175
GQ Zhang, G Xing, L Cui "An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies" Journal of biomedical informatics , v.80 , 2018 , p.106 doi.org/10.1016/j.jbi.2018.03.004
GQ Zhang, Licong Cui, Remo Mueller, Shiqiang Tao, Matthew Kim, Michael Rueschman, Sara Mariani, Daniel Mobley, Susan Redline "The National Sleep Research Resource: towards a sleep data commons" Journal of the American Medical Informatics Association , 2018 10.1093/jamia/ocy064
Joshua M. Mitchell, Robert M. Flight, and Hunter N.B. Moseley "Small Molecule Isotope Resolved Formula Enumeration: a Methodology for Assigning Isotopologues and Metabolites in Fourier Transform Mass Spectra" Analytical Chemistry , v.91 , 2019 10.1021/acs.analchem.9b00748
L Cui, O Bodenreider, J Shi, GQ Zhang "Auditing SNOMED CT hierarchical relations based on lexical features of concepts in non-lattice subgraphs" Journal of biomedical informatics , v.78 , 2018 , p.177 10.1016/j.jbi.2017.12.010
Licong Cui, Ningzhou Zeng, Matthew Kim, Remo Mueller, Emily Hankosky, Susan Redline, and GQ Zhang "X-search: an open access interface for cross-cohort exploration of the National Sleep Research Resource" BMC Medical Informatics and Decision Making , 2018 10.1186/s12911-018-0682-y
Satrio Husodo, Jacob Chappell, Vikram Gazula, Lowell Pike, James Griffioen "Slicing and Dicing OpenHPC Infrastructure: Virtual Clusters in OpenStack" Practice and Experience in Advanced Research Computing (PEARC 19) , 2019 https://doi.org/10.1145/3332186.3332214
Shiqiang Tao, Licong Cui, Xi Wu, and Guo-Qiang Zhang "Facilitating Cohort Discovery by Enhancing Ontology Exploration, Query Management and Query Sharing for Large Clinical Data Repositories" AMIA Annual Symposium 2017 , 2018
Wei Zhu, Licong Cui, GQ Zhang "Spark-MCA: Large-scale, Exhaustive Formal Concept Analysis for Evaluating the Semantic Completeness of SNOMED CT" Annual Symposium of the American Medical Informatics Association (AMIA) , 2018 , p.1931 PMCID: PMC5977568
Yuriko Katsumata, David W Fardo "Quantitative phenotype scan statistic (QPSS) reveals rare variant associations with Alzheimer's disease endophenotypes" BMC medical genetics , 2020 10.1186/s12881-020-01046-6

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Scientific discoveries are increasingly being driven by big data.  By its nature, research based on big data is compute and data intensive, requiring significant computational power, data storage, and networking capabilities.  Given the massive (and rapidly increasing) amounts of data available to researchers, the demand for advanced cyberinfrastructure capable of supporting big data research continues to grow.  For example, computational research based on massive image data sets, genome data sets, measurement and monitoring data sets, or data sets produced by simulation and modeling is of increasing importance and often requires specialized computational infrastructure to deal with the massive data sizes.

To address the unique computational and data intensive requirements of big data research, this project developed and deployed a specialized computational infrastructure, called the Kentucky Research Informatics Cloud (KyRIC), capable of supporting a wide range of research problems involving big data.  The KyRIC system consists of 50 nodes where each node has a large amount of memory (3 terabytes of RAM), substantial compute power (40 CPU cores), significant local high-speed solid state disk storage, and is connected to a massive external disk storage space (~2 petabytes of disk storage).  The system has high-speed network connections to move data between the components as well as to/from the Internet. This combination of advanced cyberinfrastructure in KyRIC has enabled a wide variety of research projects that were not possible using the existing high performance computing (HPC) environments. 

To support a wide range of researchers and their individual big data computational requirements, the KyRIC infrastructure was developed and deployed as a local cloud infrastructure supporting virtual machines (VMs), where each VM can be tailored to the needs of a specific research problem.  KyRIC's ability to support VMs with very large memories enables exploration of problems in genomics, image processing, and machine learning that are often prohibitive on a conventional HPC supercomputer or commercial cloud provider.  Moreover, the ability to "right-size" the computational infrastructure to match researchers' problem sizes results in highly efficient use of KyRIC's limited resources.  The Kentucky team also developed a system to dynamically create virtual HPC clusters on the KyRIC system, making it possible for conventional HPC users with growing data sets to migrate their workloads to the KyRIC cloud environment.
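The "right-sizing" idea can be illustrated with a small back-of-the-envelope calculation: given a VM request, how many such VMs fit on one node? The sketch below is purely illustrative and is not the KyRIC scheduler. It assumes the per-node figures from this report (40 cores, 3 TB of RAM), treats 3 TB as 3 x 1024 GB, and ignores hypervisor overhead and oversubscription; `VMRequest` and `vms_per_node` are hypothetical names.

```python
from dataclasses import dataclass

# Per-node figures from the report; the GB conversion is an assumption.
NODE_CORES = 40
NODE_RAM_GB = 3 * 1024


@dataclass(frozen=True)
class VMRequest:
    """A hypothetical VM size requested by a researcher."""
    cores: int
    ram_gb: int


def vms_per_node(request: VMRequest) -> int:
    """Return how many VMs of this size fit on one node.

    The limit is whichever resource (cores or memory) runs out first.
    Hypervisor overhead and oversubscription are deliberately ignored.
    """
    if request.cores <= 0 or request.ram_gb <= 0:
        raise ValueError("cores and ram_gb must be positive")
    return min(NODE_CORES // request.cores, NODE_RAM_GB // request.ram_gb)


if __name__ == "__main__":
    for req in (VMRequest(8, 512), VMRequest(4, 1536), VMRequest(40, 3072)):
        print(req, "->", vms_per_node(req), "per node")
```

In this simplified model a memory-heavy request such as 4 cores with 1.5 TB of RAM is limited by memory (two per node), while an 8-core, 512 GB request is limited by cores (five per node).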

The KyRIC system has been used by researchers from many different departments and colleges across the university addressing vastly different research problems with a wide range of big data requirements.  Research results using KyRIC have appeared in journal and conference publications, and a description of KyRIC's new virtual cluster capabilities was presented at a conference to cyberinfrastructure professionals.  In addition, the project provided a valuable learning experience to staff involved in designing, deploying, and operating the KyRIC system as well as to new users unfamiliar with cloud computing environments.


Last Modified: 11/27/2020
Modified by: James N Griffioen

Please report errors in award information by writing to: awardsearch@nsf.gov.
