Award Abstract # 1421585
CSR: Small: Collaborative Research: Enhancing Cloud Performance With On-Demand Isolation

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: UNIVERSITY OF PITTSBURGH - OF THE COMMONWEALTH SYSTEM OF HIGHER EDUCATION
Initial Amendment Date: August 26, 2014
Latest Amendment Date: August 26, 2014
Award Number: 1421585
Award Instrument: Standard Grant
Program Manager: Marilyn McClure
mmcclure@nsf.gov
(703)292-5197
CNS Division Of Computer and Network Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2014
End Date: September 30, 2018 (Estimated)
Total Intended Award Amount: $249,924.00
Total Awarded Amount to Date: $249,924.00
Funds Obligated to Date: FY 2014 = $249,924.00
History of Investigator:
  • John Lange (Principal Investigator)
    jacklange@cs.pitt.edu
Recipient Sponsored Research Office: University of Pittsburgh
4200 FIFTH AVENUE
PITTSBURGH
PA  US  15260-0001
(412)624-7400
Sponsor Congressional District: 12
Primary Place of Performance: University of Pittsburgh
Pittsburgh
PA  US  15213-2303
Primary Place of Performance Congressional District: 12
Unique Entity Identifier (UEI): MKAGLD59JRL1
Parent UEI:
NSF Program(s): CSR-Computer Systems Research
Primary Program Source: 01001415DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7923
Program Element Code(s): 735400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The modern trend in computing systems is toward architectures containing large numbers of heterogeneous computational and I/O resources. While this increase in scale allows greater workload consolidation, wherein a single system runs multiple independent applications in parallel, it comes at the cost of increased interference across the different application workloads. Workload interference occurs when the behavior of one application affects the performance of another, even if the two applications are running on different hardware resources. It can result from contention on shared hardware resources (such as last-level caches, memory controllers, or I/O devices) or on software resources managed by the operating system. Cross-workload interference is especially problematic for large-scale shared infrastructures such as cloud hosting services, which rely on co-hosting large numbers of widely disparate workloads inside a single datacenter environment. Preventing interference effects is critical for cloud computing to fully deliver on its promise as a universal computing substrate.

This project addresses the problem of cross-workload interference by providing a holistic system that both detects the impact of interference on applications and mitigates its effects through dynamic isolation capabilities in the underlying system software. The approach relies on dynamically partitioning the underlying hardware resources so that isolation is achieved at the hardware layer, while also partitioning the system software to avoid contention on the more abstract resources that the software itself manages. To achieve these goals, this work implements a "Virtual Platform" abstraction: an individual, isolatable system domain that is assigned to a particular task or workload and consists of one or more virtual machine instances. Each virtual platform is assigned an allocation of hardware resources consisting of independent "isolatable units." These units are created by decomposing local hardware resources into the finest-grained subdivisions that can be both individually allocated and effectively isolated from the rest of the system. While providing partitioned hardware resources to a virtual platform yields hardware-level isolation, it does not address interference generated by the system software. Software-level interference is avoided by partitioning the system software itself through Multi-Stack Virtualization, which allows multiple independent system software layers to co-exist on the same local system by restricting the resources each one manages to the set allocated to its virtual platform. Taken together, these mechanisms provide isolation at both the hardware and software layers.
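The relationship between isolatable units and virtual platforms described above can be illustrated with a small sketch. This is not the project's implementation. The names `IsolatableUnit`, `VirtualPlatform`, `Host`, and `decompose`, and the resource kinds used, are hypothetical and chosen only for illustration. The sketch assumes that every hardware resource can be enumerated as discrete units and that isolation means no unit is ever held by two platforms at once.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IsolatableUnit:
    """Hypothetical finest-grained resource that can be allocated and isolated."""
    kind: str      # illustrative kinds, e.g. "core" or "memory_block"
    unit_id: int


@dataclass
class VirtualPlatform:
    """Hypothetical isolated domain: a set of units plus the VMs that run on it."""
    name: str
    units: set = field(default_factory=set)
    vms: list = field(default_factory=list)


def decompose(inventory):
    """Turn a {kind: count} inventory into a set of individual units."""
    return {IsolatableUnit(kind, i)
            for kind, count in inventory.items()
            for i in range(count)}


class Host:
    """Tracks which units are free and which platform owns the rest."""

    def __init__(self, units):
        self.free = set(units)
        self.platforms = {}

    def create_platform(self, name, wanted):
        wanted = set(wanted)
        # Refuse any request that would share a unit with another platform.
        if not wanted <= self.free:
            raise ValueError("requested units are not all free")
        self.free -= wanted
        platform = VirtualPlatform(name, wanted)
        self.platforms[name] = platform
        return platform

    def destroy_platform(self, name):
        # Units return to the free pool so they can be repartitioned on demand.
        platform = self.platforms.pop(name)
        self.free |= platform.units

    def is_isolated(self):
        """True if no unit is held by more than one platform."""
        seen = set()
        for platform in self.platforms.values():
            if seen & platform.units:
                return False
            seen |= platform.units
        return True
```

In this sketch, each platform's system software would be restricted to managing only the units in its own `units` set, which is the role the abstract assigns to multi-stack virtualization. How real hardware is decomposed and enforced is outside what the sketch models.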

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

B. Kocoloski and J. Lange, "Varbench: an Experimental Framework to Measure and Characterize Performance Variability," Proceedings of the 47th International Conference on Parallel Processing, 2018

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Cloud computing is now pervasive throughout human society and hosts many of the services that the public takes for granted. For cloud computing to maximize its utility as a broad computational resource, it needs to provide appropriate environments for a wide range of applications. Our goal is to enable cloud systems to reach their potential by effectively meeting the demands of a class of applications that has historically not been well suited to cloud environments. In this project we sought to address the shortcomings of current approaches in order to extend cloud capabilities to a wider array of potential users throughout society. Specifically, the project undertook the challenge of adapting current cloud platforms for use in specialized and high-performance settings. Doing so would let the cloud support a user base that currently relies on dedicated computing resources at government-supported High Performance Computing (HPC) and supercomputing centers. Enabling this vision would allow greater access to HPC resources for users who currently lack the access or means to deploy their software on specialized environments.

As part of this project we designed and demonstrated a number of novel tools and features to address these goals. These artifacts have been incorporated into existing research code bases and have provided input into industrial product development efforts. Insights developed as part of this project have been incorporated into upcoming system software environments targeting next-generation HPC systems. In addition, this project has supported the training of a number of graduate students who graduated and moved into both industrial and academic positions.


Last Modified: 03/23/2019
Modified by: John R Lange

Please report errors in award information by writing to: awardsearch@nsf.gov.
