
NSF Org: |
ECCS Division of Electrical, Communications and Cyber Systems |
Recipient: |
|
Initial Amendment Date: | August 28, 2012 |
Latest Amendment Date: | August 28, 2012 |
Award Number: | 1255425 |
Award Instrument: | Standard Grant |
Program Manager: |
Radhakisan Baheti
ECCS Division of Electrical, Communications and Cyber Systems ENG Directorate for Engineering |
Start Date: | August 16, 2012 |
End Date: | June 30, 2016 (Estimated) |
Total Intended Award Amount: | $224,928.00 |
Total Awarded Amount to Date: | $224,928.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
660 S MILL AVENUE STE 204 TEMPE AZ US 85281-3670 (480)965-5479 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
AZ US 85281-6011 |
Primary Place of
Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | EPCN-Energy-Power-Ctrl-Netwrks |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.041 |
ABSTRACT
Cloud computing services (such as Amazon EC2 system, Google AppEngine, and Microsoft Azure) are becoming ubiquitous and are starting to serve as the primary source of computing power for both enterprises and personal computing applications. A cloud computing platform (or simply, a cloud) can provide a variety of resources, including infrastructure, software, and services, to users in an on-demand fashion. Compared to traditional own-and-use approaches, cloud computing services eliminate the costs of purchasing and maintaining the infrastructures for cloud users, and allow the users to dynamically scale up and down computing resources in real time based on their needs.
A cloud consists of a number of machines (computers), each with a certain amount of resources (CPU, RAM, hard disk space, etc.). Each machine can be subdivided into virtual machines, where each virtual machine (VM) behaves like a small machine with a certain amount of dedicated resources. When a user submits a job to the cloud, he or she requests a certain amount of resources from the cloud and the cloud responds by creating a VM with the required resources in a machine. The resource allocation problem is to figure out how to allocate jobs to machines. Further, when several jobs are waiting for service, the cloud must also decide which job to select for service next. The goal of this project is to design resource allocation algorithms for efficient operation of the cloud, and to design pricing mechanisms to maximize the cloud service provider revenue while providing good quality of service to competing users.
Intellectual Merit: The prior art in this area is to view the problem as a sequence of static problems as follows: consider the jobs that are currently waiting for service and allocate them to machines by solving a combinatorial optimization problem. Static approaches which ignore the dynamic nature of the system lead to instability. Our viewpoint here is fundamentally different: we consider the resource allocation problem as a dynamic stochastic network control problem. We will use queue length information about waiting jobs as the feedback signal to take resource allocation decisions such as routing jobs to machines and scheduling jobs on machines. To this end, we will answer a number of fundamental questions: what is the stability region of a cloud? ; is there a tradeoff between computational complexity and stability? ; how can we characterize the performance of resource allocation algorithms beyond stability? ; and how should a cloud provider price its resources for maximizing social welfare or profit? From a theoretical perspective, the novelty in the proposed approach lies in the design of control and performance analysis algorithms while taking computational complexity considerations in account.
Broader Impact: The PIs teach graduate-level courses spanning networks, games, control theory, and optimization. We were among the first to incorporate network applications in control courses and control applications in networking classes. The proposed project provides new opportunities for such cross-fertilization by opening up a new application area, namely cloud computing, for control-theoretic methodologies. The PIs have a strong record of advising undergraduate students and graduate students from underrepresented groups. We will continue our recruitment efforts from such student groups for this project also. We will also use specific opportunities for this purpose as applicable, such as the NSF Alliance Graduate Education and the Professoriate (AGEP) program, which is a coordinated effort by Iowa universities to recruit minorities, and the Graduate Minority Assistantship Program (GMAP), which provides funds for recruiting minority students on research assistantships.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
Note:
When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external
site maintained by the publisher. Some full text articles may not yet be available without a
charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from
this site.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Cloud computing services (such as Amazon EC2 system, Google's AppEngine, and Microsoft's Azure) are starting to serve as the primary source of computing power for both enterprises and personal computing applications. A cloud computing platform can provide a variety of resources, including infrastructure, software, and services, to users in an on-demand fashion. Compared to traditional “own-and-use” approaches, cloud computing services eliminate the costs of purchasing and maintaining the infrastructures for cloud users, and allow the users to dynamically scale up and down computing resources in real time based on their needs.
This project focused on stochastic modeling and the design of high performance resource allocation algorithms. During the course of this project, we developed several stochastic models for resource allocation problems in cloud computing clusters, including virtual machine placement and data parallel computing using MapReduce/Hadoop. A key insight of our designs was to use queue length information about waiting jobs/tasks as the feedback signal to take resource allocation decisions such as routing jobs to machines and scheduling computation/data-processing on machines. Based on the stochastic models, we quantified the fundamental capacity regions of cloud computing clusters. Using the insight of queue-length-based resource allocation, we developed dynamic resource allocation algorithms that achieve the capacity regions for both the virtual machine placement problem and the Map-task scheduling problem in MapReduce/Hadoop. We have also developed a theoretical tool based on the mean-field analysis and Stein’s method to analyze the performance of large-scale computing clusters and to quantify the asymptotic delay performance. By combining Stein’s method and the perturbation theory in control, the tool we developed has been used to quantify the approximation error of the mean-field approximation for finite size systems
The project also informed and educated both students and the broader society with curriculum development on courses on computer networking and outreach. The project also played a role in the integration of under-represented groups to the scientific community. A female Ph.D. student was involved in the project to develop capacity-achieving resource allocation algorithms. One paper that developed a new randomized load balancing algorithm for cloud computing clusters has been awarded the best paper award at INFOCOM 2015.
Last Modified: 08/10/2016
Modified by: Lei Ying
Please report errors in award information by writing to: awardsearch@nsf.gov.