
NSF Org: CNS Division Of Computer and Network Systems
Recipient:
Initial Amendment Date: October 14, 2015
Latest Amendment Date: October 14, 2015
Award Number: 1600669
Award Instrument: Standard Grant
Program Manager: Marilyn McClure, mmcclure@nsf.gov, (703) 292-5197, CNS Division Of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering
Start Date: August 21, 2015
End Date: August 31, 2016 (Estimated)
Total Intended Award Amount: $222,526.00
Total Awarded Amount to Date: $222,526.00
Funds Obligated to Date:
History of Investigator:
Recipient Sponsored Research Office: 506 S WRIGHT ST, URBANA, IL, US 61801-3620, (217) 333-2187
Sponsor Congressional District:
Primary Place of Performance: 1901 S. First St., Suite A, Champaign, IL, US 61820-7473
Primary Place of Performance Congressional District:
Unique Entity Identifier (UEI):
Parent UEI:
NSF Program(s): CSR-Computer Systems Research
Primary Program Source:
Program Reference Code(s):
Program Element Code(s):
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
The performance of computers has improved tremendously over the past four decades, enabling innumerable applications that play major roles in our daily lives. However, without dramatic innovations in performance and power efficiency, continued semiconductor device scaling alone will fail to provide the computing capabilities needed for future applications. One of the main performance bottlenecks of traditional computing systems has been the high cost of communication between the central processing unit (CPU) and the graphics processing unit (GPU). On-chip integration of the CPU and GPU dramatically reduces this communication cost, but it also worsens power, thermal, and bandwidth issues in chip design. At the same time, it enables new approaches that were previously impractical. Given the potential and challenges of on-chip integrated CPU+GPU processors, this project undertakes a multidisciplinary effort to improve the performance and power efficiency of computers. Specifically, the project aims to (i) develop runtime algorithms for scheduling workloads and memory accesses under power, thermal, and bandwidth constraints; (ii) explore microarchitectures for improving memory system performance under bandwidth constraints; and (iii) optimize heterogeneous technology choices for integrated CPU+GPU processors.
This project is expected to have significant impact on the technology, circuit, architecture, and runtime system communities, and it is leading to state-of-the-art research infrastructure. The project also contributes to state-of-the-art workforce training. The outcomes of this project benefit economic growth through technology advances that provide increased computing capability at lower cost.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The computing industry is at a crossroads today: the scaling of silicon technology, which has been a major driving force behind high-performance computing, is rapidly approaching its fundamental limits. As Gordon Moore noted, no exponential is forever; it can only be delayed. On the other hand, the potential of emerging technologies has yet to be fully explored, and these technologies are not mature enough to be used economically for mass production of computing devices. From an architecture point of view, a heterogeneous computing system composed of diverse processors such as CPUs, GPUs, and accelerators has emerged as a practical solution to meet increasing performance demands under power, thermal, and bandwidth constraints.
Faced with these challenges and opportunities, in this project we aimed to dramatically improve the performance and energy efficiency of heterogeneous computing systems through techniques cutting across multiple levels of the computing stack (i.e., device, circuit, architecture, and runtime algorithms). Toward this goal, we first developed an architectural simulator to evaluate the performance of heterogeneous computing systems and released it to the public (http://cpu-gpu-sim.ece.wisc.edu). Second, in collaboration with researchers from the University of Texas and the University of British Columbia, we developed a model that evaluates the energy consumption of GPUs, an integral component of heterogeneous computing systems, and released it to the public (http://www.gpgpu-sim.org/gpuwattch). This energy model is currently the de facto standard for evaluating GPU energy consumption and is widely used, benefiting researchers around the world; according to Google Scholar, more than 200 research papers have used it. Third, building on this simulator and model, we developed many architectural techniques and runtime algorithms to improve the performance and energy efficiency of heterogeneous computing systems. For example, we developed a runtime algorithm that jointly adapts the operating voltage/frequency, the number of active cores, and the workload allocated to the CPU and the GPU in a heterogeneous computing system; this algorithm maximizes performance under a given power constraint. We also developed various architectures that exploit the similarity of values processed by GPUs to further improve performance and energy efficiency, while exploiting unique characteristics of emerging applications running on heterogeneous computing systems.
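The joint adaptation described above can be sketched as a constrained search over configurations. The following is a minimal illustrative sketch, not the project's actual implementation: the DVFS levels, core counts, and the power and runtime models are all hypothetical placeholders standing in for the measured or modeled values a real runtime system would use.

```python
# Hypothetical sketch of a runtime manager that jointly picks a
# voltage/frequency level, an active CPU core count, and a CPU/GPU
# workload split, maximizing performance (minimizing runtime) under a
# chip-level power budget. All constants and models are illustrative.
from itertools import product

FREQ_LEVELS = [0.8, 1.0, 1.2, 1.4]    # GHz (assumed DVFS states)
CORE_COUNTS = [2, 4, 6, 8]            # active CPU cores
SPLITS = [i / 10 for i in range(11)]  # fraction of work sent to the GPU
POWER_BUDGET_W = 95.0                 # assumed chip power constraint

def est_power(freq, cores, gpu_share):
    """Toy power model: per-core dynamic power ~ f^3, plus GPU share."""
    cpu_p = cores * 4.0 * freq ** 3
    gpu_p = 60.0 * gpu_share * freq ** 2
    return cpu_p + gpu_p + 10.0  # +10 W static power

def est_runtime(freq, cores, gpu_share):
    """Toy performance model: finish time is the slower of CPU and GPU."""
    cpu_rate = cores ** 0.8 * freq   # sublinear core scaling (assumed)
    gpu_rate = 20.0 * freq           # GPU throughput per unit work
    cpu_t = (1 - gpu_share) / cpu_rate if gpu_share < 1 else 0.0
    gpu_t = gpu_share / gpu_rate
    return max(cpu_t, gpu_t)

def best_config():
    """Exhaustively search feasible configurations for the fastest one."""
    best = None
    for f, c, s in product(FREQ_LEVELS, CORE_COUNTS, SPLITS):
        if est_power(f, c, s) > POWER_BUDGET_W:
            continue  # violates the power constraint
        t = est_runtime(f, c, s)
        if best is None or t < best[0]:
            best = (t, f, c, s)
    return best
```

A real runtime system would replace the toy models with online measurements and prune the search, but the structure of the decision (choose frequency, core count, and split jointly, subject to a power cap) is the same.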
Moreover, we explored a practical yet innovative heterogeneous computing architecture that moves computation near memory to address bandwidth constraints and improve energy efficiency, as it has become more expensive to move data from memory to the processors than to process it. This recent work, which significantly improves performance and energy consumption compared with traditional heterogeneous computing systems, has already been widely cited by recently published research papers.
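The intuition that data movement now dominates can be captured in a back-of-the-envelope energy model. The sketch below is illustrative only; the per-byte and per-operation energy constants are assumed round numbers, not measurements from this project.

```python
# Hypothetical break-even model for near-memory computing: offloading
# wins when the energy saved on data movement exceeds the penalty of a
# less efficient compute unit in the memory stack. All constants are
# illustrative assumptions.
E_DRAM_TO_CPU_PJ_PER_BYTE = 20.0   # assumed off-chip transfer energy
E_LOCAL_ACCESS_PJ_PER_BYTE = 4.0   # assumed in-stack access energy
E_CPU_OP_PJ = 1.0                  # assumed host compute energy per op
E_NM_OP_PJ = 2.0                   # assumed near-memory energy per op

def host_energy_pj(bytes_read, ops):
    """Energy to pull all data to the host and compute there."""
    return bytes_read * E_DRAM_TO_CPU_PJ_PER_BYTE + ops * E_CPU_OP_PJ

def near_mem_energy_pj(bytes_read, ops, result_bytes):
    """Energy to compute in the memory stack; only the result moves."""
    return (bytes_read * E_LOCAL_ACCESS_PJ_PER_BYTE
            + ops * E_NM_OP_PJ
            + result_bytes * E_DRAM_TO_CPU_PJ_PER_BYTE)

def should_offload(bytes_read, ops, result_bytes):
    """Offload when near-memory execution costs less energy overall."""
    return (near_mem_energy_pj(bytes_read, ops, result_bytes)
            < host_energy_pj(bytes_read, ops))
```

Under these assumed constants, a streaming reduction (megabytes read, a few result bytes) favors offloading, while a compute-heavy kernel over a small working set favors the host, which matches the bandwidth-driven reasoning above.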
Lastly, this project allowed us to support and train graduate students, including students from under-represented groups. They are now working for large companies in the U.S. and contributing to the development of next-generation, state-of-the-art computers that will help the U.S. maintain its leading position in computing and the economy. In addition, the products of this project allowed us to enhance the content of undergraduate and graduate-level computer architecture courses, so that students can learn state-of-the-art computing technologies and be better prepared for careers in industry and academia.
Last Modified: 11/29/2016
Modified by: Nam Sung Kim