
NSF Org: | CCF Division of Computing and Communication Foundations |
Recipient: | |
Initial Amendment Date: | July 20, 2015 |
Latest Amendment Date: | August 18, 2016 |
Award Number: | 1533828 |
Award Instrument: | Standard Grant |
Program Manager: | Anindya Banerjee abanerje@nsf.gov (703)292-7885 CCF Division of Computing and Communication Foundations CSE Directorate for Computer and Information Science and Engineering |
Start Date: | August 1, 2015 |
End Date: | July 31, 2020 (Estimated) |
Total Intended Award Amount: | $560,000.00 |
Total Awarded Amount to Date: | $575,876.00 |
Funds Obligated to Date: | FY 2016 = $15,876.00 |
History of Investigator: | |
Recipient Sponsored Research Office: | 1400 TOWNSEND DR HOUGHTON MI US 49931-1200 (906)487-1885 |
Sponsor Congressional District: | |
Primary Place of Performance: | 1400 Townsend Drive Houghton MI US 49931-1295 |
Primary Place of Performance Congressional District: | |
Unique Entity Identifier (UEI): | |
Parent UEI: | |
NSF Program(s): | Software & Hardware Foundation, Exploiting Parallel&Scalabilty |
Primary Program Source: | 01001617DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): | |
Program Element Code(s): | |
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Title: XPS: Full: FP: Collaborative Research: Sphinx: Combining Data and Instruction Level Parallelism through Demand Driven Execution of Imperative Programs
It has become increasingly difficult to improve the performance of processors so that they can meet the demands of existing and emerging workloads. Recent emphasis has been on enhancing performance through the use of multi-core processors and Graphics Processing Units. However, these processors remain difficult to program and slow to adapt to dynamic changes in the parallelism available in a given program. Although the computer architecture and programming language communities continue to innovate and make important gains in programmability and design, parallel programming remains inherently costly and error prone, and automatic parallelization of programs is not always feasible or effective. The intellectual merits of this project are the development of a new program execution paradigm and the establishment of critical compiler and micro-architecture mechanisms, so that one can design processors that are easily programmed using existing programming languages and at the same time surpass the performance of existing parallel computers. The project's broader significance and importance are widespread: the deployment of such processors will push the limits of computation in every field of science and commerce.
The execution paradigm under consideration is a previously unexplored execution model, the demand-driven execution of imperative programs (DDE). The DDE paradigm rests on a solid theoretical framework and promises to deliver very high levels of fine-grain parallelism efficiently. This parallelism is extracted from a program written in an imperative language such as C, and it is realized by means of an effective compiler-architecture collaboration mechanism that uses a common, single-assignment form for the program representation. DDE processors can extract instruction-level parallelism (ILP) much more efficiently than existing superscalar processors because the paradigm does not require dynamic dependency checking. Such processors can fetch, buffer, and execute many more instructions in parallel than current superscalar processors. Owing to its dependence-driven instruction fetching and execution, the paradigm leads to extremely scalable designs, as communication is naturally localized and synchronization is inherent in the model. Conventional thread-level parallelism (TLP) is orthogonal to DDE, so DDE designs can exploit both ILP and TLP. DDE architectures thus represent promising building blocks for extreme-scale machines.
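As a rough illustration of the demand-driven idea (a minimal conceptual sketch only, not the project's compiler output or instruction set; the Node type and demand() routine below are hypothetical), the C fragment evaluates a tiny single-assignment dataflow graph by demanding its output: each operand is computed only when a consumer asks for it, and at most once.

#include <stdio.h>

typedef struct Node Node;
struct Node {
    int computed;            /* has this value been produced yet?  */
    int value;               /* cached result (or constant)        */
    int (*op)(int, int);     /* operation to apply; NULL = constant */
    Node *left, *right;      /* operand nodes; NULL for constants   */
};

static int add(int a, int b) { return a + b; }
static int mul(int a, int b) { return a * b; }

/* Demand a node's value: evaluation starts at the output and
   recursively demands operands, so only values the result actually
   depends on are ever computed, and each is computed at most once. */
static int demand(Node *n) {
    if (!n->computed) {
        int l = n->left  ? demand(n->left)  : 0;
        int r = n->right ? demand(n->right) : 0;
        n->value = n->op(l, r);
        n->computed = 1;
    }
    return n->value;
}

int main(void) {
    /* Single-assignment graph for out = (a + b) * b with a = 3, b = 4. */
    Node a   = { 1, 3, NULL, NULL, NULL };
    Node b   = { 1, 4, NULL, NULL, NULL };
    Node sum = { 0, 0, add, &a, &b };
    Node out = { 0, 0, mul, &sum, &b };
    printf("out = %d\n", demand(&out));   /* prints: out = 28 */
    return 0;
}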
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Traditional computing relies on the sequential processing of machine instructions, which forms the basis for computing in every aspect of our lives. Speeding up program execution further requires parallelism, but parallel execution under the sequential execution model demands extensive effort to develop and tune parallel programs, which are more prone to bugs and failures.
The goal of the project is to develop an alternative execution model called demand-driven execution of imperative programs. In this model, programs are automatically translated into an internal representation that permits executing them starting from their outputs and progressing toward their inputs, computing only what is necessary, automatically and in parallel. Our project has developed the model and the compiler technology to convert programs written in a conventional imperative programming language, such as C or C++, into this representation. We have also developed processor designs that can efficiently execute the transformed programs.
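A minimal sketch of what such a translation could look like, assuming a generic SSA-style single-assignment form for exposition rather than the project's actual internal representation:

#include <stdio.h>

/* An ordinary imperative loop; the comment inside shows one plausible
   single-assignment rendering of its body (illustrative assumption only). */
int sum_squares(int n) {
    int s = 0;
    for (int i = 1; i <= n; i++)
        s = s + i * i;
    /* Possible single-assignment view of one iteration:
         s1 = phi(0, s2)    ; s on entry to the iteration
         i1 = phi(1, i2)    ; i on entry to the iteration
         t1 = i1 * i1
         s2 = s1 + t1
         i2 = i1 + 1
       Every name is defined exactly once, so demanding the returned
       value demands s2, which in turn demands s1 and t1, and so on
       back toward the program's inputs. */
    return s;
}

int main(void) {
    printf("sum_squares(4) = %d\n", sum_squares(4));  /* 1+4+9+16 = 30 */
    return 0;
}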
Since such a drastic change in the program execution model requires the entire software stack to be redeveloped, we cannot claim immediate and widespread use of this technology at this point. However, our project has demonstrated that we can automatically transform programs into this new paradigm and develop processors that can efficiently execute them. Further work on this approach may provide significant speed-ups compared to conventional computing. Attached graphs show the performance of our approach on a limited set of Livermore kernels that our compiler can compile.
Last Modified: 11/29/2020
Modified by: Soner Onder
Please report errors in award information by writing to: awardsearch@nsf.gov.