
NSF Org: CCF, Division of Computing and Communication Foundations
Recipient: University of Utah
Initial Amendment Date: July 24, 2013
Latest Amendment Date: July 24, 2013
Award Number: 1346756
Award Instrument: Standard Grant
Program Manager: Nina Amla, namla@nsf.gov, (703) 292-7991, CCF Division of Computing and Communication Foundations, CSE Directorate for Computer and Information Science and Engineering
Start Date: August 1, 2013
End Date: July 31, 2016 (Estimated)
Total Intended Award Amount: $300,000.00
Total Awarded Amount to Date: $300,000.00
History of Investigator: Zvonimir Rakamaric (Principal Investigator)
Recipient Sponsored Research Office: University of Utah, 201 PRESIDENTS CIR, SALT LAKE CITY, UT, US 84112-9049, (801) 581-6903
Primary Place of Performance: Salt Lake City, UT, US 84112-9205
NSF Program(s): SOFTWARE ENG & FORMAL METHODS
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
The nation's economic progress and national security are critically dependent on maintaining a trajectory of steady advances in computing. Such advances are crucially dependent on the use of multiple processing units that are programmed using parallel programming languages. As these parallel processing systems find uses in critical applications such as national security infrastructures, medical devices, airplanes, and computing installations that predict our weather, they must be highly reliable as well as energy efficient. Unfortunately, today's multi-processors are extremely difficult to reliably employ and to efficiently program using current parallel programming languages. In addition to the generally recognized difficulty of writing parallel programs, one of the central unresolved difficulties is the development of a clearly defined shared memory semantics that allows sufficient parallelism. This semantics dictates how the computing elements exchange data, as well as how compilers can safely optimize parallel programs.
This work primarily focuses on addressing critical problems relating to concurrent shared memory interactions in parallel programs. It advances the current state of the art by developing a collection of mathematical models that clearly define these interactions. These mathematical models will form the bedrock for developing parallel processors, as well as compilers that reliably translate user intentions into correctly functioning computing systems. A central emphasis of this work is that it uniformly addresses the multiplicity of parallel processing element types and computer languages by basing these mathematical models on a Concurrency Intermediate Verification Language (CIVL). An equally important feature of this work is that this understanding of memory consistency models translates directly into rigorous error-checking tools that catch egregious mistakes before computer software is deployed. A key aspect of this project is the development of such error-checking tools for parallel programs and the demonstration of their effectiveness on realistic programs acquired from national labs and industrial partners.
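To give one concrete example of such a mathematical model (standard background, not quoted from the award text): the baseline shared memory semantics against which relaxed CPU and GPU models are measured is Lamport's sequential consistency.

```latex
% Sequential consistency (Lamport, 1979), the strongest common model.
% An execution is SC iff all memory events can be arranged in one total
% order <_m that embeds each thread's program order (po) and makes every
% read return the value of the latest earlier write to the same location:
\[
  \exists \text{ total order } <_m :\;
  \big(e_1 \xrightarrow{\mathrm{po}} e_2 \Rightarrow e_1 <_m e_2\big)
  \;\wedge\;
  \forall r.\; \mathrm{val}(r) = \mathrm{val}\Big(\max\nolimits_{<_m}
    \{\, w \mid w <_m r,\ \mathrm{addr}(w) = \mathrm{addr}(r) \,\}\Big)
\]
% Relaxed models (x86-TSO, GPU models, the C/C++11 model) permit strictly
% more behaviors than SC, which is what makes verification hard.
```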
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Advances in computing are crucially dependent on the use of multiple processing units that are programmed using parallel programming languages. As these parallel processing systems find uses in critical applications such as national security infrastructures, medical devices, airplanes, and computing installations that predict our weather, they must be highly reliable as well as energy efficient. Unfortunately, today's multi-processors and parallel programming languages are extremely difficult to employ reliably and to program efficiently. In addition to the generally recognized difficulty of writing parallel programs, one of the central unresolved difficulties is the lack of a clearly defined and well-specified shared memory semantics, which dictates how the computing elements exchange data, as well as how compilers can safely optimize parallel programs.
The main contribution of this project was exploring topics related to the analysis and verification of parallel programs while taking the semantics of shared memory into account. We made several main contributions. First, we developed a formal model for pthreads, a common and widely used parallel programming paradigm. We implemented this model as an extension of our software verifier, SMACK. Using this extension, we were able to verify numerous real-world parallel programs, focusing in particular on Linux device drivers. In the process, our tool automatically discovered several bugs related to parallelism and shared memory. We also competed twice in the annual software verification competition and won several medals both times.
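As a concrete illustration (a hypothetical harness of our own, not code from the project), the following is the kind of pthreads program such a verifier explores exhaustively; the final assertion can fail under some interleavings, which is exactly the class of parallelism bug reported automatically.

```cuda
// Host-side C code (no device code needed). Two threads increment a
// shared counter without synchronization; an interleaving-exploring
// verifier such as SMACK reports the assertion as violable, because
// both increments may read the same initial value (a lost update).
#include <pthread.h>
#include <assert.h>

static int counter = 0;

static void *worker(void *arg) {
    (void)arg;
    counter++;   // unsynchronized read-modify-write: a data race
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    assert(counter == 2);  // can fail: both threads may read counter == 0
    return 0;
}
```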
As our second main thrust, we carefully studied how shared memory operates in graphics processing units (GPUs). GPUs achieve higher computational efficiency than CPUs by employing simpler cores, and they hide memory latency by switching away from stalled threads. GPUs also follow a weak memory consistency model, and as such are very hard to program correctly. While CPUs also have such weak memory rules, the rules followed by GPUs are much less well understood. As a part of this project, we subjected a number of different GPU architectures to targeted stress-testing in order to observe potentially erroneous memory behaviors. In the process, we showed that GPU code can in fact end up fetching stale data, which is not what programmers would expect.
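The canonical pattern such stress-testing targets is the classic message-passing litmus test, sketched below (our own sketch, not the project's test harness): one block publishes a payload and then raises a flag, while another block reads them in the opposite order. Observing the flag set but the payload stale is a weak behavior the hardware may exhibit unless fences are added.

```cuda
#include <cstdio>

// Message-passing litmus test. Block 0 writes the payload, then the
// flag; block 1 reads the flag, then the payload. With no fences, a
// weak GPU memory model may let block 1 see the flag set (r0 == 1)
// but a stale payload (r1 == 0).
__global__ void mp_litmus(volatile int *data, volatile int *flag,
                          int *r0, int *r1) {
    if (blockIdx.x == 0) {
        *data = 1;
        // A __threadfence() here would order the two stores GPU-wide;
        // it is omitted on purpose to expose the weak behavior.
        *flag = 1;
    } else {
        *r0 = *flag;   // read the flag first...
        *r1 = *data;   // ...then the payload
    }
}

int main() {
    int *d;  // d[0] = data, d[1] = flag, d[2] = r0, d[3] = r1
    cudaMalloc(&d, 4 * sizeof(int));
    int weak = 0;
    // A single run rarely exhibits the behavior; stress-testing repeats
    // the experiment many times under varying memory traffic.
    for (int i = 0; i < 100000; ++i) {
        cudaMemset(d, 0, 4 * sizeof(int));
        mp_litmus<<<2, 1>>>(d, d + 1, d + 2, d + 3);
        int r[2];
        cudaMemcpy(r, d + 2, 2 * sizeof(int), cudaMemcpyDeviceToHost);
        if (r[0] == 1 && r[1] == 0) ++weak;  // stale data observed
    }
    printf("weak (stale-data) outcomes: %d\n", weak);
    cudaFree(d);
    return 0;
}
```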
Building on the lessons learned from our second thrust, we realized that despite the growing popularity of GPU parallel programming, there is a lack of portable synchronization primitives needed to perform certain kinds of parallel computations. Hence, as our final main contribution, we developed such a portable synchronization primitive. We validated its correctness, both empirically and formally, under the weak memory models implemented by various GPU architectures. Since then, we have been working to incorporate our primitive into the next GPU programming language standard in order to make it available to a multitude of GPU programmers.
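To illustrate what such a primitive involves, here is a simplified sketch of a single-use inter-workgroup barrier built from atomics and fences (our own sketch under stated assumptions, not the project's published primitive). It assumes an occupancy-bound launch in which all blocks are simultaneously resident on the device; a reusable barrier additionally needs sense reversal, and proving that the fences suffice under each vendor's weak memory model is precisely the kind of validation this thrust concerned.

```cuda
#include <cstdio>

__device__ unsigned int arrived = 0;           // blocks that reached the barrier
__device__ volatile unsigned int released = 0; // set by the last block to arrive

// Single-use inter-workgroup barrier. Assumes all gridDim.x blocks are
// co-resident on the device (an occupancy-bound launch); otherwise a
// spinning block could starve one that is never scheduled.
__device__ void global_barrier() {
    __syncthreads();                           // quiesce this block first
    if (threadIdx.x == 0) {
        __threadfence();                       // publish this block's writes
        if (atomicAdd(&arrived, 1u) == gridDim.x - 1)
            released = 1;                      // last arrival releases everyone
        else
            while (released == 0) { }          // spin until released
        __threadfence();                       // order release before later reads
    }
    __syncthreads();                           // whole block leaves together
}

__global__ void demo(int *buf) {
    if (threadIdx.x == 0) buf[blockIdx.x] = blockIdx.x;  // phase 1
    global_barrier();
    // Phase 2: block 0 can now safely read what the last block wrote.
    if (blockIdx.x == 0 && threadIdx.x == 0)
        printf("buf[%d] = %d\n", gridDim.x - 1, buf[gridDim.x - 1]);
}

int main() {
    int *buf;
    cudaMalloc(&buf, 8 * sizeof(int));
    demo<<<8, 32>>>(buf);   // all 8 blocks must fit on the device at once
    cudaDeviceSynchronize();
    cudaFree(buf);
    return 0;
}
```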
Finally, we explored improving the core search algorithm of the CIVL verification engine. We implemented several search strategies and empirically evaluated them on a number of CIVL benchmarks. Our prototype implementation showed promising results, and we hope that it will be incorporated into the main CIVL engine.
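The report does not detail the strategies, so as a generic illustration, here is a host-side sketch (hypothetical, not CIVL's actual interface) of an explicit-state search loop in which the exploration order is a pluggable frontier discipline: popping from the back gives depth-first search, popping from the front gives breadth-first, and a priority queue would give a heuristic-guided order.

```cuda
// Host-side C++ (no device code needed).
#include <deque>
#include <functional>
#include <unordered_set>
#include <vector>

struct State { long id; /* scheduler + shared-memory state elided */ };

bool search(State init,
            std::function<std::vector<State>(const State&)> successors,
            std::function<bool(const State&)> is_error,
            bool depth_first) {
    std::deque<State> frontier{init};
    std::unordered_set<long> seen{init.id};
    while (!frontier.empty()) {
        State s;
        if (depth_first) { s = frontier.back();  frontier.pop_back();  }
        else             { s = frontier.front(); frontier.pop_front(); }
        if (is_error(s)) return true;        // counterexample found
        for (const State& t : successors(s))
            if (seen.insert(t.id).second)    // visit each state only once
                frontier.push_back(t);
    }
    return false;                            // state space exhausted, no bug
}
```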
We disseminated our research results by publishing around ten research papers and reports. We also implemented tools and extensions stemming from this research and disseminated them to the academic community as well as to practitioners in national labs and industry. We made all of the developed tools publicly available as open-source projects. As a part of this project, we also trained a generation of students and post-doctoral researchers in the areas of parallel programming and shared memory semantics, where the current shortage of such expertise is crippling national productivity as well as innovation.
Last Modified: 10/26/2016
Modified by: Zvonimir Rakamaric