
NSF Org: CCF, Division of Computing and Communication Foundations
Initial Amendment Date: August 1, 2017
Latest Amendment Date: August 1, 2017
Award Number: 1725672
Award Instrument: Standard Grant
Program Manager: Anindya Banerjee, abanerje@nsf.gov, (703) 292-7885, CCF Division of Computing and Communication Foundations, CSE Directorate for Computer and Information Science and Engineering
Start Date: August 15, 2017
End Date: July 31, 2021 (Estimated)
Total Intended Award Amount: $200,000.00
Total Awarded Amount to Date: $200,000.00
Recipient Sponsored Research Office: 2550 NORTHWESTERN AVE # 1100, WEST LAFAYETTE, IN, US 47906-1332, (765) 494-1055
Primary Place of Performance: 465 Northwestern Ave., West Lafayette, IN, US 47907-2035
NSF Program(s): PPoSS-PP of Scalable Systems
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Title: SPX: Collaborative Research: Multi-Grain Compilers for Parallel Builds at Every Scale
Modern software development practices at companies such as Google and Facebook have made compilation -- the process of transforming source programs into executable programs -- a significant, time-consuming, resource-intensive task. Unfortunately, even state-of-the-art compilers and build systems do a poor job of exploiting emerging, high-performance, highly parallel hardware, so software development remains hampered by the slow process of compilation. This project aims to develop new techniques to speed up compilation. Its intellectual merits lie in designing new compiler internals, algorithms, and schedulers that enable compilers to take advantage of modern hardware capabilities. The project's broader significance and importance are that compilation undergirds virtually every aspect of modern software, and hence modern life: speeding up compilation enables software of any kind to be developed more quickly, delivering new features to users and squashing potentially catastrophic bugs faster.
The project revolves around three main thrusts. First, the PIs are developing new representations for compiler internals that better fit the memory hierarchy of modern machines, eschewing pointer-based representations in favor of dense representations; they are designing techniques that allow programmers to write their compiler passes at a high level while automatically converting them to use the dense representation. Second, the PIs are designing new algorithms to optimize compiler passes: transformations of internal compiler algorithms that promote locality (by combining passes that operate on similar portions of a program) and enhance parallelism (by eliminating unnecessary synchronization between passes). Finally, the PIs are creating new scheduling techniques that allow the new highly parallel compiler algorithms to be effectively mapped onto the parallel and distributed hardware on which modern build systems execute.
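To make the first thrust concrete, here is a minimal, hypothetical sketch (in Python, not the project's actual compiler machinery) contrasting a pointer-based tree with a dense, preorder-serialized buffer; the names, tags, and encoding are illustrative assumptions only.

```python
# Illustrative sketch: a binary tree of integers stored two ways --
# as linked objects, and as a dense preorder list of tags and values.

LEAF, NODE = 0, 1  # tags marking each constructor in the flat buffer

class Leaf:
    def __init__(self, value):
        self.value = value

class Node:
    def __init__(self, left, right):
        self.left, self.right = left, right

def serialize(tree, buf):
    """Flatten a pointer-based tree into a preorder tag/value buffer."""
    if isinstance(tree, Leaf):
        buf += [LEAF, tree.value]
    else:
        buf.append(NODE)
        serialize(tree.left, buf)
        serialize(tree.right, buf)
    return buf

def sum_packed(buf, i=0):
    """Sum all leaves by scanning the buffer left to right.
    Returns (sum, next_index); no pointer chasing is needed."""
    if buf[i] == LEAF:
        return buf[i + 1], i + 2
    left, j = sum_packed(buf, i + 1)
    right, k = sum_packed(buf, j)
    return left + right, k

tree = Node(Node(Leaf(1), Leaf(2)), Leaf(3))
buf = serialize(tree, [])
total, _ = sum_packed(buf)
print(buf)    # [1, 1, 0, 1, 0, 2, 0, 3]
print(total)  # 6
```

The dense buffer keeps the tree in one contiguous region, so a traversal becomes a cache-friendly left-to-right scan rather than a walk over scattered heap objects.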
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
In this project, we sought to improve software implementation techniques for a class of programs that traverse irregular, tree-shaped data. Computer software is written by humans and then translated by a compiler into machine-readable code. This project targeted that compilation step and made several advances in compiling tree-traversing programs more efficiently.
First, our approach demonstrated that a dense, mostly-serialized representation of tree data can be used automatically by the compiler (ECOOP’17), and that this not only uses less memory but also makes traversals more than twice as fast. We expanded on the approach by formalizing the language that the compiler manipulates to accomplish this feat (PLDI’19A). We built on this advance by also studying fusion optimizations that combine multiple consecutive traversals of the tree data into a single traversal (PLDI’19B). We also showed that the approach scales from single-core processors to multi-core parallel processors (ICFP’21), still running more than twice as fast as any previously existing approach for compiling tree-traversing programs.
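As a hypothetical illustration of fusion (not the project's actual transformation), the sketch below combines two consecutive tree traversals -- incrementing every leaf, then summing the leaves -- into a single traversal that avoids allocating the intermediate tree; the tuple encoding is an assumption made for illustration.

```python
# Unfused pipeline: two traversals and an intermediate tree.
def add1(tree):
    # tree is a nested tuple: ('leaf', n) or ('node', left, right)
    if tree[0] == 'leaf':
        return ('leaf', tree[1] + 1)
    return ('node', add1(tree[1]), add1(tree[2]))

def tree_sum(tree):
    if tree[0] == 'leaf':
        return tree[1]
    return tree_sum(tree[1]) + tree_sum(tree[2])

# Fused version: computes tree_sum(add1(tree)) in one traversal,
# touching each node once and allocating no intermediate tree.
def sum_of_add1(tree):
    if tree[0] == 'leaf':
        return tree[1] + 1
    return sum_of_add1(tree[1]) + sum_of_add1(tree[2])

t = ('node', ('node', ('leaf', 1), ('leaf', 2)), ('leaf', 3))
assert tree_sum(add1(t)) == sum_of_add1(t) == 9
```

A fusing compiler performs this combination mechanically, so programmers can write the two passes separately while paying for only one walk over the data.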
In fact, the compiler itself is a prime example of tree-traversing software, as the compiler traverses the abstract syntax tree of the input software (much like a sentence diagram in grammar class). In PLDI’19A and ICFP’21, we demonstrated that compilers themselves get much faster using our approach. That enables them to build software faster, which is of increasing importance as the software industry switches to “continuous delivery,” with changes made by software developers being built, tested, and deployed as quickly as possible.
A full software build consists of many separate compiler invocations on many separate input files. To address the full picture -- including parallelism opportunities at multiple scales -- we not only studied the performance of the “innermost” tree traversals, but also the larger build systems that contain them. We built a new system for scheduling multiple jobs with dynamic levels of parallelism (HiPC’20), as well as a new system for automatically and correctly parallelizing build processes (OOPSLA’20, CPP’22).
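As an illustrative sketch only (not the HiPC’20 or OOPSLA’20 systems), the following shows one way a build scheduler can run independent compile jobs in parallel, dispatching each job as soon as all of its dependencies have finished; the build graph, targets, and function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

# Hypothetical build graph: target -> set of targets it depends on.
deps = {
    "a.o": set(), "b.o": set(), "c.o": set(),
    "lib.a": {"a.o", "b.o"},
    "app": {"lib.a", "c.o"},
}

built = []  # finish order (list.append is safe under the GIL)

def compile_job(target):
    built.append(target)  # stand-in for a real compiler invocation
    return target

def parallel_build(deps):
    remaining = {t: set(ds) for t, ds in deps.items()}
    pending = {}  # future -> target
    with ThreadPoolExecutor(max_workers=4) as pool:
        def launch_ready():
            # Submit every job whose dependencies are all satisfied.
            for t in [t for t, ds in remaining.items() if not ds]:
                pending[pool.submit(compile_job, t)] = t
                del remaining[t]
        launch_ready()
        while pending:
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                finished = pending.pop(fut)
                for ds in remaining.values():
                    ds.discard(finished)
            launch_ready()

parallel_build(deps)
print(built[-1])  # app
```

Here the three object files compile concurrently, `lib.a` starts as soon as `a.o` and `b.o` finish, and `app` links last, mirroring how a parallel build exploits independence in the dependency graph.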
Broader impacts: We have made all of these technologies available as open-source software. If they are gradually adopted by industry, we will see increased software development velocity, which in turn speeds up the other forms of advancement that software enables.
Last Modified: 03/07/2022
Modified by: Milind Kulkarni