Award Abstract # 1360694
CAREER: Bridging the Gap Between Prototyping and Production

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: TRUSTEES OF INDIANA UNIVERSITY
Initial Amendment Date: September 26, 2013
Latest Amendment Date: September 26, 2013
Award Number: 1360694
Award Instrument: Continuing Grant
Program Manager: Sol Greenspan
sgreensp@nsf.gov
 (703)292-7841
CCF
 Division of Computing and Communication Foundations
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: August 1, 2013
End Date: February 29, 2016 (Estimated)
Total Intended Award Amount: $407,022.00
Total Awarded Amount to Date: $407,022.00
Funds Obligated to Date: FY 2011 = $2,489.00
FY 2012 = $210,376.00

FY 2013 = $194,157.00
History of Investigator:
  • Jeremy Siek (Principal Investigator)
    jsiek@indiana.edu
Recipient Sponsored Research Office: Indiana University
107 S INDIANA AVE
BLOOMINGTON
IN  US  47405-7000
(317)278-3473
Sponsor Congressional District: 09
Primary Place of Performance: Indiana University
509 E 3RD ST
Bloomington
IN  US  47401-3654
Primary Place of Performance
Congressional District:
09
Unique Entity Identifier (UEI): YH86RTW2YVJ4
Parent UEI:
NSF Program(s): Software & Hardware Foundation
Primary Program Source: 01001112RB NSF RESEARCH & RELATED ACTIVIT
01001213DB NSF RESEARCH & RELATED ACTIVIT

01001213RB NSF RESEARCH & RELATED ACTIVIT

01001314DB NSF RESEARCH & RELATED ACTIVIT

01001314RB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 1187, 7798, 7943, 9218, HPCC
Program Element Code(s): 779800
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Modern software engineering methods improve programmer productivity by taking an incremental approach to software development. Software engineers rapidly develop prototypes and then iteratively refine the prototypes into production systems. However, today's programming systems do not support a smooth transition from prototyping to production. On one hand, scripting languages and interactive environments support prototyping while on the other hand conventional programming languages and optimizing compilers support the development of reusable and efficient production codes. Neither support both prototyping and production, so developers use a mixture of programming systems. This practice incurs many costs such as the impedance mismatch of inter-language data transfers and the time to translate programs between languages.

The goal of this research is to discover the scientific principles necessary for a single programming system to effectively support the incremental refinement of prototypes into production software. To accomplish this research objective, classic conflicts between flexibility and safety and between abstraction and performance need to be resolved. To achieve both flexibility and safety, the research will investigate ways to combine dynamic and static type checking, using an approach called gradual typing. To achieve both abstraction and performance, the research will develop a domain-specific compiler for linear algebra and show how show how high-level abstractions can provide greater opportunities for compiler optimization than conventional abstractions such as loops and scalar operations. The broader impacts of the project arise from improvements to programmer productivity and software quality.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

(Showing: 1 - 10 of 19)
Amal Ahmed and Robert Bruce Findler and Jeremy G. Siek and Philip Wadler "Blame for {All}" Symposium on Principles of Programming Languages , 2011 10.1145/1926385.1926409
Carl Friedrich Bolz and Tobias Pape and Sam Tobin-Hochstadt and Jeremy G. Siek "Meta-tracing makes a fast {Racket}" Workshop on Dynamic Languages and Applications , 2014
Elizabeth R. Jessup and Ian Karlin and Erik Silkensen and Geoffrey Belter and Jeremy Siek "Understanding memory effects in the automated generation of optimized matrix algebra kernels" Automated Program Generation Workshop at ICCS , 2010 10.1016/j.procs.2010.04.209
Geoffrey Belter and E. R. Jessup and Ian Karlin and Jeremy G. Siek "Automating the Generation of Composed Linear Algebra Kernels" International Conference for High Performance Computing, Networking, Storage and Analysis , 2009 10.1145/1654059.1654119
Geoffrey Belter and Jeremy G. Siek and Ian Karlin and E. R. Jessup "Automatic Generation of Tiled and Parallel Linear Algebra Routines" Fifth International Workshop on Automatic Performance Tuning (iWAPT) , 2010
Jeremy G. Siek and Ian Karlin and E. R. Jessup "Build to Order Linear Algebra Kernels" Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL 2008) , 2008 , p.1-8
Jeremy G. Siek and Manish Vachharajani "Gradual Typing and Unification-based Inference" Dynamic Languages Symposium , 2008 10.1145/1408681.1408688
Jeremy G. Siek and Michael M. Vitousek and Matteo Cimini and John Tang Boyland "Refined Criteria for Gradual Typing" {SNAPL}: Summit on Advances in Programming Languages , 2015 10.4230/LIPIcs.SNAPL.2015.274
Jeremy G. Siek and Michael M. Vitousek and Matteo Cimini and Sam Tobin-Hochstadt and Ronald Garcia "Monotonic References for Efficient Gradual Typing" ESOP: European Symposium on Programming , 2015 10.1007/978-3-662-46669-8_18
Jeremy G. Siek and Peter Thiemann and Philip Wadler "Blame and coercion: Together again for the first time" PLDI: Conference on Programming Language Design and Implementation , 2015 10.1145/2737924.2737968
Jeremy G. Siek and Ronald Garcia "Interpretations of the Gradually-Typed Lambda Calculus" Workshop on Scheme and Functional Programming , 2012 10.1145/2661103.2661112
(Showing: 1 - 10 of 19)

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Incremental development is an essential feature of modern software
engineering and greatly improves programmer productivity. Software
teams rapidly develop prototypes and then iteratively refine the
prototypes into production systems. However, today’s programming
systems do not support a smooth transition from prototyping to
production. On one hand, dynamic languages such as MATLAB and Python
support prototyping while on the other hand static languages such as
C++ and Java support the development of reusable and efficient
production codes. Neither support both prototyping and production, so
developers use a mixture of programming systems. This practice incurs
many costs such as the impedance mismatches of inter-language data
transfers and the time to translate programs between languages.

The goal of the project "CAREER: Bridging the Gap Between Prototyping
and Production" was to make progress towards having a single
programming system that would support a gradual transition from
prototyping to production. To accomplish the research objective, the
long-standing conflicts between flexibility and safety and between
abstraction and performance need to be resolved.  Regarding flexibilty
versus safety, the project investigated the idea of gradual typing,
which combines both static and dynamic type checking within the same
language. Regarding abstraction and performance, the project
investigated the compilation of a high-level language for matrix
algebra into high-performance code. We discuss the scientific
accomplishments on these two lines of research in the following
paragraphs.  But to summarize the outputs of the project, we published
18 papers including 9 at high-impact conferences and journals. We
graduated 2 Ph.D. students and 2 M.S. students and 3 more
Ph.D. students are in progress. The project also produced two software
artifacts that have been released to the public, a gradually typed
version of the Python programming language, and the Build-to-Order
Basic Linear Algebra Subroutines, about which we will say more below.

The project made significant discoveries regarding both the theory and
practice of gradual typing. On the theory side, we discovered how to
integrate gradual typing with parametric polymorphism (aka. generics)
in a way that preserves "parametricity", which is import for reasoning
about the correctness of polymorphic procedures.  We solved the space
complexity problem that arises from the use of proxies to ensure the
safety of statically typed code that interacts with dynamically typed
code. We also solved problems regarding runtime overhead in statically
typed code (not just in partially typed or dynamically typed
code). Finally, we developed formal criteria to better characterize
when a language design satisfies the intent of gradual typing.

Regarding the practice of gradual typing, we designed and implemented
a gradually typed variant of Python, named Reticulated Python.  We
encountered a number of challenges along the way and invented
solutions to them, including "transient casts", which guarantee the
safety of statically typed code without using proxies, thereby
avoiding interoperability problems that we encountered regarding plain
Python code and with code written in C (parts of the Python standard
library, and many extensions). Another significant challenge to
gradual typing is that the performance of partially typed code can
degrade significantly. To solve this problem, we have initiated two
projects. We are investigating whether just-in-time compilation can
improve performance by implementing the Racket language using the
RPython framework. In the second project, we are building a native
code ahead-of-time compiler that implements the theoretical advances
mentioned above.  So far, both the just-in-time and ahead-of-time
compilers are showing promise and delivering good performance on
benchmark programs.

The second line of research in this project was in the compilation of
a language for matrix algebra into high-performance parallel code.
The goal of our compiler was to produce the best-possible code for a
particular combination of matrix operations. For example, it compiles
the input program
    y := b * A' * (A * x)
into low-level C code that is organized to make best use of multiple
processors and the multiple levels of cache memory within a computer.
The code transformations to accomplish this are well known: loop
fusion, array contraction, and data parallel loops.  However, the the
techniques are not beneficial in every situation, so the challenge is
finding the the best combination of transformations for a given input
program and target computer architecture. Thus, the essense of the
problem is mathematical optimization and search. We evaluated a wide
variety of search techniques, including genetic algorithms and
hill-climbing, and we developed a new heuristic/greedy search
algorithm.  We found that the best approach was a hybrid that started
with our greedy search and then finished with a genetic search.  With
this approach we were able to generate code for a wide range of
real-world matrix computations that was just as efficient as code
produced by a human expert and that out-performed the best
general-purpose loop-optimizing compilers.


Last Modified: 11/17/2016
Modified by: Jeremy G Siek

Please report errors in award information by writing to: awardsearch@nsf.gov.

Print this page

Back to Top of page