Award Abstract # 1422133
SHF: Small: Secure Compilation of Advanced Languages

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: NORTHEASTERN UNIVERSITY
Initial Amendment Date: June 13, 2014
Latest Amendment Date: May 3, 2017
Award Number: 1422133
Award Instrument: Standard Grant
Program Manager: Anindya Banerjee
abanerje@nsf.gov
 (703)292-7885
CCF
 Division of Computing and Communication Foundations
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: August 1, 2014
End Date: July 31, 2017 (Estimated)
Total Intended Award Amount: $499,812.00
Total Awarded Amount to Date: $507,812.00
Funds Obligated to Date: FY 2014 = $499,812.00
FY 2017 = $8,000.00
History of Investigator:
  • Amal Ahmed (Principal Investigator)
    amal@ccs.neu.edu
Recipient Sponsored Research Office: Northeastern University
360 HUNTINGTON AVE
BOSTON
MA  US  02115-5005
(617)373-5600
Sponsor Congressional District: 07
Primary Place of Performance: Northeastern University
360 Huntington Ave
Boston
MA  US  02115-5005
Primary Place of Performance
Congressional District:
07
Unique Entity Identifier (UEI): HLTMVS2JZBS6
Parent UEI:
NSF Program(s): Software & Hardware Foundation
Primary Program Source: 01001415DB NSF RESEARCH & RELATED ACTIVIT
01001718DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7923, 7943, 9251
Program Element Code(s): 779800
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Title: SHF: Small: Secure Compilation of Advanced Languages

Advanced programming languages, based on dependent types, enable program verification alongside program development, thus making them an ideal tool for building fully verified, high assurance software. Recent dependently typed languages that permit reasoning about state and effects---such as Hoare Type Theory (HTT) and Microsoft's F*---are particularly promising and have been used to verify a range of rich security policies, from state-dependent information flow and access control to conditional declassification and information erasure. But while these languages provide the means to verify security and correctness of high-level source programs, what is ultimately needed is a guarantee that the same properties hold of compiled low-level target code. Unfortunately, even when compilers for such advanced languages exist, they come with no formal guarantee of correct compilation, let alone any guarantee of secure compilation---i.e., that compiled components will remain as secure as their high-level counterparts when executed within arbitrary low-level contexts. This project seeks to demonstrate how to build realistic yet secure compilers. This is a notoriously difficult problem. On one hand, a secure compiler must ensure that low-level contexts cannot launch any "attacks" on the compiled component that would have been impossible to launch in the high-level language. On the other hand, a realistic compiler cannot simply limit the expressiveness of the low-level target language to achieve the security goal.

The intellectual merit of this project is the development of a powerful new proof architecture for realistic yet secure compilation of dependently typed languages that relies on contracts to ensure that target-level contexts respect source-level security guarantees and leverages these contracts in a formal model of how source and target code may interoperate. The broader impact is that this research will make it possible to compose high-assurance software components into high-assurance software systems, regardless of whether the components are developed in a high-level programming language or directly in assembly. Compositionality has been a long-standing open problem for certifying systems for high-assurance. Hence, this research has potential for enormous impact on how high-assurance systems are built and certified. The specific goal of the project is to develop a verified multi-pass compiler from Hoare Type Theory to assembly that is type preserving, correct, and secure. The compiler will include passes that perform closure conversion, heap allocation, and code generation. To prove correct compilation of components, not just whole programs, this work will use an approach based on defining a formal semantics of interoperability between source components and target code. To guarantee secure compilation, the project will use (static) contract checking to ensure that compiled code is only run in target contexts that respect source-level security guarantees. To carry out proofs of compiler correctness, the project will develop a logical relations proof method for Hoare Type Theory.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Daniel Patterson and Amal Ahmed. "Linking Types for Multi-Language Software: Have Your Cake and Eat it Too." 2nd Summit on Advances in Programming Languages (SNAPL) 2017 , 2017 10.4230/LIPIcs.SNAPL.2017.12
Daniel Patterson, Jamie Perconti, Christos Dimoulas and Amal Ahmed "FunTAL: Reasonably Mixing a Functional Language with Assembly." ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI) 2017. , 2017 10.1145/3062341.3062347
Max S. New, William J. Bowman, and Amal Ahmed "Fully Abstract Compilation via Universal Embedding" ACM SIGPLAN International Conference on Functional Programming (ICFP'16) , 2016
Patterson, Daniel and Ahmed, Amal "The next 700 compiler correctness theorems (functional pearl)" Proceedings of the ACM on Programming Languages , v.3 , 2019 10.1145/3341689 Citation Details
William J. Bowman and Amal Ahmed "Noninterference for Free" In Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming (ICFP '15), Vancouver, Canada, September 2015. , 2015
William J. Bowman, Youyou Cong, Nick Rioux, and Amal Ahmed. "Type-Preserving CPS Translation of Sigma and Pi Types is Not Not Possible." ACM SIGPLAN Symposium on Principles of Programming Languages (POPL) 2018. Published in journal PACMPL. , v.2 , 2018 10.1145/3158110

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Programming languages with advanced type systems facilitate the development of reliable software by giving programmers the ability to express numerous safety and security properties of their programs via types.  Violations of these properties can be caught via type checking at compile time, rather than at run time. Languages like OCaml and F# have types that provide memory safety and information hiding guarantees, while dependently typed languages such as that of the Coq proof assistant have richer types that enable verification of full functional correctness during program development.  In recent years, numerous projects have used Coq to build fully verified software fromOS kernels to compilers to cryptographic protocols.

While advanced typed languages are a powerful tool for verifying safety, security, and correctness of source programs, what is ultimately needed is a guarantee that the same properties hold of compiled low-level code.  Unfortunately, compilers for such advanced languages come with no formal guarantee of correct compilation in the presence of linking with other code, let alone any guarantee of secure compilation---i.e., that compiled components will remain as secure as their source-level counterparts when linked with arbitrary target-level code.  This is a concern because compiled code is routinely linked with libraries implemented directly in the target language or compiled from other source languages. Formally, a correct compiler is one that preserves the semantics of all source code it compiles, while a secure compiler is one that is fully abstract: it guarantees that two compiled components can be distinguished by some target-level code if and only if their source-level counterparts can be distinguished by some source-level code. Thus, secure compilers have the added benefit of allowing programmers to reason about their programs (such as correctness of refactoring) by *thinking* only in their source language.  

This project has developed methods for building realistic yet secure (fully abstract) compilers for advanced typed languages, a notoriously difficult problem. Fully abstract compilation has historically been achieved by either making the source language more expressive or making the target language less expressive. Both approaches are infeasible in the context of real compilers whose target languages are usually more expressive than the source, allowing target-level attackers to make observations about compiled components that source-level attackers cannot make about the original source component---e.g., via disruptions of control flow, or inspection of information private to a module.

A central goal was to demonstrate that source-language types can be translated/compiled to target-language types (static contracts) that can be used to ensure that target-level contexts respect source-level security guarantees.  The use of static rather than dynamic contracts for ensuring security is critical as it allows us to avoid the significant performance overheads that accompany dynamic enforcement of security guarantees.

The first major result of our project is the development of correct and *typed* compiler passes for the core language of the Coq proof assistant. These include typed CPS translation, which since 2002 was conjectured to be impossible, and typed closure conversion that is compatible with critical features of Coq.  These results are key to building a practical verified compiler for Coq which would ensure that fully verified source code cannot be compromised via compiler bugs or unsafe linking.

The second major outcome are proof techniques for verifying that acompiler is fully abstract.  The project devised "backtranslation" techniques that demonstrate that target-level attackers can be modeled at source level, even when the target language contains features that are absent from the source.  Different classes of proof techniques were developed for terminating languages (back-translation by partial evaluation) and for Turing-complete languages (back-translation via universal embedding).  

The third major result is development of interoperability semantics between high-level (direct-style) languages and a low-level assembly, that supports embedding assembly in high-level programs, and vice versa, while ensuring safety of the mixed program.  The key challenge was developing a notion of "components" in assembly such that components can be made up of different number of basic blocks and can be nested. This should inform design of other high/low multi-languages, e.g., for verification of C code mixed with assembly and just-in-time (JIT) compilers.

Finally, the project noted the tension between secure compilation--which must disallow linking with code whose behavior cannot be expressed in the compiler's source language--and the building of useful multi-language software--which frequently demands such linking.  The project advocated that language designs be extended to facilitate development of compilers and toolchains that support building multi-language software.  Specifically, every language should come with *linking types* extensions that provide programmers with the means to annotate their source code to express the points where they wish to link with code behaviors that cannot be expressed in their language.  Giving programmers high-level means to control low-level linking seems essential for building practical secure compilers and toolchains that report cross-language type errors to aid in the development of multi-language software.


Last Modified: 04/19/2018
Modified by: Amal Ahmed

Please report errors in award information by writing to: awardsearch@nsf.gov.

Print this page

Back to Top of page