Skip to feedback

Award Abstract # 1912608
FoMR: Speculative Super-optimization: Boosting Performance via Speculation-Driven Dynamic Binary Optimization

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: RECTOR & VISITORS OF THE UNIVERSITY OF VIRGINIA
Initial Amendment Date: August 14, 2019
Latest Amendment Date: June 16, 2020
Award Number: 1912608
Award Instrument: Standard Grant
Program Manager: Danella Zhao
CCF
 Division of Computing and Communication Foundations
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2019
End Date: September 30, 2023 (Estimated)
Total Intended Award Amount: $200,000.00
Total Awarded Amount to Date: $216,000.00
Funds Obligated to Date: FY 2019 = $200,000.00
FY 2020 = $16,000.00
History of Investigator:
  • Ashish Venkat (Principal Investigator)
    av6ds@virginia.edu
  • Kevin Skadron (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Virginia Main Campus
1001 EMMET ST N
CHARLOTTESVILLE
VA  US  22903-4833
(434)924-4270
Sponsor Congressional District: 05
Primary Place of Performance: University of Virginia Main Campus
PO Box 400740
Charlottesville
VA  US  22904-4195
Primary Place of Performance
Congressional District:
05
Unique Entity Identifier (UEI): JJG6HU8PA4S5
Parent UEI:
NSF Program(s): Special Projects - CCF,
Software & Hardware Foundation
Primary Program Source: 01001920DB NSF RESEARCH & RELATED ACTIVIT
01002021DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 021Z, 7798, 7941, 8585, 9251
Program Element Code(s): 287800, 779800
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Modern processors are characterized by increasing 'core' counts on a single multi-core processor, where a 'core' is a computing unit, allowing each core to operate in parallel to, and independently of, other cores on the processor. Yet, a substantial chunk of software applications is inherently sequential. Although modern compilers feature sophisticated optimizations, significant waste of computational resources across these multiple-cores still occurs due to computational patterns that are unpredictable at compile-time. This project explores techniques to deploy aggressive speculative dynamic optimizations within the micro-processor, enabling continuous optimization of inherently sequential code. This work addresses a pressing need for systems that can aggressively and yet seamlessly super-optimize machine code, adapting to the dynamic execution environment, and thereby speed up execution of applications on computers. This research will also foster existing efforts and initiatives of the investigators to build a pipeline of students from diverse backgrounds.

This project explores dynamic binary optimization techniques at the processor level to speculatively generate and execute a super-optimized instruction stream, by leveraging established speculative processor features such as branch prediction, value prediction, and loop stream detection. This project introduces two distinct flavors of speculative super-optimization: (a) a hardware implementation that leverages dynamically predicted program state to perform simple, yet powerful peephole optimizations on short instruction sequences, and (b) a firmware implementation that leverages a microcode-based dynamic re-compiler running as a helper thread to perform more sophisticated optimizations. Owing to the plethora of speculative processor features and compiler/runtime/hardware optimizations to choose from, this project explores a rich sample space of speculative super-optimization strategies, along with extensive profitability analysis to dynamically identify appropriate targets for super-optimization.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Moody, Logan and Qi, Wei and Sharifi, Abdolrasoul and Berry, Layne and Rudek, Joey and Gaur, Jayesh and Parkhurst, Jeff and Subramoney, Sreenivas and Skadron, Kevin and Venkat, Ashish "Speculative Code Compaction: Eliminating Dead Code via Speculative Microcode Transformations" 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO) , 2022 https://doi.org/10.1109/MICRO56248.2022.00024 Citation Details
Ren, Xida and Moody, Logan and Taram, Mohammadkazem and Jordan, Matthew and Tullsen, Dean M. and Venkat, Ashish "I See Dead µops: Leaking Secrets via Intel/AMD Micro-Op Caches" 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA) , 2021 https://doi.org/10.1109/ISCA52012.2021.00036 Citation Details
Sharifi, Rasool and Venkat, Ashish "CHEx86: Context-Sensitive Enforcement of Memory Safety via Microcode-Enabled Capabilities" 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA) , 2020 https://doi.org/10.1109/ISCA45697.2020.00068 Citation Details
Taram, Mohammadkazem and Ren, Xida and Venkat, Ashish and Tullsen, Dean "SecSMT: Securing SMT Processors against Contention-Based Covert Channels" Proceedings of the 31st USENIX Security Symposium , 2022 Citation Details
Wu, Lingxi and Sharifi, Rasool and Lenjani, Marzieh and Skadron, Kevin and Venkat, Ashish "Sieve: Scalable In-situ DRAM-based Accelerator Designs for Massively Parallel k-mer Matching" 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA) , 2021 https://doi.org/10.1109/ISCA52012.2021.00028 Citation Details

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Modern workloads feature data-dependent computation and control flow patterns that often vary considerably with changing datasets and configuration parameters. This has greatly reduced the effectiveness of conventional compiler optimizations that rely on hoisting program invariants identifiable at compile-time to eliminate redundant computation and further apply machine-specific optimizations to the residual code.  Run-time optimizations have largely been profile-guided and conservative in nature, limiting their ability to adapt to changing execution profiles as datasets evolve, notwithstanding high profiling overheads.

The project makes the key observation that modern datasets offer remarkable predictability, because data access and control flow patterns in many workload-dataset pairs manifest as invariants across long execution intervals, unmasking an untapped pocket of opportunity for speculatively optimizing such code paths at run-time, by exposing and eliminating dead code, entirely within the processor, based on dynamically predicted data and control invariants, enabling the continuous optimization of inherently sequential code, without the need for profiling, compiler metadata, or other source-level information.

Speculative Superoptimization is a minimally-invasive, prediction-driven microarchitectural technique that advances state-of-the-art dynamic binary optimization schemes in several important ways. First, it leverages the rich contextual knowledge available from a wide array of in-processor speculation techniques such as branch prediction, address prediction, and value prediction to track deterministic computational patterns in hot code regions and further transform an instruction sequence in execution into a compact stream of speculatively optimized instructions.  Second, it opens the door to the deployment of aggressive and speculative optimization techniques that have been traditionally deemed unsafe, thanks to the processor's ability to rollback execution to a safe point in the event of a misprediction by flushing speculatively optimized instructions and redirecting execution to the corresponding stream of unoptimized instructions that are co-hosted in the processor's micro-op cache.  Third, by exploiting predictability as the basis for optimization, it offers seamless adaptability to changing datasets and workload patterns, unlike existing runtime optimizations employed by software giants such as Google, Microsoft, and Meta that either require extensive profiling or tend to make conservative assumptions when they lack adequate context about the dynamic execution behavior.

The results from this work have been disseminated through top-tier conference and journal publications, such as MICRO, ISCA, and USENIX Security, and through tech talks at industry and academic institutions alike.  In addition, the software artifacts for this work have been made available through Github, for greater community use.  Notably, the simulator release includes a faithful implementation of a modern x86 front-end (including the micro-op cache and micro-op fusion) along with new speculative features such as value prediction and loop stream detection, providing researchers a strong foundation to build on.

The PIs also regularly teach graduate and undergraduate Computer Architecture courses with a significant emphasis on quantitative methods and performance analysis.  The PIs also offer specialization courses on hardware security and acceleration.

The PIs are also extremely committed towards diversity efforts in Computer Science and other STEM disciplines.  They provided research mentorship to three graduate students from India, China, Iran, and the US, in addition to multiple undergraduate students, one of whom is a female student who was recognized nationally by CRA's outstanding undergraduate researcher award honorable mention.

 



Last Modified: 04/29/2024
Modified by: Ashish Venkat

Please report errors in award information by writing to: awardsearch@nsf.gov.

Print this page

Back to Top of page