Award Abstract # 2404373
III: Small: Strategically Transforming Code Across SQL and User-Defined Function Boundaries to Enable Effective Optimizations

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: CARNEGIE MELLON UNIVERSITY
Initial Amendment Date: August 26, 2024
Latest Amendment Date: August 26, 2024
Award Number: 2404373
Award Instrument: Standard Grant
Program Manager: Sorin Draghici
sdraghic@nsf.gov
 (703)292-2232
IIS
 Division of Information & Intelligent Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2024
End Date: September 30, 2027 (Estimated)
Total Intended Award Amount: $599,639.00
Total Awarded Amount to Date: $599,639.00
Funds Obligated to Date: FY 2024 = $599,639.00
History of Investigator:
  • Andrew Pavlo (Principal Investigator)
    pavlo@sydht.ai
  • Todd Mowry (Co-Principal Investigator)
Recipient Sponsored Research Office: Carnegie-Mellon University
5000 FORBES AVE
PITTSBURGH
PA  US  15213-3815
(412)268-8746
Sponsor Congressional District: 12
Primary Place of Performance: Carnegie-Mellon University
5000 FORBES AVE
PITTSBURGH
PA  US  15213-3815
Primary Place of Performance Congressional District: 12
Unique Entity Identifier (UEI): U3NKNFLNQ613
Parent UEI: U3NKNFLNQ613
NSF Program(s): Info Integration & Informatics
Primary Program Source: 01002425DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7923, 7364
Program Element Code(s): 736400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Modern software applications in all facets of society, including commercial, scientific, and non-profit enterprises, rely on databases to store information. These organizations often want to use their data in ways they cannot easily express with existing database query languages, especially in the context of artificial intelligence and data science applications. This mismatch means such applications wait longer for answers about their data, inhibiting them from reacting to changes as quickly as possible and impeding their goals. This research addresses this problem and develops foundational techniques that automatically remove such inefficiencies without requiring organizations to perform costly rewrites of their application code. It enables organizations to ask more complex questions about their data and extrapolate new knowledge from it, all while using fewer computing and energy resources than today's systems.

Many database management systems (DBMSs) extend the query language SQL to support user-defined functions (UDFs) written in procedural programming languages. Despite their software engineering advantages, UDFs are notoriously difficult to optimize within database systems, and DBMSs often resort to executing them iteratively (row-by-row). This project focuses on developing optimization approaches to overcome SQL and UDF boundaries via automatic code transformations that pass critical information between them to enable more effective query planning and compilation. These strategies include methods for programmatically deconstructing UDFs into smaller pieces, manipulating them individually, and reconstructing them into the calling query to optimize performance. This project will address three fundamental research challenges: (1) improving the performance of UDFs without requiring modifications to the application code, (2) optimizing external language UDFs (e.g., Python) that rely on dynamic types and library calls, and (3) generating new optimizations that leverage information about UDFs across the entire lifecycle of a query and multiple invocations. By eliminating performance penalties associated with UDFs, this research will enable organizations to improve the efficiency of applications and support more complex workloads, including leveraging machine learning and data science libraries.
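
As a small illustration of the SQL/UDF boundary described above, the following Python sketch uses the standard-library sqlite3 module to contrast a query that calls a procedural UDF once per row with a hand-written equivalent in which the UDF's logic has been inlined into the query as a CASE expression. The table name `orders`, the function `discount`, and the helper names are hypothetical and chosen only for this example. The inlined form is written by hand here; it is not the project's transformation technique, only an example of the kind of equivalence such a technique would need to establish automatically.

```python
import sqlite3

# Counts how many times the engine crosses into the Python UDF.
call_count = {"n": 0}


def discount(amount):
    """Hypothetical procedural UDF: 10% off any amount above 100."""
    call_count["n"] += 1
    if amount > 100:
        return amount * 0.9
    return amount


def build_db():
    """Create an in-memory database with a tiny, made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany(
        "INSERT INTO orders (amount) VALUES (?)",
        [(50.0,), (150.0,), (300.0,)],
    )
    return conn


def total_with_udf(conn):
    """The query calls the UDF, which the engine evaluates row by row."""
    conn.create_function("discount", 1, discount)
    row = conn.execute("SELECT SUM(discount(amount)) FROM orders").fetchone()
    return row[0]


def total_inlined(conn):
    """Same result with the UDF body rewritten by hand as a SQL expression,
    so the engine sees the whole computation and never leaves SQL."""
    row = conn.execute(
        "SELECT SUM(CASE WHEN amount > 100 THEN amount * 0.9 ELSE amount END) "
        "FROM orders"
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_db()
    print("UDF total:    ", total_with_udf(conn))
    print("UDF calls:    ", call_count["n"])
    print("Inlined total:", total_inlined(conn))
```

Both queries return the same total, but the first forces the engine to call out to Python for every row, while the second exposes the conditional to the SQL engine directly.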

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Please report errors in award information by writing to: awardsearch@nsf.gov.
