Award Abstract # 2004932
Collaborative Research: Frameworks: funcX: A Function Execution Service for Portability and Performance

NSF Org: OAC
Office of Advanced Cyberinfrastructure (OAC)
Recipient: UNIVERSITY OF ILLINOIS
Initial Amendment Date: April 17, 2020
Latest Amendment Date: April 17, 2020
Award Number: 2004932
Award Instrument: Standard Grant
Program Manager: Marlon Pierce
mpierce@nsf.gov
 (703)292-7743
OAC
 Office of Advanced Cyberinfrastructure (OAC)
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: May 1, 2020
End Date: April 30, 2025 (Estimated)
Total Intended Award Amount: $481,234.00
Total Awarded Amount to Date: $481,234.00
Funds Obligated to Date: FY 2020 = $481,234.00
History of Investigator:
  • Daniel Katz (Principal Investigator)
    dskatz@illinois.edu
Recipient Sponsored Research Office: University of Illinois at Urbana-Champaign
506 S WRIGHT ST
URBANA
IL  US  61801-3620
(217)333-2187
Sponsor Congressional District: 13
Primary Place of Performance: Board of Trustees of the University of Illinois
506 S Wright St
Urbana
IL  US  61801-3620
Primary Place of Performance Congressional District: 13
Unique Entity Identifier (UEI): Y8CWNJRCNN91
Parent UEI: V2PHZ2CSCH63
NSF Program(s): Software Institutes
Primary Program Source: 01002021DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7925, 077Z, 8004
Program Element Code(s): 800400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The funcX project is developing, deploying, and operating a new distributed computing cyberinfrastructure platform to enable researchers to build applications from programming functions that execute on different computing resources, from laptops to supercomputers. This cloud-hosted service democratizes access to advanced computing by providing intuitive interfaces for both registering remote computers as function executors and executing functions on these computers reliably, securely, and with high performance. Researchers can thus decompose monolithic applications into collections of reusable lightweight functions that can be run wherever makes the most sense, for example where data reside or where excess capacity is available. By simplifying access to specialized and high performance cyberinfrastructure and decreasing the time to discovery, the project serves the national interest, as stated in NSF's mission, by promoting the progress of science. A total of 33 diverse science, cyberinfrastructure, and software institute partners working with cutting-edge science applications and research cyberinfrastructure will directly benefit from the funcX platform.

This project develops funcX, a scalable and high-performance federated platform for managing the remote execution of (often short-duration) functions across diverse cyberinfrastructure systems, from edge accelerators to clusters, supercomputers, and clouds. funcX allows developers to decompose applications into collections of functions that can each be executed in the best location, in terms of cost, execution time, data movement costs, and/or energy consumption. It thus integrates the extreme convenience of the function as a service (FaaS) model, developed in industry for specific industry applications, with support for the specialized needs of scientific research. funcX addresses important barriers to these new uses of research cyberinfrastructure systems, by enabling the intuitive, flexible, and scalable execution of functions without regard to physical location, scheduler architecture, virtualization technology, administrative domain, or data location. Flexible open-source funcX agent software makes it easy to expose arbitrary computing systems as funcX computing platforms, thereby transforming existing cyberinfrastructure systems into high-performance function serving environments (endpoints). The cloud-hosted funcX service provides a REST interface for registering functions, discovering available endpoints, and managing the execution of functions on endpoints, all via a universal trust fabric and standard web authentication and authorization mechanisms. It dynamically creates and deploys containers that incorporate function dependencies and provide a secure and isolated environment for safe function execution. The project engages a diverse set of 11 science partners, 18 research computing and cyberinfrastructure projects, and 4 NSF Software Institutes, each supporting many NSF-funded researchers, to provide use cases for funcX, shape its design, and evaluate its implementation.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 22)
Ward, Logan and Pauloski, J. Gregory and Hayot-Sasson, Valerie and Chard, Ryan and Babuji, Yadu and Sivaraman, Ganesh and Choudhury, Sutanay and Chard, Kyle and Thakur, Rajeev and Foster, Ian "Cloud Services Enable Efficient AI-Guided Simulation Workflows across Heterogeneous Resources", 2023 https://doi.org/10.1109/IPDPSW59300.2023.00018
Vescovi, Rafael and Chard, Ryan and Saint, Nickolaus D. and Blaiszik, Ben and Pruyne, Jim and Bicer, Tekin and Lavens, Alex and Liu, Zhengchun and Papka, Michael E. and Narayanan, Suresh and Schwarz, Nicholas and Chard, Kyle and Foster, Ian T. "Linking scientific instruments and computation: Patterns, technologies, and experiences" Patterns, v.3, 2022 https://doi.org/10.1016/j.patter.2022.100606
Ananthakrishnan, Rachana and Babuji, Yadu and Baughman, Matt and Bryan, Josh and Chard, Kyle and Chard, Ryan and Clifford, Ben and Foster, Ian and Katz, Daniel S and Hunter Kesling, Kevin and Janidlo, Chris and Mello, Reid and Wang, Lei "Enabling Remote Management of FaaS Endpoints with Globus Compute Multi-User Endpoints", 2024 https://doi.org/10.1145/3626203.3670612
Woodard, Anna Elizabeth and Trisovic, Ana and Li, Zhuozhao and Babuji, Yadu and Chard, Ryan and Skluzacek, Tyler and Blaiszik, Ben and Katz, Daniel S. and Foster, Ian and Chard, Kyle "Real-time HEP analysis with funcX, a high-performance platform for function as a service" EPJ Web of Conferences, v.245, 2020 https://doi.org/10.1051/epjconf/202024507046
Li, Yifei and Chard, Ryan and Babuji, Yadu and Chard, Kyle and Foster, Ian and Li, Zhuozhao "UniFaaS: Programming across Distributed Cyberinfrastructure with Federated Function Serving", 2024 https://doi.org/10.1109/IPDPS57955.2024.00027
Baughman, Matt and Hudson, Nathaniel and Foster, Ian and Chard, Kyle "Balancing Federated Learning Trade-Offs for Heterogeneous Environments" IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops), 2023 https://doi.org/10.1109/PerComWorkshops56833.2023.10150228
Chard, Ryan and Babuji, Yadu and Li, Zhuozhao and Skluzacek, Tyler and Woodard, Anna and Blaiszik, Ben and Foster, Ian and Chard, Kyle "funcX: A Federated Function Serving Fabric for Science" Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing, 2020 https://doi.org/10.1145/3369583.3392683
Dhakal, Aditya and Raith, Philipp and Ward, Logan and Hong Enriquez, Rolando P. and Rattihalli, Gourav and Chard, Kyle and Foster, Ian and Milojicic, Dejan "Fine-grained accelerator partitioning for Machine Learning and Scientific Computing in Function as a Service Platform", 2023 https://doi.org/10.1145/3624062.3624238
Kotsehub, Nikita and Baughman, Matt and Chard, Ryan and Hudson, Nathaniel and Patros, Panos and Rana, Omer and Foster, Ian and Chard, Kyle "FLoX: Federated Learning with FaaS at the Edge" 18th International Conference on e-Science (e-Science), 2022 https://doi.org/10.1109/eScience55777.2022.00016
Kumar, Rohan and Baughman, Matt and Chard, Ryan and Li, Zhuozhao and Babuji, Yadu and Foster, Ian and Chard, Kyle "Coding the Computing Continuum: Fluid Function Execution in Heterogeneous Computing Environments" 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2021 https://doi.org/10.1109/IPDPSW52791.2021.00018
Bauer, André and Gonthier, Maxime and Pan, Haochen and Chard, Ryan and Grzenda, Daniel and Straesser, Martin and Pauloski, J Gregory and Kamatar, Alok and Baughman, Matt and Hudson, Nathaniel and Foster, Ian and Chard, Kyle "An Empirical Investigation of Container Building Strategies and Warm Times to Reduce Cold Starts in Scientific Computing Serverless Functions", 2024 https://doi.org/10.1109/e-Science62913.2024.10678668

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project has developed, deployed, and hosted a federated function-as-a-service (FaaS) platform designed to address the need to reliably, securely, and scalably execute scientific workloads on remote and heterogeneous computing systems. The platform was initially developed as a research project under the name funcX, with the plan to transition it into an operational service as part of Globus once proven; it is now named Globus Compute. Globus Compute is implemented as a cloud-hosted service, providing an intuitive interface for researchers to register and share functions, register and manage remote computing endpoints, and execute registered functions on endpoints. Authorized users can register a function, such as F(x), optionally supplying a list of dependencies so that F can be deployed in a containerized environment on different computers; they can then use a simple Python SDK to invoke F(x) from within their applications on any computer where F is deployed.
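The register-then-invoke pattern described above can be illustrated with a small, purely local sketch. It is not the Globus Compute SDK and does not contact any service: the names ToyService, ToyEndpoint, register_function, register_endpoint, and run are hypothetical, and the "endpoint" is only a local thread pool standing in for a remote computer.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor


class ToyEndpoint:
    """Hypothetical stand-in for a remote computer; here just a local thread pool."""

    def __init__(self, name):
        self.name = name
        self._pool = ThreadPoolExecutor(max_workers=2)

    def execute(self, func, *args, **kwargs) -> Future:
        # Run the function asynchronously and hand back a future for its result.
        return self._pool.submit(func, *args, **kwargs)


class ToyService:
    """Hypothetical in-memory registry of functions and endpoints (not the real API)."""

    def __init__(self):
        self._functions = {}
        self._endpoints = {}

    def register_function(self, func):
        # Registration returns an identifier that callers use later.
        function_id = str(uuid.uuid4())
        self._functions[function_id] = func
        return function_id

    def register_endpoint(self, endpoint):
        endpoint_id = str(uuid.uuid4())
        self._endpoints[endpoint_id] = endpoint
        return endpoint_id

    def run(self, function_id, endpoint_id, *args, **kwargs) -> Future:
        # Look up both identifiers, then dispatch the function to the endpoint.
        func = self._functions[function_id]
        endpoint = self._endpoints[endpoint_id]
        return endpoint.execute(func, *args, **kwargs)


def square(x):
    """The F(x) of this example."""
    return x * x


service = ToyService()
fid = service.register_function(square)
eid = service.register_endpoint(ToyEndpoint("laptop"))
future = service.run(fid, eid, 7)
print(future.result())  # 49
```

The point of the sketch is the separation of concerns: a function and an endpoint are registered independently, and any registered function can then be dispatched to any registered endpoint by identifier.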

Globus Compute functions can represent various types of workloads, from single tasks run interactively to enormous numbers of tasks run programmatically, leveraged by applications, shared with others, or linked together into complex workflows. In each case, Globus Compute manages the complexity associated with remote execution, for example, by leveraging web-standard security protocols for secure execution, providing a common interface across heterogeneous computing systems (e.g., schedulers, hardware architectures, container technology), reliably executing workloads even when there are errors or systems are offline, managing execution environments by dynamically creating and deploying containers, elastically scaling resources in response to workload, and, where permitted, sharing access to functions and computing resources.

Overall, this project has developed Globus Compute, which has defined a new serverless computing model for federated computing systems, adapting the powerful FaaS model from one built on centralized cloud platforms to one in which the compute infrastructure is distributed across a heterogeneous set of computing resources. The service has cumulatively supported more than 1,300 users, who deployed more than 1.3 million endpoints and executed more than 54 million functions.

This model creates new opportunities for innovation in distributed systems, enabling functions to be easily (and even dynamically) routed to different endpoints based on real-time needs, for example where resources are available, where users have allocations, or where data is located. The project has worked with computer scientists across the country and around the world to explore the use of this model in external systems and to improve the performance of the Globus Compute system. The project team provided 21 tutorials and webinars reaching more than 650 participants. They also published 22 journal, conference, and workshop papers. More than 350 students participated in courses using funcX/Globus Compute, and 2-4 high school and undergraduate students participated in the project.
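The routing idea mentioned in this section, sending a function to wherever resources are available, where the user has an allocation, or where the data is located, can be sketched as a simple selection rule. This is an illustrative assumption, not the policy used by Globus Compute: the EndpointInfo fields, the choose_endpoint function, and the endpoint and dataset names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EndpointInfo:
    """Hypothetical snapshot of an endpoint's state at routing time."""
    name: str
    idle_workers: int
    has_allocation: bool
    datasets: set = field(default_factory=set)


def choose_endpoint(endpoints, dataset):
    """Pick an endpoint for a task that reads `dataset`.

    Assumed rule: only endpoints where the user has an allocation and at
    least one idle worker are usable; among those, prefer one that already
    holds the dataset, then the one with the most idle workers.
    """
    usable = [e for e in endpoints if e.has_allocation and e.idle_workers > 0]
    if not usable:
        return None
    return max(usable, key=lambda e: (dataset in e.datasets, e.idle_workers))


endpoints = [
    EndpointInfo("cluster", idle_workers=64, has_allocation=True),
    EndpointInfo("instrument-edge", idle_workers=2, has_allocation=True,
                 datasets={"scan-001"}),
    EndpointInfo("supercomputer", idle_workers=512, has_allocation=False),
]

print(choose_endpoint(endpoints, "scan-001").name)  # instrument-edge
print(choose_endpoint(endpoints, "scan-999").name)  # cluster
```

In this sketch data locality outranks spare capacity, so the small edge endpoint wins when it already holds the data; a real policy would likely also weigh transfer cost, queue time, and energy use.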

 


Last Modified: 06/11/2025
Modified by: Daniel S Katz

Please report errors in award information by writing to: awardsearch@nsf.gov.
