Award Abstract # 2305491
CAREER: Automated and Efficient Machine Learning as a Service

NSF Org: CNS (Division of Computer and Network Systems)
Recipient: UNIVERSITY OF HOUSTON SYSTEM
Initial Amendment Date: May 12, 2023
Latest Amendment Date: September 19, 2024
Award Number: 2305491
Award Instrument: Continuing Grant
Program Manager: Daniel Andresen
 dandrese@nsf.gov
 (703) 292-2177
 CNS Division of Computer and Network Systems
 CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2022
End Date: June 30, 2026 (Estimated)
Total Intended Award Amount: $517,459.00
Total Awarded Amount to Date: $505,950.00
Funds Obligated to Date: FY 2021 = $189,601.00
FY 2023 = $103,921.00
FY 2024 = $212,428.00
History of Investigator:
  • Feng Yan (Principal Investigator)
    fyan5@central.uh.edu
Recipient Sponsored Research Office: University of Houston
4300 MARTIN LUTHER KING BLVD
HOUSTON
TX  US  77204-3067
(713)743-5773
Sponsor Congressional District: 18
Primary Place of Performance: University of Houston
4800 W CALHOUN ST STE 316
HOUSTON
TX  US  77204-3067
Primary Place of Performance Congressional District: 18
Unique Entity Identifier (UEI): QKWEF8XLMTT3
Parent UEI:
NSF Program(s): CSR-Computer Systems Research
Primary Program Source: 01002122DB NSF RESEARCH & RELATED ACTIVITIES
01002324DB NSF RESEARCH & RELATED ACTIVITIES
01002425DB NSF RESEARCH & RELATED ACTIVITIES
01002526DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 9150, 1045
Program Element Code(s): 735400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Machine-Learning-as-a-Service (MLaaS) is an emerging computing paradigm that provides optimized execution of machine learning tasks, such as model design, model training, and model serving, on cloud infrastructure. Explosive growth in model complexity and data size, along with surging demand for MLaaS, is already driving substantial increases in computational resource and energy requirements. Unfortunately, existing MLaaS systems manage resources poorly and offer limited support for user-specified performance and cost requirements, exacerbating the waste of computing resources and energy. This project aims to exploit the unique features of MLaaS to design efficient, automated, and user-centric MLaaS systems. This approach will significantly reduce resource waste and shorten model design cycles through a variety of novel optimization techniques and by eliminating candidate models that fail to meet model-serving latency and target-accuracy requirements. To support the complete MLaaS workflow, the project will also develop model-serving methodologies that meet service-level latency requirements with minimal resource consumption through intelligent autoscaling.
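To make the constraint-driven pruning described above concrete, the following minimal Python sketch (purely illustrative; the names, thresholds, and data are hypothetical and not drawn from the project's actual system) filters a pool of candidate models against user-specified serving-latency and accuracy targets before any further training or tuning budget is spent on them:

    from dataclasses import dataclass

    @dataclass
    class Candidate:
        name: str
        serving_latency_ms: float  # measured or predicted serving latency
        val_accuracy: float        # validation accuracy from a cheap proxy evaluation

    def prune_candidates(candidates, max_latency_ms, min_accuracy):
        # Drop models that cannot meet the user's latency or accuracy
        # targets, so no additional tuning effort is wasted on them.
        return [c for c in candidates
                if c.serving_latency_ms <= max_latency_ms
                and c.val_accuracy >= min_accuracy]

    # Hypothetical usage: keep only models that serve under 50 ms
    # while reaching at least 92% validation accuracy.
    pool = [Candidate("model-a", 18.0, 0.91),
            Candidate("model-b", 74.0, 0.95),
            Candidate("model-c", 32.0, 0.93)]
    print([c.name for c in prune_candidates(pool, 50.0, 0.92)])  # -> ['model-c']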

This project has the potential to substantially reduce the resource and energy consumption, as well as the carbon footprint, associated with the fast-growing societal demand for machine learning and cloud computing. It will produce important insights and technologies for resource management and energy savings in next-generation machine learning systems and cloud infrastructure. The findings will also contribute to the related fields of parallel and distributed systems, performance evaluation and optimization, and green computing. The project will carry out substantial integrated education activities, including the development of new courses and online education materials and the integration of industry feedback into the curriculum. Additionally, the work will train undergraduate and graduate students in the art of system optimization combined with the latest machine learning domain knowledge, while reaching out to and engaging students from underrepresented groups, especially women.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).


(Showing: 1 - 10 of 15)
Ali, Ahsan and Ma, Xiaolong and Zawad, Syed and Aditya, Paarijaat and Akkus, Istemi Ekin and Chen, Ruichuan and Yang, Lei and Yan, Feng "Enabling scalable and adaptive machine learning training via serverless computing on public cloud" Performance Evaluation, v.167, 2025. https://doi.org/10.1016/j.peva.2024.102451
Sajjadi Mohammadabadi, Seyed Mahmoud and Zawad, Syed and Yan, Feng and Yang, Lei "Speed Up Federated Learning in Heterogeneous Environments: A Dynamic Tiering Approach" IEEE Internet of Things Journal, v.12, 2025. https://doi.org/10.1109/JIOT.2024.3487473
Ma, Kai and Li, Cheng and Zhu, Enzuo and Chen, Ruichuan and Yan, Feng and Chen, Kang "Noctua: Towards Practical and Automated Fine-grained Consistency Analysis" European Conference on Computer Systems, 2024.
Ma, Xiaolong and Yan, Feng and Yang, Lei and Foster, Ian and Papka, Michael and Liu, Zhengchun and Kettimuthu, Rajkumar "MalleTrain: Deep Neural Networks Training on Unfillable Supercomputer Nodes" International Conference on Performance Engineering, 2024.
Sajjadi Mohammadabadi, Seyed Mahmoud and Yang, Lei and Yan, Feng and Zhang, Junshan "Communication-Efficient Training Workload Balancing for Decentralized Multi-Agent Learning" IEEE International Conference on Distributed Computing Systems (ICDCS), 2024. https://doi.org/10.1109/ICDCS60910.2024.00069
Shao, Xinyang and Wang, Yiduo and Li, Cheng and Liang, Hengyu and Wang, Chenhan and Yan, Feng and Xu, Yinlong "Towards Agile and Judicious Metadata Load Balancing for Ceph File System via Matrix-based Modeling" ACM Transactions on Storage, 2025. https://doi.org/10.1145/3721483
Tuli, Shreshth and Mirhakimi, Fatemeh and Pallewatta, Samodha and Zawad, Syed and Casale, Giuliano and Javadi, Bahman and Yan, Feng and Buyya, Rajkumar and Jennings, Nicholas R. "AI augmented Edge and Fog computing: Trends and challenges" Journal of Network and Computer Applications, v.216, 2023. https://doi.org/10.1016/j.jnca.2023.103648
Wang, Guanhua and Qin, Heyang and Jacobs, Sam and Wu, Xiaoxia and Holmes, Connor and Yao, Zhewei and Rajbhandari, Samyam and Ruwase, Olatunji and Yan, Feng and Yang, Lei and He, Yuxiong "ZeRO++: Extremely Efficient Collective Communication for Large Model Training" International Conference on Learning Representations, 2024.
Wang, Xinying and Wan, Lipeng and Klasky, Scott and Zhao, Dongfang and Yan, Feng "SciLance: Mitigate Load Imbalance for Parallel Scientific Applications in Cloud Environments" IEEE International Conference on Cluster Computing, 2023. https://doi.org/10.1109/CLUSTER52292.2023.00012
Wu, Hao and Wang, Shiyi and Bai, Youhui and Li, Cheng and Zhou, Quan and Yi, Jun and Yan, Feng and Chen, Ruichuan and Xu, Yinlong "A Generic, High-Performance, Compression-Aware Framework for Data Parallel DNN Training" IEEE Transactions on Parallel and Distributed Systems, 2024. https://doi.org/10.1109/TPDS.2023.3266246
Zawad, Syed and Ma, Xiaolong and Yi, Jun and Li, Cheng and Zhang, Minjia and Yang, Lei and Yan, Feng and He, Yuxiong "FedCust: Offloading hyperparameter customization for federated learning" Performance Evaluation, v.167, 2025. https://doi.org/10.1016/j.peva.2024.102450

