Award Abstract # 2144316
CAREER: Lyapunov Drift Methods for Stochastic Recursions: Applications in Cloud Computing and Reinforcement Learning

NSF Org: ECCS
Division of Electrical, Communications and Cyber Systems
Recipient: GEORGIA TECH RESEARCH CORP
Initial Amendment Date: December 22, 2021
Latest Amendment Date: May 16, 2023
Award Number: 2144316
Award Instrument: Continuing Grant
Program Manager: Eyad Abed
eabed@nsf.gov
(703) 292-2303
ECCS - Division of Electrical, Communications and Cyber Systems
ENG - Directorate for Engineering
Start Date: May 1, 2022
End Date: April 30, 2027 (Estimated)
Total Intended Award Amount: $500,000.00
Total Awarded Amount to Date: $560,000.00
Funds Obligated to Date: FY 2022 = $469,539.00
FY 2023 = $90,461.00
History of Investigator:
  • Siva Theja Maguluri (Principal Investigator)
    siva.theja@gatech.edu
Recipient Sponsored Research Office: Georgia Tech Research Corporation
926 DALNEY ST NW
ATLANTA
GA  US  30318-6395
(404)894-4819
Sponsor Congressional District: 05
Primary Place of Performance: Georgia Institute of Technology
225 North Avenue NW
Atlanta
GA  US  30332-0002
Primary Place of Performance Congressional District: 05
Unique Entity Identifier (UEI): EMW9FC8J3HN4
Parent UEI: EMW9FC8J3HN4
NSF Program(s): GVF - Global Venture Fund,
EPCN - Energy, Power, Control, and Networks
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVITIES
01002324DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 1045, 120Z, 7607
Program Element Code(s): 054Y00, 760700
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.041, 47.079

ABSTRACT

Part I:
The ongoing Artificial Intelligence revolution is possible due to progress in two distinct areas. The first is the development of novel algorithms in machine learning paradigms, such as Reinforcement Learning, that overcome long-standing challenges; the second is breakthroughs in cloud computing infrastructure, based on large data centers, that make it possible to collect, store, and process large amounts of data easily and at short notice. In spite of tremendous success stories in both areas, fundamental trade-offs and optimal performance are not understood, and theory lags far behind practice. Although they appear to be very distinct problems, both Reinforcement Learning and cloud computing can be studied using stochastic recursions. The goal of this CAREER project is to take a unified theoretical viewpoint of these two seemingly distinct areas, first developing a general theory of stochastic recursions, and then using it to study both Reinforcement Learning and cloud computing. In particular, we will use the theory to develop novel learning algorithms with provably optimal sample complexity across paradigms such as off-policy learning and the actor-critic framework. The theory of stochastic recursions, as well as the novel learning algorithms, will also be used to develop optimal scheduling algorithms for cloud computing data centers that minimize the tail of the delay experienced by users.
The novel algorithms developed during the course of this project will be implemented through collaborations with partners in industry as well as on Georgia Tech's internal cloud. A Jupyter-based open-source RL simulation platform will be developed, and the novel algorithms from this project will be included in it. The platform will be used not only to disseminate the outcomes of this project, but also for undergraduate research projects, course projects for a new course on Reinforcement Learning, and STEM outreach activities in K-12 education. In addition to disseminating research results through conference and journal publications, we will develop a novel special topics course and bring out a monograph on the unified Lyapunov framework for stochastic recursions. Training of graduate and undergraduate students also forms a core part of the project, with special emphasis on mentoring future faculty.

Part II:
Intellectual Merit:
The proposed work is organized into three interdependent thrusts.
Thrust I builds a Lyapunov theory of stochastic recursions, in which we obtain finite-time mean-square error and exponential tail bounds, and characterize the steady-state limiting distribution, for a broad class of stochastic recursions. This thrust forms the foundation for the next two thrusts.
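To fix ideas, the following is a minimal sketch, with assumed notation rather than the project's exact formulation, of the kind of recursion and Lyapunov drift condition this thrust studies, and of how a one-step drift inequality unrolls into a finite-time mean-square error bound:

% A generic stochastic recursion with step sizes \alpha_k, noise w_{k+1},
% and a fixed point x^* satisfying F(x^*) = 0 (notation assumed for illustration):
x_{k+1} = x_k + \alpha_k \left( F(x_k) + w_{k+1} \right).
% A one-step negative-drift condition on a Lyapunov function V,
% e.g. V(x) = \|x - x^*\|^2, for some constants c, C > 0:
\mathbb{E}\left[ V(x_{k+1}) \mid x_k \right] \le (1 - c\,\alpha_k)\, V(x_k) + C\,\alpha_k^2.
% Taking expectations and unrolling over k steps yields a finite-time mean-square bound:
\mathbb{E}\left[ V(x_k) \right] \le \prod_{j=0}^{k-1} (1 - c\,\alpha_j)\, \mathbb{E}\left[ V(x_0) \right]
  + C \sum_{j=0}^{k-1} \alpha_j^2 \prod_{i=j+1}^{k-1} (1 - c\,\alpha_i).

With a constant step size \alpha_k = \alpha, this reduces to an exponentially decaying bias term plus an O(\alpha) noise term, which is the flavor of finite-time guarantee sought in this thrust.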
Thrust II studies the finite-time mean-square bounds, tail-probability bounds (i.e., PAC bounds), sample complexity, and steady-state behavior of RL algorithms under three paradigms, viz., off-policy RL, two-time-scale policy-space algorithms (such as actor-critic), and average-reward RL, and develops novel, fast RL algorithms with near-optimal sample efficiency.
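As a concrete illustration of an RL algorithm that fits the stochastic-recursion template above, here is a minimal sketch of tabular Q-learning with a uniform behavior policy; the MDP interface, step-size schedule, and parameters are assumptions made for illustration, not a specific algorithm proposed in the project:

import numpy as np

def q_learning(P, R, gamma=0.9, alpha0=0.5, num_steps=10_000, seed=0):
    """Tabular Q-learning as a stochastic recursion:
    Q(s, a) <- Q(s, a) + alpha_k * (r + gamma * max_a' Q(s', a') - Q(s, a)).

    P[s, a] is a probability vector over next states and R[s, a] is the
    one-step reward (an illustrative MDP interface, assumed for this sketch).
    """
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    Q = np.zeros((n_states, n_actions))
    s = 0
    for k in range(num_steps):
        a = rng.integers(n_actions)                # uniform behavior policy (off-policy)
        s_next = rng.choice(n_states, p=P[s, a])   # sample the next state
        alpha_k = alpha0 / (1 + k / 1000)          # diminishing step size
        td_error = R[s, a] + gamma * Q[s_next].max() - Q[s, a]
        Q[s, a] += alpha_k * td_error              # the stochastic recursion step
        s = s_next
    return Q

# Example usage on a randomly generated 5-state, 2-action MDP:
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(5), size=(5, 2))         # transition probabilities
R = rng.random((5, 2))                             # rewards in [0, 1)
Q_hat = q_learning(P, R)

Each update is driven by the temporal-difference error, so drift arguments of the kind developed in Thrust I can be used to bound the mean-square error of such iterates after finitely many samples.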
Thrust III studies scheduling problems in data center networks, with the goal of minimizing mean delay and delay tails. Using the Lyapunov theory from Thrust I, we develop novel low-complexity algorithms with provable guarantees on steady-state delay in the heavy-traffic asymptotic regime. With these as initial policies, we will deploy RL algorithms from Thrust II to learn new scheduling policies that are optimal even in the pre-asymptotic regime, which is of practical interest. All the proposed algorithms will be evaluated on real-world traffic traces through our collaborations with industry partners.
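For concreteness, here is a minimal sketch of a MaxWeight-style scheduler for an N x N input-queued switch simulated with Bernoulli arrivals; the arrival model, the use of SciPy's assignment solver, and all parameters are assumptions made for illustration, and this is not one of the specific algorithms to be developed in this thrust:

import numpy as np
from scipy.optimize import linear_sum_assignment

def maxweight_switch(arrival_rates, num_slots=50_000, seed=0):
    """Simulate MaxWeight scheduling in an N x N input-queued switch.

    In every time slot, each virtual output queue (i, j) receives a packet
    with probability arrival_rates[i, j]; the scheduler then serves the
    matching of inputs to outputs with the largest total queue length
    (a maximum-weight bipartite matching), removing one packet per matched pair.
    """
    rng = np.random.default_rng(seed)
    n = arrival_rates.shape[0]
    q = np.zeros((n, n), dtype=int)
    total_queue = 0.0
    for _ in range(num_slots):
        q += rng.random((n, n)) < arrival_rates     # Bernoulli arrivals
        rows, cols = linear_sum_assignment(-q)      # max-weight matching (minimize -q)
        service = np.zeros_like(q)
        service[rows, cols] = 1
        q = np.maximum(q - service, 0)              # serve one packet per matched pair
        total_queue += q.sum()
    return total_queue / num_slots                  # time-averaged total queue length

# Example usage: a 3 x 3 switch loaded to 90% of capacity on every port.
avg_q = maxweight_switch(np.full((3, 3), 0.3))

By Little's law, the time-averaged queue length is proportional to mean delay, and policies of this type are natural initial policies for the RL-based refinement described above.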
Broader Impacts:
The proposed work, and the PI's ongoing industry collaborations, have the potential for significant societal impact by making RL and cloud computing more efficient. The proposed Lyapunov theory for stochastic recursions is applicable in many other disciplines, so the PI will disseminate it widely through a special topics course, a monograph, and tutorials, in addition to conference and journal publications.
The project integrates research with educational activities at every level. A Jupyter-based RL simulation platform and a library of notebooks that we will build will serve as an extensive pedagogical resource for these activities. The PI will continue his ongoing involvement in undergraduate research through the REU program and the VIP program at Georgia Tech. To meet growing demand, the PI will develop a new interdisciplinary undergraduate-level RL course that makes extensive use of the RL simulation platform. To promote STEM activities, the PI will take part in outreach to local high schools, working with an academic professional in ISyE, and will mentor high school teachers through the GIFT program. To support Ph.D. students interested in an academic career, the PI runs a future-faculty mentorship program. The PI is committed to broadening participation, currently advises a female Hispanic student, and has advised several URM undergraduate students.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 12)
Chen, Zaiwei and Clarke, John-Paul and Maguluri, Siva Theja "Target Network and Truncation Overcome the Deadly Triad in Q-Learning" SIAM Journal on Mathematics of Data Science, v.5, 2023. https://doi.org/10.1137/22M1499261
Chen, Zaiwei and Maguluri, Siva T. and Shakkottai, Sanjay and Shanmugam, Karthikeyan "A Lyapunov Theory for Finite-Sample Guarantees of Markovian Stochastic Approximation" Operations Research, 2023. https://doi.org/10.1287/opre.2022.0249
Chen, Zaiwei and Zhang, Sheng and Doan, Thinh T. and Clarke, John-Paul and Maguluri, Siva Theja "Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning" Automatica, v.146, 2022. https://doi.org/10.1016/j.automatica.2022.110623
Hurtado-Lange, Daniela and Varma, Sushil Mahavir and Maguluri, Siva Theja "Logarithmic heavy traffic error bounds in generalized switch and load balancing systems" Journal of Applied Probability, v.59, 2022. https://doi.org/10.1017/jpr.2021.82
Khodadadian, Sajad and Doan, Thinh T. and Romberg, Justin and Maguluri, Siva Theja "Finite Sample Analysis of Two-Time-Scale Natural Actor-Critic Algorithm" IEEE Transactions on Automatic Control, 2022. https://doi.org/10.1109/TAC.2022.3190032
Khodadadian, Sajad and Sharma, Pranay and Joshi, Gauri and Maguluri, Siva Theja "Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling" International Conference on Machine Learning, 2022.
Krishnan K. S., Ashok and Singh, Chandramani and Maguluri, Siva Theja and Parag, Parimal "Optimal Pricing in a Single Server System" ACM Transactions on Modeling and Performance Evaluation of Computing Systems, v.8, 2023. https://doi.org/10.1145/3607252
Mou, Shancong and Maguluri, Siva Theja "Heavy-traffic queue length behavior in a switch under Markovian arrivals" Advances in Applied Probability, 2024. https://doi.org/10.1017/apr.2023.60
Nguyen, Hoang and Maguluri, Siva Theja "Stochastic Approximation for Nonlinear Discrete Stochastic Control: Finite-Sample Bounds for Exponentially Stable Systems" 2023. https://doi.org/10.1109/CDC49753.2023.10384244
Jhunjhunwala, Prakirt Raj and Hurtado-Lange, Daniela and Maguluri, Siva Theja "Exponential Tail Bounds on Queues: A Confluence of Non-Asymptotic Heavy Traffic and Large Deviations" ACM SIGMETRICS Performance Evaluation Review, v.51, 2024. https://doi.org/10.1145/3649477.3649488
Jhunjhunwala, Prakirt Raj and Maguluri, Siva Theja "Heavy Traffic Joint Queue Length Distribution without Resource Pooling" ACM SIGMETRICS Performance Evaluation Review, v.51, 2024. https://doi.org/10.1145/3649477.3649487
