Sequential State Q-learning Uplink Resource Allocation in Multi-AP 802.11be Network

Yue Liu, Yide Yu, Zhenyu Du, Laurie Cuthbert

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

4 Citations (Scopus)


Expected high demand from user applications is a driver for WLANs to share radio resources more efficiently. The move to 802.11be with OFDMA and MU-MIMO makes Radio Resource Management (RRM) a multi-dimensional problem in a complex wireless environment. Traditionally, the way that an RRM problem is formulated always leads to either a large state space or a large action space, which makes reinforcement learning impractical to apply. In this paper, we propose a Sequential State Q-learning (SSQL) algorithm that solves Resource Unit (RU) allocation for scheduled uplink transmission, with the aim of maximizing system bitrate in a multi-AP 802.11be OFDMA network. The AP acts as the agent, with the stations it serves as 'states' and their RU allocations as 'actions'. The AP observes the wireless environment, continuously refreshes the Q-values of the state-action pairs, and outputs the RU allocation that optimizes the objective. Through simulations, we demonstrate that SSQL achieves 89.67% of the global optimum with very fast convergence, which makes it more practical for use in varying wireless networks.
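The state/action framing in the abstract can be illustrated with a minimal tabular Q-learning sketch. It is not the paper's algorithm: the abstract does not give the reward definition, update schedule, or multi-AP interference model, so everything below is an assumption made for illustration. The sketch uses a single AP, a random toy rate table (`toy_rates`), a collision rule under which an RU already claimed in the same pass yields zero reward, and hypothetical hyperparameters (`alpha`, `gamma`, `eps`). The AP visits its stations one after another (the "sequential states") and picks an RU for each (the "action").

```python
import itertools
import random


def toy_rates(n_sta, n_ru, seed=0):
    """Assumed toy model: achievable bitrate of each station on each RU."""
    rng = random.Random(seed)
    return [[rng.uniform(1.0, 10.0) for _ in range(n_ru)] for _ in range(n_sta)]


def episode_bitrate(rates, allocation):
    """Total bitrate of an allocation; a station that picks an RU already
    claimed by an earlier station contributes 0 (assumed collision rule)."""
    used, total = set(), 0.0
    for sta, ru in enumerate(allocation):
        if ru not in used:
            total += rates[sta][ru]
            used.add(ru)
    return total


def train_ssql_sketch(rates, episodes=2000, alpha=0.1, gamma=0.9, eps=0.2, seed=1):
    """Tabular Q-learning where state = station index, action = RU index."""
    rng = random.Random(seed)
    n_sta, n_ru = len(rates), len(rates[0])
    q = [[0.0] * n_ru for _ in range(n_sta)]
    for _ in range(episodes):
        used = set()
        for sta in range(n_sta):  # visit stations sequentially
            if rng.random() < eps:  # explore
                ru = rng.randrange(n_ru)
            else:  # exploit the current Q-values
                ru = max(range(n_ru), key=lambda a: q[sta][a])
            reward = 0.0 if ru in used else rates[sta][ru]
            used.add(ru)
            # Bootstrap from the next station's best Q-value (0 at the end).
            nxt = max(q[sta + 1]) if sta + 1 < n_sta else 0.0
            q[sta][ru] += alpha * (reward + gamma * nxt - q[sta][ru])
    return q


def greedy_allocation(q):
    """RU with the highest Q-value for each station."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


def brute_force_best(rates):
    """Exhaustive optimum, feasible only for very small instances."""
    n_sta, n_ru = len(rates), len(rates[0])
    return max(
        episode_bitrate(rates, alloc)
        for alloc in itertools.product(range(n_ru), repeat=n_sta)
    )


if __name__ == "__main__":
    rates = toy_rates(n_sta=4, n_ru=4)
    alloc = greedy_allocation(train_ssql_sketch(rates))
    ratio = episode_bitrate(rates, alloc) / brute_force_best(rates)
    print(f"allocation={alloc}, fraction of brute-force optimum={ratio:.3f}")
```

In this toy framing the Q-table has one entry per (station, RU) pair, whereas the exhaustive search enumerates every joint allocation, which illustrates why a sequential formulation can keep the state and action spaces small. The ratio printed here depends entirely on the assumed toy model and is not comparable to the 89.67% figure reported in the abstract.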

Original language: English
Title of host publication: 2022 IEEE 96th Vehicular Technology Conference, VTC 2022-Fall 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665454681
Publication status: Published - 2022
Event: 96th IEEE Vehicular Technology Conference, VTC 2022-Fall 2022 - London, United Kingdom
Duration: 26 Sept 2022 to 29 Sept 2022

Publication series

Name: IEEE Vehicular Technology Conference
ISSN (Print): 1550-2252


Conference: 96th IEEE Vehicular Technology Conference, VTC 2022-Fall 2022
Country/Territory: United Kingdom


  • IEEE 802.11be
  • Markov Decision Process
  • Q-learning
  • Radio Resource Management

