TY - JOUR
T1 - Value Decomposition-Based Multi-Agent Learning for Anesthetics Collaborative Control
AU - Li, Huijie
AU - Yu, Yide
AU - Shi, Si
AU - Hu, Anmin
AU - Huo, Jian
AU - Lin, Wei
AU - Wu, Chaoran
AU - Luo, Wuman
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2025
Y1 - 2025
N2 - Automated control of personalized multiple anesthetics in clinical Total Intravenous Anesthesia (TIVA) is crucial yet challenging. Current systems, including target-controlled infusion (TCI) and closed-loop systems, either rely on relatively static pharmacokinetic/pharmacodynamic (PK/PD) models or focus on single-anesthetic control, limiting both personalization and collaborative control. To address these issues, we propose a novel Value Decomposition Multi-Agent Deep Reinforcement Learning (VD-MADRL) framework based on a Markov Game (MG) for Personalized Multiple Anesthetics Control in a Closed-Loop system (PMAC-CL). VD-MADRL optimizes the collaboration between two anesthetics, propofol (Agent I) and remifentanil (Agent II), by leveraging an MG to identify optimal actions among heterogeneous agents. We employ various value function decomposition methods to resolve the credit allocation problem and enhance collaborative control. We also introduce a multivariate environment model based on random forest (RF) for anesthesia state simulation. To ensure data validity, we design a data resampling and alignment technique to synchronize trajectory data from different devices, avoiding gradient explosion and maintaining conformity to the Markov property. Extensive experiments on general and thoracic surgery datasets demonstrate that VD-MADRL provides more refined dose adjustments and maintains multiple anesthesia state indicators more stably at target levels compared to human experience. Notably, the best-performing algorithm, VDN in general surgery with online training, achieved a 16.4% increase in cumulative reward (CR) and a 58.0% reduction in mean MDPE compared to human experience.
AB - Automated control of personalized multiple anesthetics in clinical Total Intravenous Anesthesia (TIVA) is crucial yet challenging. Current systems, including target-controlled infusion (TCI) and closed-loop systems, either rely on relatively static pharmacokinetic/pharmacodynamic (PK/PD) models or focus on single-anesthetic control, limiting both personalization and collaborative control. To address these issues, we propose a novel Value Decomposition Multi-Agent Deep Reinforcement Learning (VD-MADRL) framework based on a Markov Game (MG) for Personalized Multiple Anesthetics Control in a Closed-Loop system (PMAC-CL). VD-MADRL optimizes the collaboration between two anesthetics, propofol (Agent I) and remifentanil (Agent II), by leveraging an MG to identify optimal actions among heterogeneous agents. We employ various value function decomposition methods to resolve the credit allocation problem and enhance collaborative control. We also introduce a multivariate environment model based on random forest (RF) for anesthesia state simulation. To ensure data validity, we design a data resampling and alignment technique to synchronize trajectory data from different devices, avoiding gradient explosion and maintaining conformity to the Markov property. Extensive experiments on general and thoracic surgery datasets demonstrate that VD-MADRL provides more refined dose adjustments and maintains multiple anesthesia state indicators more stably at target levels compared to human experience. Notably, the best-performing algorithm, VDN in general surgery with online training, achieved a 16.4% increase in cumulative reward (CR) and a 58.0% reduction in mean MDPE compared to human experience.
KW - multi-agent deep reinforcement learning
KW - multiple anesthesia states
KW - personalized anesthesia
KW - value function decomposition
UR - https://www.scopus.com/pages/publications/105014002627
U2 - 10.1109/JBHI.2025.3599210
DO - 10.1109/JBHI.2025.3599210
M3 - Article
C2 - 40828726
AN - SCOPUS:105014002627
SN - 2168-2194
JO - IEEE Journal of Biomedical and Health Informatics
JF - IEEE Journal of Biomedical and Health Informatics
ER -