Value Decomposition-Based Multi-Agent Learning for Anesthetics Collaborative Control

Huijie Li, Yide Yu, Si Shi, Anmin Hu, Jian Huo, Wei Lin, Chaoran Wu, Wuman Luo

Research output: Contribution to journal › Article › peer-review

Abstract

Automated, personalized control of multiple anesthetics in clinical Total Intravenous Anesthesia (TIVA) is crucial yet challenging. Current systems, including target-controlled infusion (TCI) and closed-loop systems, either rely on relatively static pharmacokinetic/pharmacodynamic (PK/PD) models or focus on controlling a single anesthetic, which limits both personalization and collaborative control. To address these issues, we propose a novel Value Decomposition Multi-Agent Deep Reinforcement Learning (VD-MADRL) framework based on a Markov Game (MG) for Personalized Multiple Anesthetics Control in a Closed-Loop system (PMAC-CL). VD-MADRL optimizes the collaboration between two anesthetics, propofol (Agent I) and remifentanil (Agent II), by leveraging an MG to identify optimal actions among heterogeneous agents. We employ various value function decomposition methods to resolve the credit allocation problem and enhance collaborative control. We also introduce a multivariate environment model based on random forest (RF) for anesthesia state simulation. To ensure data validity, we design a data resampling and alignment technique that synchronizes trajectory data from different devices, avoiding gradient explosion and maintaining conformity to the Markov property. Extensive experiments on general and thoracic surgery datasets demonstrate that VD-MADRL provides more refined dose adjustments and maintains multiple anesthesia state indicators more stably at target levels compared to human experience. In particular, the best-performing algorithm, VDN in general surgery with online training, achieved a 16.4% increase in cumulative reward (CR) and a 58.0% reduction in mean MDPE compared to human experience.
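
The value decomposition idea named in the abstract can be illustrated with a minimal sketch of the additive form used by VDN, where the joint action value is assumed to be the sum of per-agent values. This is not the authors' implementation: the function names, the three-level action set (decrease, hold, increase), and the Q-values below are hypothetical and chosen only for illustration.

```python
from itertools import product

def vdn_joint_q(q1, q2, a1, a2):
    """Additive (VDN-style) joint value: Q_tot(a1, a2) = Q_1(a1) + Q_2(a2)."""
    return q1[a1] + q2[a2]

def greedy_joint_action(q1, q2):
    """Each agent picks its own argmax; no search over joint actions is needed."""
    a1 = max(range(len(q1)), key=q1.__getitem__)
    a2 = max(range(len(q2)), key=q2.__getitem__)
    return a1, a2

def brute_force_joint_action(q1, q2):
    """Reference check: maximize Q_tot over every joint action."""
    joint_actions = product(range(len(q1)), range(len(q2)))
    return max(joint_actions, key=lambda a: vdn_joint_q(q1, q2, a[0], a[1]))

# Hypothetical per-agent Q-values for one state, indexed by a made-up
# action set: 0 = decrease dose, 1 = hold, 2 = increase dose.
q_propofol = [0.2, 0.9, 0.4]      # Agent I (illustrative numbers only)
q_remifentanil = [0.7, 0.1, 0.3]  # Agent II (illustrative numbers only)

decentralized = greedy_joint_action(q_propofol, q_remifentanil)
centralized = brute_force_joint_action(q_propofol, q_remifentanil)
print(decentralized, centralized)
```

Under the additive assumption, the per-agent greedy choices coincide with the joint maximizer, which is why each agent can act on its own value function while training uses a shared team reward.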

Original language: English
Journal: IEEE Journal of Biomedical and Health Informatics
Publication status: Accepted/In press - 2025

Keywords

  • multi-agent deep reinforcement learning
  • multiple anesthesia states
  • personalized anesthesia
  • value function decomposition
