TY - JOUR
T1 - MADAT
T2 - Missing-aware dynamic adaptive transformer model for medical prognosis prediction with incomplete multimodal data
AU - He, Jianbin
AU - Huang, Guoheng
AU - Yuan, Xiaochen
AU - Pun, Chi Man
AU - Zhong, Guo
AU - Yang, Qi
AU - Guo, Ling
AU - Zhu, Siyu
AU - Lei, Baiying
AU - Li, Haojiang
N1 - Publisher Copyright:
Copyright © 2026 Elsevier B.V. All rights reserved.
PY - 2026/5/1
Y1 - 2026/5/1
N2 - Multimodal medical prognosis prediction has shown great potential in improving diagnostic accuracy by integrating various data types. However, incomplete multimodality, where certain modalities are missing, poses significant challenges to model performance. Current methods, including dynamic adaptation and modality completion, have limitations in handling incomplete multimodality comprehensively. Dynamic adaptation methods fail to fully utilize modality interactions as they only process available modalities. Modality completion methods address inter-modal relationships but risk generating unreliable data, especially when key modalities are missing, since existing modalities cannot replicate unique features of absent ones. This compromises fusion quality and degrades model performance. To address these challenges, we propose the Missing-aware Dynamic Adaptive Transformer (MADAT) model, which integrates two phases: the Decoupling Generalization Completion Phase (DGCP) and the Adaptive Cross-Fusion Phase (ACFP). The DGCP reconstructs missing modalities by generating inter-modal and intra-modal shared information using Progressive Transformation Recursive Gated Convolutions (PTRGC) and Wavelet Alignment Domain Generalization (WADG). The ACFP, which incorporates Cross-Agent Attention (CAA) and Generation Quality Feedback Regulation (GQFR), adaptively fuses the original and generated modality features. CAA ensures thorough integration and alignment of the features, while GQFR dynamically adjusts the model's reliance on the generated features based on their quality, preventing over-dependence on low-quality data. Experiments on three private nasopharyngeal carcinoma datasets demonstrate that MADAT outperforms existing methods, achieving superior robustness in medical multimodal prediction under conditions of incomplete multimodality.
AB - Multimodal medical prognosis prediction has shown great potential in improving diagnostic accuracy by integrating various data types. However, incomplete multimodality, where certain modalities are missing, poses significant challenges to model performance. Current methods, including dynamic adaptation and modality completion, have limitations in handling incomplete multimodality comprehensively. Dynamic adaptation methods fail to fully utilize modality interactions as they only process available modalities. Modality completion methods address inter-modal relationships but risk generating unreliable data, especially when key modalities are missing, since existing modalities cannot replicate unique features of absent ones. This compromises fusion quality and degrades model performance. To address these challenges, we propose the Missing-aware Dynamic Adaptive Transformer (MADAT) model, which integrates two phases: the Decoupling Generalization Completion Phase (DGCP) and the Adaptive Cross-Fusion Phase (ACFP). The DGCP reconstructs missing modalities by generating inter-modal and intra-modal shared information using Progressive Transformation Recursive Gated Convolutions (PTRGC) and Wavelet Alignment Domain Generalization (WADG). The ACFP, which incorporates Cross-Agent Attention (CAA) and Generation Quality Feedback Regulation (GQFR), adaptively fuses the original and generated modality features. CAA ensures thorough integration and alignment of the features, while GQFR dynamically adjusts the model's reliance on the generated features based on their quality, preventing over-dependence on low-quality data. Experiments on three private nasopharyngeal carcinoma datasets demonstrate that MADAT outperforms existing methods, achieving superior robustness in medical multimodal prediction under conditions of incomplete multimodality.
KW - Dynamic adaptation
KW - Medical prognosis prediction
KW - Missing modality
KW - Modality completion
KW - Multimodal data
UR - https://www.scopus.com/pages/publications/105033861784
U2 - 10.1016/j.media.2026.103958
DO - 10.1016/j.media.2026.103958
M3 - Article
C2 - 41621207
AN - SCOPUS:105033861784
SN - 1361-8415
VL - 110
SP - 103958
JO - Medical Image Analysis
JF - Medical Image Analysis
ER -