C2MAOT: Cross-modal Complementary Masked Autoencoder with Optimal Transport for Cancer Segmentation in PET-CT Images

Jiaju Huang, Shaobin Chen, Xinglong Liang, Xiao Yang, Zhuoneng Zhang, Yue Sun, Ying Wang, Tao Tan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Accurate cancer segmentation in PET-CT images is crucial for oncology, yet remains challenging due to lesion diversity, data scarcity, and modality heterogeneity. Existing methods often struggle to effectively fuse cross-modal information and leverage self-supervised learning for improved representation. In this paper, we introduce C2MAOT, a Cross-modal Complementary Masked Autoencoder with Optimal Transport framework for PET-CT cancer segmentation. Our method employs a novel modality-complementary masking strategy during pre-training to explicitly encourage cross-modal learning between PET and CT encoders. Furthermore, we integrate an optimal transport loss to guide the alignment of feature distributions across modalities, facilitating robust multi-modal fusion. Experimental results on two datasets demonstrate that C2MAOT outperforms existing state-of-the-art methods, achieving significant improvements in segmentation accuracy across five cancer types. These results establish our proposed method as an effective approach for tumor segmentation in PET-CT imaging. Our code is available at https://github.com/hjj194/c2maot.
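The two ideas named in the abstract, modality-complementary masking and an optimal transport alignment loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function names `complementary_masks` and `sinkhorn_cost`, the patch-level boolean masks, the squared Euclidean cost, the uniform marginals, and the use of entropic Sinkhorn iterations as the OT solver are all assumptions made for illustration.

```python
import numpy as np


def complementary_masks(num_patches, mask_ratio=0.5, rng=None):
    """Hypothetical helper: build PET/CT patch masks that are exact complements.

    True means "hidden from that modality's encoder". Every patch hidden in
    PET is visible in CT and vice versa, so reconstructing a hidden patch
    requires information from the other modality.
    """
    rng = np.random.default_rng() if rng is None else rng
    perm = rng.permutation(num_patches)
    num_masked = int(round(num_patches * mask_ratio))
    pet_mask = np.zeros(num_patches, dtype=bool)
    pet_mask[perm[:num_masked]] = True
    ct_mask = ~pet_mask
    return pet_mask, ct_mask


def sinkhorn_cost(x, y, eps=0.1, n_iters=200):
    """Hypothetical helper: entropic OT cost between feature sets x (n, d) and y (m, d).

    Uses uniform marginals and a squared Euclidean cost rescaled to [0, 1].
    Returns the transport cost and the transport plan.
    """
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    cost = cost / (cost.max() + 1e-12)      # rescale for numerical stability
    a = np.full(x.shape[0], 1.0 / x.shape[0])
    b = np.full(y.shape[0], 1.0 / y.shape[0])
    kernel = np.exp(-cost / eps)
    u = np.ones_like(a)
    for _ in range(n_iters):                # alternating marginal projections
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)
    plan = u[:, None] * kernel * v[None, :]
    return float((plan * cost).sum()), plan


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pet_mask, ct_mask = complementary_masks(16, mask_ratio=0.5, rng=rng)
    pet_feats = rng.normal(size=(8, 4))     # stand-ins for PET encoder tokens
    ct_feats = rng.normal(size=(8, 4))      # stand-ins for CT encoder tokens
    ot_cost, _ = sinkhorn_cost(pet_feats, ct_feats)
    print(pet_mask.sum(), ct_mask.sum(), ot_cost)
```

In a training loop, a term like the transport cost above would typically be added to the masked-reconstruction loss with a weighting factor. Whether the paper does this, and with which weighting, is not stated in the abstract.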

Original language: English
Title of host publication: Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - 28th International Conference, 2025, Proceedings
Editors: James C. Gee, Jaesung Hong, Carole H. Sudre, Polina Golland, Daniel C. Alexander, Juan Eugenio Iglesias, Archana Venkataraman, Jong Hyo Kim
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 87-97
Number of pages: 11
ISBN (Print): 9783032049261
DOIs
Publication status: Published - 2026
Event: 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - Daejeon, Korea, Republic of
Duration: 23 Sept 2025 - 27 Sept 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 15960 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
Country/Territory: Korea, Republic of
City: Daejeon
Period: 23/09/25 - 27/09/25

Keywords

  • Cross-modal Fusion
  • PET-CT Segmentation
  • Self-supervised Learning
