摘要
Accurate segmentation of breast cancer in PET-CT images is crucial for precise staging, monitoring treatment response, and guiding personalized therapy. However, the small size and dispersed nature of metastatic lesions, coupled with the scarcity of annotated data and heterogeneity between modalities that hinders effective information fusion, make this task challenging. This paper proposes a novel anatomy-guided cross-modal learning framework to address these issues. Our approach first generates organ pseudo-labels through a teacher-student learning paradigm, which serve as anatomical prompts to guide cancer segmentation. We then introduce a self-aligning cross-modal pre-training method that aligns PET and CT features in a shared latent space through masked 3D patch reconstruction, enabling effective cross-modal feature fusion. Finally, we initialize the segmentation network’s encoder with the pre-trained encoder weights, and incorporate organ labels through a Mamba-based prompt encoder and Hypernet-Controlled Cross-Attention mechanism for dynamic anatomical feature extraction and fusion. Notably, our method outperforms eight state-of-the-art methods, including CNN-based, transformer-based, and Mamba-based approaches, on two datasets encompassing primary breast cancer, metastatic breast cancer, and other types of cancer segmentation tasks.
| 原文 | English |
|---|---|
| 文章編號 | 103956 |
| 期刊 | Medical Image Analysis |
| 卷 | 110 |
| DOIs | |
| 出版狀態 | Published - 5月 2026 |
UN SDG
此研究成果有助於以下永續發展目標
-
Good health and well being
指紋
深入研究「Anatomy-guided prompting with cross-modal self-alignment for whole-body PET-CT breast cancer segmentation」主題。共同形成了獨特的指紋。引用此
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver