TY - JOUR
T1 - All-in-one medical image-to-image translation
AU - Han, Luyi
AU - Tan, Tao
AU - Huang, Yunzhi
AU - Dou, Haoran
AU - Zhang, Tianyu
AU - Gao, Yuan
AU - Wang, Xin
AU - Lu, Chunyao
AU - Liang, Xinglong
AU - Sun, Yue
AU - Teuwen, Jonas
AU - Zhou, S. Kevin
AU - Mann, Ritse
N1 - Publisher Copyright:
© 2025 The Author(s)
PY - 2025/8/18
Y1 - 2025/8/18
AB - The growing availability of public multi-domain medical image datasets enables training omnipotent image-to-image (I2I) translation models. However, integrating diverse protocols poses challenges in domain encoding and scalability. Therefore, we propose the “every domain all at once” I2I (EVA-I2I) translation model using DICOM-tag-informed contrastive language-image pre-training (DCLIP). DCLIP maps natural language scan descriptions into a common latent space, offering richer representations than traditional one-hot encoding. We develop the model using seven public datasets with 27,950 scans (3D volumes) for the brain, breast, abdomen, and pelvis. Experimental results show that our EVA-I2I can synthesize every seen domain at once with a single training session and achieve excellent image quality on different I2I translation tasks. Results for downstream applications (e.g., registration, classification, and segmentation) demonstrate that EVA-I2I can be directly applied to domain adaptation on external datasets without fine-tuning and that it also enables the potential for zero-shot domain adaptation for never-before-seen domains.
KW - CP: Computational biology
KW - CP: Imaging
KW - contrastive language-image pre-training
KW - image-to-image translation
KW - multi-domain medical image
KW - representation learning
KW - zero-shot domain adaptation
UR - https://www.scopus.com/pages/publications/105012984031
U2 - 10.1016/j.crmeth.2025.101138
DO - 10.1016/j.crmeth.2025.101138
M3 - Article
AN - SCOPUS:105012984031
SN - 2667-2375
VL - 5
JO - Cell Reports Methods
JF - Cell Reports Methods
IS - 8
M1 - 101138
ER -