TY - JOUR
T1 - Parallel Multimodal Language Model
T2 - Enhanced Breast Nodule Diagnosis through Parallel Multimodal Representations and Large Language Models
AU - Zheng, Dashun
AU - Pang, Patrick Cheong Iao
AU - He, Ping
AU - Sun, Yue
AU - Yan, Hongju
AU - Xiong, Xiangyu
AU - Cui, Ligang
AU - Tan, Tao
AU - Bao, Lingyun
N1 - Publisher Copyright:
© 2025 The Author(s). Advanced Intelligent Systems published by Wiley-VCH GmbH.
PY - 2025
Y1 - 2025
N2 - Large language models (LLMs) have emerged in medical image analysis and can provide accurate and personalized medical services for doctors and patients. However, by relying solely on textual information and ignoring other modalities such as images, LLMs fail to achieve high accuracy in the early diagnosis of breast cancer and thus have not yet been seamlessly integrated into the clinical practice of breast cancer diagnosis. Therefore, this study proposes the Parallel Multimodal Language Model (PMLM), which combines images and text, integrating visual information with the semantic information in text for early screening and diagnosis of breast cancer, and improves the accuracy of early screening and diagnosis while also enhancing health system access. In addition, existing multimodal diagnostic methods are evaluated. The final experimental results reveal that the PMLM achieves an F1 score of 0.87 [95% CI: 0.85–0.89] and an area under the curve (AUC) of 0.90 [95% CI: 0.89–0.92] in the early diagnosis of breast cancer, both of which exceed those of the existing baseline model.
AB - Large language models (LLMs) have emerged in medical image analysis and can provide accurate and personalized medical services for doctors and patients. However, by relying solely on textual information and ignoring other modalities such as images, LLMs fail to achieve high accuracy in the early diagnosis of breast cancer and thus have not yet been seamlessly integrated into the clinical practice of breast cancer diagnosis. Therefore, this study proposes the Parallel Multimodal Language Model (PMLM), which combines images and text, integrating visual information with the semantic information in text for early screening and diagnosis of breast cancer, and improves the accuracy of early screening and diagnosis while also enhancing health system access. In addition, existing multimodal diagnostic methods are evaluated. The final experimental results reveal that the PMLM achieves an F1 score of 0.87 [95% CI: 0.85–0.89] and an area under the curve (AUC) of 0.90 [95% CI: 0.89–0.92] in the early diagnosis of breast cancer, both of which exceed those of the existing baseline model.
KW - breast cancer
KW - large language models
KW - medical diagnosis
KW - multimodal learning
KW - ultrasound report
UR - https://www.scopus.com/pages/publications/105011159724
U2 - 10.1002/aisy.202500085
DO - 10.1002/aisy.202500085
M3 - Article
AN - SCOPUS:105011159724
SN - 2640-4567
JO - Advanced Intelligent Systems
JF - Advanced Intelligent Systems
ER -