Parallel Multimodal Language Model: Enhanced Breast Nodule Diagnosis through Parallel Multimodal Representations and Large Language Models

Dashun Zheng, Patrick Cheong-Iao Pang, Ping He, Yue Sun, Hongju Yan, Xiangyu Xiong, Ligang Cui, Tao Tan, Lingyun Bao

Research output: Contribution to journal › Article › peer-review

Abstract

Large language models (LLMs) have emerged in medical image analysis and can provide accurate, personalized medical services to doctors and patients. However, by relying solely on textual information and ignoring other modalities such as images, LLMs fail to achieve high accuracy in the early diagnosis of breast cancer and have therefore not yet been seamlessly integrated into the clinical practice of breast cancer diagnosis. This study therefore proposes the Parallel Multimodal Language Model (PMLM), which combines images and text, integrating visual and semantic information for the early screening and diagnosis of breast cancer, improving diagnostic accuracy while also enhancing access to health services. In addition, existing multimodal diagnostic methods are evaluated. The experimental results reveal that PMLM achieves an F1 score of 0.87 [95% CI: 0.85–0.89] and an Area Under the Curve (AUC) of 0.90 [95% CI: 0.89–0.92] in the early diagnosis of breast cancer, both of which exceed those of existing baseline models.
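The abstract does not specify how PMLM's parallel representations are built, so the following is only a minimal illustrative sketch of the general idea of fusing parallel image and text representations for breast nodule classification, not the paper's actual architecture. All module names, dimensions, and the simple CNN/embedding encoders below are assumptions; in practice the branches would be a vision backbone and an LLM-based text encoder.

```python
# Minimal sketch of parallel multimodal fusion for nodule classification.
# NOT the PMLM architecture from the paper (not specified in the abstract);
# every encoder, dimension, and name here is an illustrative assumption.
import torch
import torch.nn as nn

class ParallelMultimodalFusion(nn.Module):
    """Encodes an ultrasound image and a report text in parallel,
    then concatenates both representations for a downstream classifier."""

    def __init__(self, vocab_size=10_000, embed_dim=128, hidden_dim=256, num_classes=2):
        super().__init__()
        # Image branch: a tiny CNN stands in for any vision backbone.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, hidden_dim),
        )
        # Text branch: token embeddings with mean pooling stand in for an LLM encoder.
        self.text_embed = nn.Embedding(vocab_size, embed_dim)
        self.text_proj = nn.Linear(embed_dim, hidden_dim)
        # Fusion head over the two parallel representations placed side by side.
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, image, token_ids):
        img_repr = self.image_encoder(image)                                # (B, hidden_dim)
        txt_repr = self.text_proj(self.text_embed(token_ids).mean(dim=1))   # (B, hidden_dim)
        fused = torch.cat([img_repr, txt_repr], dim=-1)                     # parallel fusion
        return self.classifier(fused)                                       # class logits

# Usage with dummy data: one grayscale ultrasound image and a tokenized report.
model = ParallelMultimodalFusion()
image = torch.randn(1, 1, 224, 224)
token_ids = torch.randint(0, 10_000, (1, 64))
logits = model(image, token_ids)
print(logits.shape)  # torch.Size([1, 2])
```

Late concatenation is only one of several plausible fusion strategies; cross-attention between the branches or injecting visual tokens into the LLM's input sequence would be equally consistent with the abstract's description.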

Original language: English
Journal: Advanced Intelligent Systems
DOIs:
Publication status: Accepted/In press - 2025

Keywords

  • breast cancer
  • large language models
  • medical diagnosis
  • multimodal learning
  • ultrasound report
