Parallel Multimodal Language Model: Enhanced Breast Nodule Diagnosis through Parallel Multimodal Representations and Large Language Models

Research output: Contribution to journal › Article › peer-review

Abstract

Large language models (LLMs) have emerged in medical image analysis and can provide accurate, personalized medical services for doctors and patients. However, because they rely on textual information alone and ignore other modalities such as images, LLMs fail to achieve high accuracy in the early diagnosis of breast cancer and have not yet been seamlessly integrated into clinical practice for breast cancer diagnosis. This study therefore proposes the Parallel Multimodal Language Model (PMLM), which combines images and text, integrating visual information with the semantic information in text for early screening and diagnosis of breast cancer. The aim is to improve the accuracy of early screening and diagnosis while also enhancing health system access. In addition, existing multimodal diagnostic methods are evaluated. The experimental results show that the PMLM achieves an F1 score of 0.87 [95% CI: 0.85–0.89] and an area under the curve (AUC) of 0.90 [95% CI: 0.89–0.92] in the early diagnosis of breast cancer, both of which exceed those of the existing baseline model.

Original language: English
Article number: 2500085
Journal: Advanced Intelligent Systems
Volume: 8
Issue number: 1
Publication status: Published - Jan 2026

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • breast cancer
  • large language models
  • medical diagnosis
  • multimodal learning
  • ultrasound report
