Contrastive Knowledge-Guided Large Language Models for Medical Report Generation

  • Yuyang Sha
  • Hongxin Pan
  • Weiyu Meng
  • Kefeng Li

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Citation (Scopus)

Abstract

Automatic medical report generation (MRG) holds considerable research value and has the potential to significantly alleviate the workload of radiologists. Recently, the rapid development of large language models (LLMs) has improved the performance of MRG. However, numerous challenges still need to be addressed to achieve highly accurate medical reports. For instance, most existing methods struggle to interpret image details, lack relevant medical knowledge, and overlook fine-grained cross-modality alignment. To overcome these limitations, we propose a knowledge-guided vision-language alignment framework with contrastive learning and LLMs for medical report generation. The proposed method leverages visual representations, relevant medical knowledge, and enhanced features to generate accurate reports via an LLM-based decoder. To improve the integration of medical information, we introduce the Knowledge Injection Module, which enhances the model’s feature representation capabilities while unlocking medical domain knowledge in LLMs. Inspired by contrastive learning, we introduce the Contrastive Alignment Module to align visual features and textual information effectively. Additionally, the Cross-Modality Enhancement Module retrieves similar reports for the input images to boost diagnostic accuracy. We conduct extensive experiments on two popular benchmark datasets, IU X-Ray and MIMIC-CXR. The results demonstrate that our proposed method achieves promising performance compared with state-of-the-art frameworks.
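
The abstract names contrastive image-text alignment and retrieval of similar reports without giving formulas. As a rough illustration only, the sketch below shows one common way such ideas are realized: a symmetric InfoNCE-style loss over paired embeddings, and a cosine-similarity lookup. It is not the authors' implementation; the function names, the temperature value of 0.07, and the toy two-dimensional embeddings are all assumptions made for this example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def contrastive_alignment_loss(image_feats, text_feats, temperature=0.07):
    """Symmetric InfoNCE-style loss (hypothetical helper, not from the paper).

    image_feats[i] and text_feats[i] are assumed to describe the same study;
    every other pairing in the batch is treated as a negative.
    """
    imgs = [l2_normalize(v) for v in image_feats]
    txts = [l2_normalize(v) for v in text_feats]
    # Cosine similarities scaled by the temperature: rows = images, cols = texts.
    logits = [[dot(i, t) / temperature for t in txts] for i in imgs]
    logits_t = [list(col) for col in zip(*logits)]  # rows = texts, cols = images
    image_to_text = cross_entropy_diagonal(logits)
    text_to_image = cross_entropy_diagonal(logits_t)
    return 0.5 * (image_to_text + text_to_image)


def retrieve_similar(query_feat, report_feats, k=1):
    """Indices of the k stored report embeddings most similar to the query."""
    q = l2_normalize(query_feat)
    scores = [dot(q, l2_normalize(r)) for r in report_feats]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy embeddings: matched pairs should give a lower loss than swapped pairs.
    matched = contrastive_alignment_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])
    swapped = contrastive_alignment_loss([[1, 0], [0, 1]], [[0, 1], [1, 0]])
    print(matched < swapped)
    print(retrieve_similar([1, 0.1], [[0, 1], [1, 0], [-1, 0]], k=2))
```

In this sketch a lower loss means that matched image and report embeddings sit closer together than mismatched ones, which is the general goal of contrastive alignment; the modules described in the abstract may differ substantially in architecture and objective.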

Original language: English
Title of host publication: Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - 28th International Conference, Proceedings
Editors: James C. Gee, Jaesung Hong, Carole H. Sudre, Polina Golland, Daniel C. Alexander, Juan Eugenio Iglesias, Archana Venkataraman, Jong Hyo Kim
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 111-120
Number of pages: 10
ISBN (Print): 9783032049773
DOIs
Publication status: Published - 2026
Event: 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - Daejeon, Korea, Republic of
Duration: 23 Sept 2025 → 27 Sept 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 15965 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
Country/Territory: Korea, Republic of
City: Daejeon
Period: 23/09/25 → 27/09/25

Keywords

  • Contrastive Learning
  • Cross-Modality Alignment
  • Knowledge Graph
  • Large Language Models
  • Medical Report Generation
