
Bootstrapping BI-RADS classification using large language models and transformers in breast magnetic resonance imaging reports

  • Yuxin Liu
  • Xiang Zhang
  • Weiwei Cao
  • Wenju Cui
  • Tao Tan
  • Yuqin Peng
  • Jiayi Huang
  • Zhen Lei
  • Jun Shen
  • Jian Zheng
  • University of Science and Technology of China
  • CAS - Suzhou Institute of Biomedical Engineering and Technology
  • Sun Yat-Sen University
  • Shandong University
  • CAS - Institute of Automation

Research output: Article, peer-reviewed

4 citations (Scopus)

Abstract

Breast cancer is one of the most common malignancies among women globally. Magnetic resonance imaging (MRI), as the final non-invasive diagnostic tool before biopsy, provides detailed free-text reports that support clinical decision-making. Effective use of the information in MRI reports is therefore crucial for making reliable decisions in patient care. This study proposes a novel method for BI-RADS classification using breast MRI reports. Large language models are employed to transform free-text reports into structured reports. Specifically, missing category information (MCI), that is, categories not mentioned in the free-text reports, is supplemented by assigning default values to those categories in the structured reports. To ensure data privacy, a locally deployed Qwen-Chat model is employed. Furthermore, to enhance domain-specific adaptability, a knowledge-driven prompt is designed, and the Qwen-7B-Chat model is fine-tuned specifically for structuring breast MRI reports. To prevent information loss and enable comprehensive learning of all report details, a fusion strategy is introduced that combines free-text and structured reports to train the classification model. Experimental results show that the proposed BI-RADS classification method outperforms existing report classification methods across multiple evaluation metrics. An external test set from a different hospital is also used to validate the robustness of the proposed approach. The proposed structuring method outperforms GPT-4o. Ablation experiments confirm that the knowledge-driven prompt, MCI, and the fusion strategy are crucial to the model’s performance.
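
The two data-preparation ideas in the abstract, filling missing categories with default values and fusing the free-text report with its structured counterpart, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the category names, the default values, the `[SEP]` separator, and the helper names `fill_missing_categories` and `fuse_report` do not come from the article, and plain concatenation is only one plausible reading of the fusion strategy.

```python
from typing import Dict, Optional

# Hypothetical category schema with default values. The abstract does not
# list the real categories or defaults used by the authors.
DEFAULTS: Dict[str, str] = {
    "mass": "none",
    "non_mass_enhancement": "none",
    "lymph_nodes": "normal",
}


def fill_missing_categories(
    structured: Dict[str, Optional[str]],
    defaults: Dict[str, str] = DEFAULTS,
) -> Dict[str, str]:
    """Return a copy of `structured` in which absent or empty categories
    take their default value (the MCI idea, under the assumed schema)."""
    filled = dict(defaults)
    for key, value in structured.items():
        # Only overwrite a default when the report actually said something.
        if value is not None and str(value).strip():
            filled[key] = str(value).strip()
    return filled


def fuse_report(free_text: str, structured: Dict[str, str], sep: str = " [SEP] ") -> str:
    """Concatenate the free-text report and a flattened structured report
    into one classifier input (an assumed, simple form of fusion)."""
    flattened = "; ".join(f"{key}: {value}" for key, value in structured.items())
    return free_text.strip() + sep + flattened


if __name__ == "__main__":
    # The LLM-produced structured report here omits two of the three categories.
    raw = {"mass": "irregular, 2.1 cm", "lymph_nodes": ""}
    complete = fill_missing_categories(raw)
    print(fuse_report("Irregular mass in left breast.", complete))
```

The fused string would then be tokenized and passed to a transformer classifier; that step is omitted because the abstract does not name the classification model.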

Original language: English
Article number: 8
Journal: Visual Computing for Industry, Biomedicine, and Art
Volume: 8
Issue number: 1
DOIs
Publication status: Published - Dec 2025

UN SDG

This research output contributes to the following Sustainable Development Goals:

  1. Good health and well being
