ClinCoCoOp: An Interpretable Prompt Learning Framework with Clinical Concept Guidance for Context Optimization

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Large Vision-Language Models (VLMs) demonstrate significant potential in representation learning and exhibit strong performance across diverse downstream tasks. Soft prompt learning has emerged as an effective technique for adapting VLMs such as CLIP to image classification. However, prevailing prompt learning methods typically generate non-interpretable text tokens, failing to satisfy the stringent interpretability requirements of eXplainable AI (XAI) in high-risk domains such as healthcare. To address this limitation, we introduce a novel interpretable prompt learning framework. Our approach enhances interpretability by incorporating clinical concepts and aligns image semantics with learnable prompts at multiple granularities. Departing from existing methods that apply uniform clinical concept weights across all prompts, we propose two key modules: (1) a Soft-Prompt Clinical Concept Alignment module, which computes image-concept similarity scores to weight clinical concepts before aligning them with the soft prompt (a set of learnable vectors), and (2) a Global-Local Image Soft-Prompt Alignment module, which processes local image regions by incorporating positional encodings and calculating significance weights, complementing the global alignment. Extensive experiments on three medical image datasets (Derm7pt, ISIC2018, Pneumonia) demonstrate the superior classification performance of our method, named Clinical Concept CoOp (ClinCoCoOp). Notably, ClinCoCoOp also achieves outstanding zero-shot transfer results on the MED-NODE and ISIC2019 datasets.
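
The two modules described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract does not specify the similarity function, the weighting scheme, or the loss, so cosine similarity, softmax weighting, the temperature value, the additive positional encoding, the `mix` coefficient, and all function names below are assumptions chosen only to make the idea concrete.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    # Cosine similarity; assumes neither vector is all zeros.
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def softmax(scores, temperature=1.0):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def concept_weights(image_emb, concept_embs, temperature=0.1):
    # Module (1), assumed form: per-image concept weights from
    # image-concept similarity instead of one uniform weight per concept.
    sims = [cosine(image_emb, c) for c in concept_embs]
    return softmax(sims, temperature)

def prompt_concept_alignment(prompt_emb, image_emb, concept_embs):
    # Weight the concept embeddings, then measure how far the soft prompt
    # is from the weighted concept vector (1 - cosine is an assumed loss).
    weights = concept_weights(image_emb, concept_embs)
    dim = len(concept_embs[0])
    target = [sum(w * c[i] for w, c in zip(weights, concept_embs))
              for i in range(dim)]
    return 1.0 - cosine(prompt_emb, target), weights

def global_local_score(prompt_emb, global_emb, patch_embs, pos_encs, mix=0.5):
    # Module (2), assumed form: add a positional encoding to each local
    # region, derive significance weights from region-prompt similarity,
    # and blend the weighted local score with the global score.
    patches = [[p + e for p, e in zip(patch, pos)]
               for patch, pos in zip(patch_embs, pos_encs)]
    sims = [cosine(prompt_emb, p) for p in patches]
    significance = softmax(sims)
    local = sum(w * s for w, s in zip(significance, sims))
    return mix * cosine(prompt_emb, global_emb) + (1.0 - mix) * local

# Toy 3-dimensional embeddings (hypothetical values, not real CLIP features).
image_emb = [1.0, 0.0, 0.0]
concept_embs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
prompt_emb = [0.5, 0.5, 0.0]
patch_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
pos_encs = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.0]]

loss, weights = prompt_concept_alignment(prompt_emb, image_emb, concept_embs)
score = global_local_score(prompt_emb, image_emb, patch_embs, pos_encs)
```

In this toy setup the concept closest to the image embedding receives the largest weight, so it dominates the vector that the soft prompt is aligned with; in a real pipeline the embeddings would come from the VLM's image and text encoders.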

Original language: English
Title of host publication: Pattern Recognition and Computer Vision - 8th Chinese Conference, PRCV 2025, Proceedings
Editors: Josef Kittler, Hongkai Xiong, Jian Yang, Xilin Chen, Jiwen Lu, Weiyao Lin, Jingyi Yu, Weishi Zheng
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 106-119
Number of pages: 14
ISBN (Print): 9789819556786
DOIs
Publication status: Published - 2026
Event: 8th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2025 - Shanghai, China
Duration: 15 Oct 2025 to 18 Oct 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16277 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2025
Country/Territory: China
City: Shanghai
Period: 15/10/25 to 18/10/25

Keywords

  • Interpretability
  • LLM
  • Prompt Learning
  • VLM
  • Zero shot
