Multi-scale feature fusion for cross-modality person re-identification: the MSJLNet approach

  • Zhixin Tie
  • Haobiao Fan
  • Lingbing Tao
  • Yanbing Chen
  • Hao Sheng
  • Wei Ke
Research output: Contribution to journal › Article › peer-review

Abstract

Visible-Infrared person re-identification (VI-ReID) faces significant challenges due to discrepancies between visible and infrared images. Traditional two-stream networks often struggle to preserve semantic guidance from data augmentation as network depth increases. To address this, we propose the Multi-Scale Joint Learning Network (MSJLNet), which employs a novel four-stream architecture to segregate data-augmented branches from original branches, focusing on extracting robust and color-agnostic modal features. An Information Purification Module (IPM) with a channel attention mechanism is designed to dynamically filter noise and suppress redundant color information in the augmented branches. Furthermore, a Joint Semantic Learning Module (JSLM) effectively fuses global detail features with color-agnostic features, improving the model’s discriminative ability. Extensive experiments on the SYSU-MM01 and RegDB datasets demonstrate MSJLNet’s superior performance, achieving 79.94% Rank-1 accuracy and 74.96% mAP on SYSU-MM01, and 93.14% Rank-1 accuracy and 87.22% mAP on RegDB. The proposed approach offers new insights for enhancing cross-modality feature learning. Code is available at https://github.com/1849714926/MSJLNet.
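The abstract states that the Information Purification Module (IPM) uses a channel attention mechanism to suppress redundant color information in the augmented branches. The paper's exact design is not given here, so the following is only a minimal NumPy sketch of a generic squeeze-and-excitation-style channel attention block, assuming global average pooling followed by a bottleneck MLP and sigmoid gating; the function name, weights, and reduction ratio are illustrative, not the authors' implementation.

```python
import numpy as np

def channel_attention(features, reduction=4, seed=0):
    """Generic SE-style channel attention sketch (not the paper's IPM).

    features: array of shape (C, H, W).
    Returns the input with each channel rescaled by a learned gate in (0, 1),
    so low-gate channels (e.g. color-dominated ones) are suppressed.
    """
    C, H, W = features.shape
    # Squeeze: global average pooling per channel -> vector of shape (C,)
    z = features.mean(axis=(1, 2))
    # Excitation: two-layer bottleneck MLP; random weights stand in for
    # learned parameters purely for illustration.
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((C // reduction, C)) * 0.1
    W2 = rng.standard_normal((C, C // reduction)) * 0.1
    hidden = np.maximum(W1 @ z, 0.0)                 # ReLU
    gates = 1.0 / (1.0 + np.exp(-(W2 @ hidden)))     # sigmoid gates in (0, 1)
    # Rescale: broadcast the per-channel gate over the spatial dimensions
    return features * gates[:, None, None]

x = np.random.default_rng(1).standard_normal((8, 4, 4))
y = channel_attention(x)
```

In a trained network the bottleneck weights would be learned end-to-end, so the gates come to reflect which channels carry modality-shared structure versus modality-specific (e.g. color) noise.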

Original language: English
Article number: 146
Journal: Visual Computer
Volume: 42
Issue number: 2
Publication status: Published - Jan 2026

Keywords

  • Channel Augmentation
  • Cross-Modality
  • Feature Alignment
  • Person ReID
