Dual-Space Aggregation Learning and Random Erasure for Visible Infrared Person Re-Identification

Yongheng Qian, Xu Yang, Su Kit Tang

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Visible infrared person re-identification (VI Re-ID) is of particular importance for intelligent safeguard systems; it aims to retrieve the same pedestrian across non-overlapping visible and infrared cameras. The task is extremely challenging because of significant modality differences, high sample noise, occlusions, and other factors. To address these issues, we explore a dual-space aggregation learning (DSAL) method that integrates instance-batch normalization (IBN) and residual shrinkage (RS) into a baseline model for channel-level feature learning and compression. Random erasing (RE) data augmentation is applied to preprocess the data. Experiments on two datasets demonstrate that: 1) IBN reduces appearance differences in shallow layers and can bridge the gap between heterogeneous modalities; 2) the RS adaptive soft threshold sets zero-domain features to zero to eliminate noise and clutter, thereby enhancing the network's robustness to noise; 3) RE data augmentation significantly improves the model's generalization ability. In particular, DSAL can be seamlessly embedded into other CNN frameworks as a bottleneck variant without additional computational cost. Compared with the strong baseline, Rank-1, mAP, and mINP improve significantly on SYSU-MM01 by 10.66%, 7.78%, and 5.91%, respectively, and on RegDB by 16.40%, 13.83%, and 19.07%, respectively.
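The two generic operations named in the abstract, soft thresholding and random erasing, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed `scale` factor (in a residual shrinkage network the threshold is produced by a learned sub-network), and the erasing parameters `p`, `area_range`, and `aspect_range` are all assumptions chosen for illustration, and the code works on plain Python lists rather than tensors.

```python
import random


def soft_threshold(x, tau):
    """Shrink x toward zero by tau; any value with |x| <= tau becomes 0."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def shrink_channel(features, scale=0.5):
    """Soft-threshold one channel with tau = scale * mean(|x|).

    Assumption: `scale` is a fixed constant here; a residual shrinkage
    block would learn it per channel instead.
    """
    tau = scale * sum(abs(v) for v in features) / len(features)
    return [soft_threshold(v, tau) for v in features]


def random_erase(image, p=0.5, area_range=(0.02, 0.4),
                 aspect_range=(0.3, 3.3), rng=None):
    """Return a copy of a 2-D image (list of rows) with one random
    rectangle overwritten by random values, applied with probability p.

    All parameter defaults are illustrative assumptions.
    """
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]  # never modify the caller's image
    if rng.random() >= p:
        return out
    for _ in range(100):  # retry until the rectangle fits inside the image
        area = rng.uniform(*area_range) * h * w
        aspect = rng.uniform(*aspect_range)
        eh = int(round((area * aspect) ** 0.5))
        ew = int(round((area / aspect) ** 0.5))
        if 0 < eh < h and 0 < ew < w:
            top = rng.randint(0, h - eh)
            left = rng.randint(0, w - ew)
            for r in range(top, top + eh):
                for c in range(left, left + ew):
                    out[r][c] = rng.random()
            return out
    return out
```

For example, `shrink_channel([2.0, -2.0, 0.5, -0.5])` uses a threshold of 0.625, so the two small-magnitude entries are set to zero while the large ones are shrunk toward zero by the same amount.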

Original language: English
Pages (from-to): 75440-75450
Number of pages: 11
Journal: IEEE Access
Volume: 11
Publication status: Published - 2023

Keywords

  • Cross-modality re-identification
  • instance-batch normalization
  • modality differences
  • noise
  • occlusion
  • residual shrinkage network
