TY - JOUR
T1 - Dual-Space Aggregation Learning and Random Erasure for Visible Infrared Person Re-Identification
AU - Qian, Yongheng
AU - Yang, Xu
AU - Tang, Su Kit
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2023
Y1 - 2023
AB - Visible infrared person re-identification (VI Re-ID) is of particular importance for intelligent safeguard systems. It aims to retrieve the same pedestrian across non-overlapping visible and infrared cameras. The VI Re-ID task is extremely challenging due to significant modality differences, high sample noise, and occlusions. To address these issues, we explore a dual-space aggregation learning (DSAL) method that integrates instance-batch normalization (IBN) and residual shrinkage (RS) into a baseline model for channel-level feature learning and compression. Random erasing (RE) data augmentation is applied when preprocessing the data. Experiments on two datasets demonstrate that: 1) IBN reduces appearance differences in shallow layers and can bridge the gap between heterogeneous modalities; 2) the RS adaptive soft threshold sets the zero-domain features to zero to eliminate noise and clutter information, thereby enhancing the robustness of the network to noise; 3) the RE data augmentation method significantly improves the model's generalization ability. In particular, DSAL can be seamlessly embedded into other CNN frameworks as a bottleneck variant without additional computation cost. Compared with the strong baseline, Rank-1, mAP, and mINP improve by 10.66%, 7.78%, and 5.91%, respectively, on SYSU-MM01, and by 16.40%, 13.83%, and 19.07%, respectively, on RegDB.
KW - Cross-modality re-identification
KW - instance-batch normalization
KW - modality differences
KW - noise
KW - occlusion
KW - residual shrinkage network
UR - http://www.scopus.com/inward/record.url?scp=85165318125&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3297891
DO - 10.1109/ACCESS.2023.3297891
M3 - Article
AN - SCOPUS:85165318125
SN - 2169-3536
VL - 11
SP - 75440
EP - 75450
JO - IEEE Access
JF - IEEE Access
ER -