TY - JOUR
T1 - HFSA: Heterogeneous Teacher-Student Networks with Frequency-Spatial Fusion and Axial Feature Learning for Industrial Anomaly Detection
AU - Chen, Yue
AU - Huang, Guoheng
AU - Liao, Xianglian
AU - Yuan, Xiaochen
AU - Ling, Bingo Wing Kuen
AU - Zeng, An
AU - Pun, Chi Man
AU - Li, Yan
N1 - Publisher Copyright: © 1975-2011 IEEE.
PY - 2025
Y1 - 2025
N2 - Knowledge distillation methods have demonstrated promising results in industrial systems. However, the high structural similarity and consistent data flow between teacher-student (T-S) networks may induce the student network to inadvertently mimic teacher outputs on anomalies. This eliminates activation differences during inference, depriving the student model of the basis to distinguish abnormal from normal samples. Furthermore, existing methods often struggle to establish long-term dependencies, which also leads to poor performance in detecting global shape anomalies. To address these issues, we propose HFSA, a Heterogeneous T-S model with Frequency-Spatial Fusion and Axial Feature Learning. The model adopts a pre-trained teacher encoder and a Frequency-Spatial Domain Fusion student Decoder (FSFD). We also design a Cross-Scale Attention Bottleneck (CSAB) module to optimize the efficiency of knowledge distillation. Acting as a bridge connecting the T-S models, the CSAB suppresses redundant signals and enhances key information extracted by the teacher, transmitting the processed features to the FSFD. The heterogeneous dual-stream decoder FSFD comprises a Multi-frequency Response Module (MFRM) and a Local Feature Enhancement Convolution (LFE). The MFRM integrates low-frequency shape semantics with high-frequency texture details via frequency-domain analysis to enhance global shape modeling, while the LFE refines local features to complement this process. Through their collaboration, the FSFD precisely localizes micro-scale texture defects and macro-scale assembly errors, addressing quality control demands in consumer electronics manufacturing. Comprehensive experiments on four industrial benchmarks demonstrate that HFSA sets a new performance benchmark for complex anomaly detection. More critically, it achieves this superior performance with a balanced computational overhead, establishing it as a practical deployment solution.
KW - Anomaly detection
KW - axial feature learning
KW - frequency-spatial fusion
KW - knowledge distillation
KW - unsupervised learning
UR - https://www.scopus.com/pages/publications/105018013631
U2 - 10.1109/TCE.2025.3615986
DO - 10.1109/TCE.2025.3615986
M3 - Article
AN - SCOPUS:105018013631
SN - 0098-3063
JO - IEEE Transactions on Consumer Electronics
JF - IEEE Transactions on Consumer Electronics
ER -