
Doubly Relaxed Knowledge Distillation for Deep Face Recognition

  • Si Zhou
  • Xiaochen Yuan
  • Guanghua Yang
  • Xinyuan Zhang
  • Zuobin Ying
  • Xueyuan Gong

Research output: Article › peer-reviewed

Abstract

In face recognition tasks, knowledge distillation tends to rely on feature-based methods because of their superior performance, leaving logits-based distillation underexplored. However, feature-based methods strictly require consistent feature dimensions between the teacher and student, which calls for additional design work. In this paper, we find that most logits-based methods not only strictly constrain the logits variances of the teacher and student to be consistent, but also implicitly impose a restricted linear relationship. Therefore, we propose a new logits-based distillation method, named Doubly Relaxed Knowledge Distillation (DR-KD). Specifically, we use Z-score standardization instead of the softmax function to process logits, and achieve consistent logits variances through manual intervention. Subsequently, we introduce cosine similarity to measure the knowledge transfer between the teacher and student networks. Through derivation, we find that the Pearson coefficient can be directly and equivalently used to compute the similarity between their original logits. The linear relationship captured by the Pearson coefficient is more general, with the restricted linear relationship as a special case. Our method thus breaks the constraint on variance consistency and relaxes the restriction on linear relationships. Extensive experiments and ablation studies show that DR-KD enhances the discriminative learning capability of the student and outperforms various state-of-the-art competitors on several challenging face recognition benchmarks, such as IJB-B and IJB-C.
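The core identity the abstract derives — that cosine similarity between Z-score-standardized logits equals the Pearson correlation of the raw logits — can be checked numerically. The sketch below is a minimal illustration of that relationship, not the authors' implementation; the function name and the epsilon constants are our own assumptions.

```python
import numpy as np

def drkd_similarity(teacher_logits, student_logits, eps=1e-8):
    """Hypothetical sketch of the DR-KD idea: standardize logits with
    Z-score (instead of softmax), then score teacher-student agreement
    with cosine similarity. For standardized vectors this cosine equals
    the Pearson correlation of the original (raw) logits."""
    def zscore(x):
        return (x - x.mean()) / (x.std() + eps)

    t = zscore(teacher_logits)
    s = zscore(student_logits)
    # Cosine similarity between the standardized logits.
    cos = np.dot(t, s) / (np.linalg.norm(t) * np.linalg.norm(s) + eps)
    # Pearson correlation computed directly on the raw logits.
    pearson = np.corrcoef(teacher_logits, student_logits)[0, 1]
    return cos, pearson
```

Because Z-score standardization centers the logits, and cosine similarity is scale-invariant, the two quantities coincide; a distillation loss under this view could be `1 - cos`, which is maximized when the student's logits are an affine (rather than strictly identical) transform of the teacher's.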
