A Knowledge-Distillation-Integrated Pruning Method for Vision Transformer

Bangguo Xu, Tiankui Zhang, Yapeng Wang, Zeren Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Vision transformers (ViTs) have achieved remarkable results in computer vision applications such as image classification, object detection, and image segmentation. Because the self-attention mechanism can model relationships among all pixels of the input image, ViTs offer significantly better performance than traditional CNNs. However, their storage, runtime memory, and compute requirements hinder deployment on edge devices. This paper proposes a ViT pruning method integrated with knowledge distillation, which prunes the ViT model while avoiding the performance loss that pruning would otherwise cause. Building on the idea that knowledge distillation lets a student model improve by learning knowledge unique to its teacher, a convolutional neural network (CNN), whose parameter sharing and local receptive fields constitute abilities the ViT lacks, is used as the teacher to guide the training of the ViT model so that the ViT acquires the same abilities. In addition, pruning may remove important parts of the model and cause an irreversible loss of performance. To address this, the paper designs an importance-score learning module that guides pruning and ensures that only the unimportant parts of the model are removed. Finally, the pruned model is compared with other methods on ImageNet-1k in terms of accuracy, floating-point operations (FLOPs), and model parameters.
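The abstract describes two components: a distillation objective in which a CNN teacher guides the ViT student, and an importance-score learning module that decides which parts to prune. Below is a minimal PyTorch sketch of both ideas, assuming standard temperature-scaled logit distillation and one learnable score per prunable unit (e.g. per attention head); the function and class names, the loss weighting `alpha`, and the top-k masking rule are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Hypothetical KD objective: cross-entropy on ground-truth labels plus
    temperature-scaled KL divergence to the CNN teacher's logits."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * ce + (1.0 - alpha) * kd

class ImportanceScores(torch.nn.Module):
    """Hypothetical importance-score module: one learnable score per prunable
    unit (e.g. an attention head); units whose scores fall below the top-k
    threshold are masked out, approximating pruning during training."""
    def __init__(self, num_units):
        super().__init__()
        self.scores = torch.nn.Parameter(torch.ones(num_units))

    def forward(self, unit_outputs, keep_ratio=0.7):
        # unit_outputs: (batch, num_units, dim); keep the top-k units by score.
        k = max(1, int(keep_ratio * self.scores.numel()))
        threshold = torch.topk(self.scores, k).values.min()
        # Note: this hard mask blocks gradients to `scores`; a real
        # implementation would need a soft relaxation or straight-through
        # estimator to keep the scores trainable.
        mask = (self.scores >= threshold).float()
        return unit_outputs * mask.view(1, -1, 1)
```

In this sketch the teacher logits would come from a frozen CNN (e.g. a pretrained ResNet) run on the same batch as the ViT student; after training, units with low learned scores can be removed permanently to obtain the pruned model.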

Original language: English
Title of host publication: 2022 21st International Symposium on Communications and Information Technologies, ISCIT 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 210-215
Number of pages: 6
ISBN (Electronic): 9781665498517
DOIs
Publication status: Published - 2022
Event: 21st International Symposium on Communications and Information Technologies, ISCIT 2022 - Xi'an, China
Duration: 27 Sept 2022 - 30 Sept 2022

Publication series

Name: 2022 21st International Symposium on Communications and Information Technologies, ISCIT 2022

Conference

Conference: 21st International Symposium on Communications and Information Technologies, ISCIT 2022
Country/Territory: China
City: Xi'an
Period: 27/09/22 - 30/09/22

Keywords

  • knowledge distillation
  • network pruning
  • transformer pruning
  • vision transformer
