A Transformer Architecture with Adaptive Attention for Fine-Grained Visual Classification

Changli Cai, Tiankui Zhang, Zhewei Weng, Chunyan Feng, Yapeng Wang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

5 Citations (Scopus)

Abstract

Fine-grained visual classification (FGVC) is the problem of distinguishing subclasses within the same superclass. Because the subclasses are similar to one another, the problem requires capturing fine-grained discriminative features. Although current approaches extract more fine-grained features by designing complex feature extraction modules, their excessive focus on discriminative features causes them to ignore a large amount of global feature information and reduces their robustness to background noise. This paper proposes a transformer architecture with adaptive attention (TransAA), based on the vision transformer (ViT). To optimize the attention of ViT, we design two modules. An attention-weakening module forces the model to capture more feature information, and an attention-enhancement module strengthens the extraction of critical features. In addition, we introduce a sample-weighting loss function in the training process to adaptively adjust both the weakening and enhancement processes. The performance of TransAA is demonstrated on three benchmark fine-grained datasets.

Original language: English
Title of host publication: 2021 7th International Conference on Computer and Communications, ICCC 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 863-867
Number of pages: 5
ISBN (Electronic): 9781665409506
DOIs
Publication status: Published - 2021
Event: 7th International Conference on Computer and Communications, ICCC 2021 - Chengdu, China
Duration: 10 Dec 2021 to 13 Dec 2021

Publication series

Name: 2021 7th International Conference on Computer and Communications, ICCC 2021

Conference

Conference: 7th International Conference on Computer and Communications, ICCC 2021
Country/Territory: China
City: Chengdu
Period: 10/12/21 to 13/12/21

Keywords

  • adaptive attention
  • fine-grained visual classification
  • vision transformer
