A Transformer Architecture with Adaptive Attention for Fine-Grained Visual Classification

Changli Cai, Tiankui Zhang, Zhewei Weng, Chunyan Feng, Yapeng Wang

Research output: Conference contribution › Peer-reviewed

4 citations (Scopus)

Abstract

The fine-grained visual classification (FGVC) problem is to classify different subclasses within the same superclass. Because subclasses are highly similar, the problem requires capturing fine-grained discriminative features. Although current approaches can extract finer-grained features by designing complex feature extraction modules, their excessive focus on discriminative features causes them to ignore massive amounts of global feature information and reduces their robustness to background noise. This paper proposes a transformer architecture based on the vision transformer (ViT) with adaptive attention (TransAA). To optimize the attention of ViT, we design two modules: an attention-weakening module that forces the model to capture more feature information, and an attention-enhancement module that strengthens the extraction of critical features. In addition, we introduce a sample-weighting loss function in the training process to adaptively adjust both the weakening and enhancement processes. The performance of TransAA is demonstrated on three benchmark fine-grained datasets.
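The abstract does not specify the internals of the two modules, but the general weakening/enhancement idea can be illustrated with a minimal sketch. The function names, the use of class-token attention scores, and the top-k masking mechanics below are assumptions for illustration only, not the paper's actual method:

```python
import numpy as np

def weaken_attention(patch_tokens, cls_attn, k=2):
    """Attention-weakening sketch (assumed mechanics): zero out the k
    most-attended patch tokens so the classifier must rely on broader,
    less locally discriminative regions."""
    top = np.argsort(cls_attn)[-k:]   # indices of the k highest-attention patches
    out = patch_tokens.copy()
    out[top] = 0.0                    # suppress the dominant patches
    return out, top

def enhance_attention(patch_tokens, cls_attn, k=2):
    """Attention-enhancement sketch (assumed mechanics): keep only the k
    most-attended patch tokens, focusing the model on critical features."""
    top = np.argsort(cls_attn)[-k:]
    mask = np.zeros(len(cls_attn), dtype=bool)
    mask[top] = True
    return patch_tokens * mask[:, None], top

# Toy example: 6 patch tokens of dimension 4, with hypothetical
# class-token attention scores over the patches.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 4))
attn = np.array([0.05, 0.30, 0.10, 0.25, 0.20, 0.10])

weak, suppressed = weaken_attention(tokens, attn, k=2)
enhanced, kept = enhance_attention(tokens, attn, k=2)
```

In this toy run, patches 1 and 3 carry the highest attention, so weakening zeroes exactly those two tokens while enhancement zeroes all the others; a sample-weighting loss, as the abstract describes, would then decide per sample how strongly each branch influences training.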

Original language: English
Title of host publication: 2021 7th International Conference on Computer and Communications, ICCC 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 863-867
Number of pages: 5
ISBN (electronic): 9781665409506
DOIs
Publication status: Published - 2021
Event: 7th International Conference on Computer and Communications, ICCC 2021 - Chengdu, China
Duration: 10 Dec 2021 - 13 Dec 2021

Publication series

Name: 2021 7th International Conference on Computer and Communications, ICCC 2021

Conference

Conference: 7th International Conference on Computer and Communications, ICCC 2021
Country/Territory: China
City: Chengdu
Period: 10/12/21 - 13/12/21

