Abstract: Fine-grained images are difficult to distinguish because inter-class differences are small. To address this problem, this paper proposes an improved Transformer-based fine-grained recognition algorithm that strengthens the network's ability to represent image detail features. First, a deformable convolutional token embedding adaptively adjusts the sampling points, modifying the operating range and shape of the convolution kernel; this improves the network's perception of spatial information and yields more accurate spatial details. Second, an efficient correlation channel attention mechanism shifts the computation from neighboring channels to automatically selected, semantically similar channels, capturing semantically related channel information. Together, the precise spatial information and the semantically related channel information enhance the network's perception of local features. Experimental results show that, compared with the baseline algorithms, the proposed method improves recognition accuracy by 1.5%, 2.4%, and 1.5% on the CUB-200-2011, Stanford Cars, and Stanford Dogs datasets, respectively. These results indicate that the proposed approach improves fine-grained image recognition by strengthening the representation of image detail features.
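The channel attention described above can be illustrated with a minimal sketch. The code below is an assumption-laden toy version, not the paper's exact formulation: channel descriptors are obtained by flattening each channel, cosine similarity selects the k most similar channels for each channel (rather than its k physically adjacent channels, as in plain ECA-style attention), and a sigmoid of their pooled responses produces the per-channel weight. The function name, the top-k rule, and the aggregation are all illustrative choices.

```python
import numpy as np

def correlation_channel_attention(x, k=3):
    """Toy similarity-based channel attention on a feature map x of shape (C, H, W).

    For each channel, the attention weight is computed from the k channels whose
    spatial responses are most similar to it (semantic neighbors), instead of the
    k channels adjacent in channel index. Illustrative sketch only.
    """
    C, H, W = x.shape
    flat = x.reshape(C, -1)                                   # (C, H*W) per-channel responses
    norm = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    sim = norm @ norm.T                                       # (C, C) cosine similarity
    gap = flat.mean(axis=1)                                   # (C,) global-average-pooled descriptors
    weights = np.empty(C)
    for c in range(C):
        topk = np.argsort(sim[c])[::-1][:k]                   # indices of the k most similar channels
        weights[c] = 1.0 / (1.0 + np.exp(-gap[topk].mean()))  # sigmoid over pooled similar channels
    return x * weights[:, None, None]                         # reweight each channel

x = np.random.rand(8, 4, 4)
y = correlation_channel_attention(x, k=3)
```

Because the weights lie in (0, 1), the output is an elementwise reweighting of the input feature map, with each channel's weight driven by its semantically nearest channels.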