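Research on a Fine-Grained Image Recognition Algorithm Based on an Improved Transformer
DOI:
CSTR:
Authors:
Affiliation:

School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo 454000, China

Author biography:

Corresponding author:

CLC number:

TP391

Fund project:

Supported by the Henan Province Science and Technology Research Project (222102210230) and the Henan Polytechnic University Doctoral Fund (B2018-33)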


    Abstract:

    Fine-grained image recognition is difficult because inter-class differences are small and categories are hard to tell apart. This paper addresses the problem by strengthening the network's ability to represent fine image details, and designs a fine-grained recognition algorithm based on an improved Transformer. First, a deformable convolutional token embedding adaptively adjusts the positions of its sampling points, which changes the range of the convolution and the effective shape of its kernel; this strengthens the model's perception of spatial information and yields more precise spatial information. Second, an efficient correlation channel attention mechanism automatically selects channels so that channel attention is computed over semantically similar channels rather than adjacent ones, capturing information from semantically similar channels. Together, the precise spatial information and the semantically similar channel information improve the model's perception of local features. Experiments show that, compared with the baseline algorithm, the proposed method improves recognition results on the CUB-200-2011, Stanford Cars, and Stanford Dogs datasets by 1.5%, 2.4%, and 1.5%, respectively. These results indicate that improving the representation of fine-grained image details effectively improves fine-grained image recognition performance.
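The abstract does not give the details of the deformable convolutional token embedding, so the following is only a minimal sketch of the general idea of deformable sampling, not the authors' implementation. It assumes a single-channel feature map, a 3x3 kernel, and hand-written offsets; in a real network the offsets would be predicted by a learned layer. The names `bilinear_sample` and `deformable_conv_at` are hypothetical.

```python
import math


def bilinear_sample(feature, y, x):
    """Interpolate a 2D list at fractional (y, x); points outside the map count as 0."""
    height, width = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yi, wy in ((y0, 1.0 - (y - y0)), (y0 + 1, y - y0)):
        for xi, wx in ((x0, 1.0 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yi < height and 0 <= xi < width:
                value += wy * wx * feature[yi][xi]
    return value


def deformable_conv_at(feature, kernel, center, offsets):
    """Evaluate one 3x3 deformable convolution output at `center`.

    offsets[i] is a (dy, dx) shift added to the i-th point of the regular
    3x3 grid. Assumption: offsets are supplied by hand here for illustration.
    """
    cy, cx = center
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    weights = [w for row in kernel for w in row]
    total = 0.0
    for (gy, gx), (oy, ox), weight in zip(grid, offsets, weights):
        # The sampling location is the regular grid point plus its offset.
        total += weight * bilinear_sample(feature, cy + gy + oy, cx + gx + ox)
    return total


# Toy 4x4 feature map whose value grows linearly with row and column.
feature = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]  # averaging kernel, illustration only

zero_offsets = [(0.0, 0.0)] * 9       # reduces to an ordinary 3x3 convolution
shifted_offsets = [(0.0, 0.5)] * 9    # hypothetical: every sample moves half a pixel right

regular = deformable_conv_at(feature, kernel, (1, 1), zero_offsets)
shifted = deformable_conv_at(feature, kernel, (1, 1), shifted_offsets)
print(regular, shifted)
```

With all offsets set to zero the result matches an ordinary 3x3 convolution at the same position; non-zero offsets move the sampling points, which is how the effective range and shape of the kernel change.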

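The contrast between attending to adjacent channels and attending to semantically similar channels can be illustrated with a small sketch. The abstract does not specify how the efficient correlation channel attention selects channels, so everything below is an assumption made for illustration: cosine similarity between flattened channels stands in for "semantic similarity", a top-k rule stands in for "automatic selection", and the function names are hypothetical. It is not the authors' method.

```python
import math


def flatten(channel):
    """Turn one 2D channel into a flat list of activations."""
    return [v for row in channel for v in row]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def adjacent_neighbors(index, num_channels, kernel_size):
    """Index-based grouping: channels inside a 1D window centered on `index`."""
    half = kernel_size // 2
    return [j for j in range(index - half, index + half + 1) if 0 <= j < num_channels]


def semantic_neighbors(index, flat_channels, top_k):
    """Assumed similarity-based grouping: the top_k channels most similar to `index`."""
    scores = [(cosine_similarity(flat_channels[index], other), j)
              for j, other in enumerate(flat_channels)]
    best = sorted(scores, key=lambda s: (-s[0], s[1]))[:top_k]
    return sorted(j for _, j in best)


def channel_attention(channels, groups):
    """Weight per channel: sigmoid of the mean pooled descriptor over its group."""
    pooled = [sum(flatten(c)) / len(flatten(c)) for c in channels]
    weights = []
    for group in groups:
        mixed = sum(pooled[j] for j in group) / len(group)
        weights.append(1.0 / (1.0 + math.exp(-mixed)))
    return weights


channels = [
    [[1.0, 1.0], [0.0, 0.0]],  # channel 0: responds to the top row
    [[0.0, 1.0], [0.0, 1.0]],  # channel 1: responds to the right column
    [[2.0, 2.0], [0.0, 0.0]],  # channel 2: same pattern as channel 0
    [[0.0, 3.0], [0.0, 3.0]],  # channel 3: same pattern as channel 1
]
flat = [flatten(c) for c in channels]

adjacent_groups = [adjacent_neighbors(i, len(channels), 3) for i in range(len(channels))]
semantic_groups = [semantic_neighbors(i, flat, 2) for i in range(len(channels))]
weights = channel_attention(channels, semantic_groups)

print("adjacent:", adjacent_groups)
print("semantic:", semantic_groups)
```

In this toy input, channels 0 and 2 share a pattern but are not neighbors by index, so an index window mixes channel 0 with channel 1, while the similarity-based grouping pairs channel 0 with channel 2.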
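Cite this article

Li Bingfeng, Liu Shuai, Yang Yi. Research on a fine-grained image recognition algorithm based on an improved Transformer[J]. Electronic Measurement Technology, 2024, 47(2): 114-120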

History
  • Published online: 2024-04-30