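Human pose estimation algorithm based on global-local feature fusion
DOI:
Authors:
Affiliations:

1. College of Mechanical and Electrical Engineering, Dalian Minzu University; 2. Dalian Minzu University

Author biography:

Corresponding author:

CLC number:

TP391.41

Funding:

National Natural Science Foundation of China (61673084); Natural Science Foundation of Liaoning Province (20170540192, 20180550866, 2020-MZLH-24)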

    Abstract:

    Existing human pose estimation algorithms often lose keypoint feature information because the backbone network extracts features insufficiently. To address this, a human pose estimation network model with a global-local feature fusion module (GLF-Net) is proposed. To obtain high-quality feature maps at the feature extraction stage, the algorithm improves the ResNet-50 backbone from both the global and the local perspective, designing a global polarized self-attention module and a local depthwise separable convolution module. The two modules are arranged in parallel, and the resulting module, which fuses global position information with local semantic information, is embedded in the Bottleneck layers of the backbone. This both strengthens the feature extraction ability of the original backbone and supplies effective global and local features to the subsequent Transformer network, thereby improving pose keypoint detection performance. The model is evaluated on the public COCO 2017 and MPII human pose estimation datasets. Compared with the baseline algorithm (Poseur), the average precision (AP) of pose keypoints increases by 2.1% and the average recall (AR) increases by 1.5%, and the percentage of correctly estimated keypoints (PCKh@0.5) reaches up to 90.6. The experimental results show that the proposed algorithm outperforms existing comparable methods in pose estimation accuracy and clearly improves the localization accuracy of human pose keypoints.
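    The PCKh@0.5 figure quoted above counts a predicted keypoint as correct when its distance to the ground-truth keypoint is within 0.5 times the person's head segment length. The following is a minimal sketch of that metric under stated assumptions, not the evaluation code used in the paper: the function name `pckh`, its argument layout, the `visible` mask, and the use of `<=` at the boundary are all illustrative choices, and the head sizes are assumed to be supplied by the caller.

```python
import math


def pckh(pred, gt, head_sizes, visible=None, threshold=0.5):
    """Percentage of keypoints whose error is within threshold * head size.

    pred, gt:    list of persons, each a list of (x, y) keypoints (hypothetical layout).
    head_sizes:  one head segment length per person, in the same units as the keypoints.
    visible:     optional per-person lists of booleans; False entries are skipped.
    """
    correct = 0
    total = 0
    for i, (p_person, g_person) in enumerate(zip(pred, gt)):
        limit = threshold * head_sizes[i]  # per-person distance tolerance
        for j, ((px, py), (gx, gy)) in enumerate(zip(p_person, g_person)):
            if visible is not None and not visible[i][j]:
                continue  # unlabeled joints do not count toward the score
            total += 1
            if math.hypot(px - gx, py - gy) <= limit:
                correct += 1
    return 100.0 * correct / total if total else 0.0


# One person, four keypoints, head size 10 -> tolerance of 5 units.
pred = [[(10.0, 10.0), (20.0, 20.0), (30.0, 30.0), (40.0, 48.0)]]
gt = [[(10.0, 12.0), (20.0, 20.0), (30.0, 36.0), (40.0, 40.0)]]
score = pckh(pred, gt, head_sizes=[10.0])  # errors are 2, 0, 6, 8 -> 50.0
```

    Normalizing by head size rather than by image size keeps the tolerance proportional to how large the person appears, so the same threshold is meaningful for near and distant subjects.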

History
  • Received: 2024-03-28
  • Last revised: 2024-05-27
  • Accepted: 2024-05-28
  • Published online:
  • Publication date: