Hardware accelerator design for skeleton recognition in spatio-temporal graph convolutional networks

Electronic Measurement Technology, 2024, Vol. 47, No. 11, pp. 36-43

Authors: Tan Huisheng, Yan Shuqi, Yang Wei
Affiliation: College of Railway Transportation, Hunan University of Technology, Zhuzhou 412000, China
CLC number: TN791
Fund project: Supported by the Hunan Provincial Degree and Postgraduate Teaching Reform Research Project (2022JGYB183)

Abstract:

As artificial intelligence technology advances, neural networks process ever larger volumes of data and their computational cost rises rapidly. To reduce the computation required by spatio-temporal graph convolutional networks (ST-GCN), lower the hardware resources needed to implement them, and speed up practical ST-GCN systems for human skeleton recognition, an ST-GCN-based skeleton recognition hardware accelerator was designed and developed on a field-programmable gate array (FPGA). Optimizing the structure of the original network model and quantizing its data reduced the computation of the FPGA implementation by about 75%. Exploiting the sparsity of the adjacency matrix, an optimization method for sparse-matrix multiply-accumulate operations was proposed, which reduced multiplier resource consumption by about 60%. Experiments on human skeleton recognition show that, at a clock frequency of 100 MHz and relative to a CPU, the FPGA achieves a speedup of 30.53 when accelerating the ST-GCN unit and a speedup of 6.86 when accelerating human skeleton recognition.

Key words: human skeleton recognition; spatio-temporal graph convolutional network (ST-GCN); hardware accelerator; field-programmable gate array (FPGA); hardware optimization of sparse matrix multiply-accumulate operations
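
The abstract does not describe how the sparse multiply-accumulate optimization works internally. As a rough illustration of the general idea only, the Python sketch below aggregates joint features through a skeleton adjacency matrix while multiplying only at nonzero entries, and counts the multiplications saved against a dense product. The 5-joint chain skeleton and the names `chain_adjacency`, `sparse_aggregate`, and `dense_aggregate` are assumptions made for this example. They are not taken from the paper, and the counts here say nothing about the roughly 60% figure reported for the authors' design.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def chain_adjacency(n_joints: int) -> Matrix:
    """Hypothetical toy skeleton: joints in a chain, with self-loops."""
    adj = [[0.0] * n_joints for _ in range(n_joints)]
    for i in range(n_joints):
        adj[i][i] = 1.0
        if i + 1 < n_joints:
            adj[i][i + 1] = 1.0
            adj[i + 1][i] = 1.0
    return adj


def sparse_aggregate(adj: Matrix, feats: Matrix) -> Tuple[Matrix, int]:
    """Compute adj @ feats, multiplying only where adj is nonzero.

    Returns the result and the number of multiplications performed.
    """
    n_ch = len(feats[0])
    out = [[0.0] * n_ch for _ in adj]
    mults = 0
    for i, row in enumerate(adj):
        for j, a in enumerate(row):
            if a == 0:
                continue  # zero entry: no multiplier needed
            for c in range(n_ch):
                out[i][c] += a * feats[j][c]
                mults += 1
    return out, mults


def dense_aggregate(adj: Matrix, feats: Matrix) -> Tuple[Matrix, int]:
    """Reference dense product that multiplies every entry, zeros included."""
    n_ch = len(feats[0])
    out = [
        [sum(adj[i][j] * feats[j][c] for j in range(len(feats))) for c in range(n_ch)]
        for i in range(len(adj))
    ]
    return out, len(adj) * len(feats) * n_ch


adj = chain_adjacency(5)
feats = [[float(i), float(i + 1)] for i in range(5)]  # 5 joints, 2 channels
sparse_out, sparse_mults = sparse_aggregate(adj, feats)
dense_out, dense_mults = dense_aggregate(adj, feats)
print(sparse_out == dense_out)    # True
print(sparse_mults, dense_mults)  # 26 50
```

In hardware the positions of the nonzero entries would typically be fixed at design time, since the skeleton graph does not change between frames, so the skip decision costs nothing at run time. That is a general observation about static graphs rather than a statement about this accelerator.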

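Cite this article:

Tan Huisheng, Yan Shuqi, Yang Wei. Hardware accelerator design for skeleton recognition in spatio-temporal graph convolutional networks[J]. Electronic Measurement Technology, 2024, 47(11): 36-43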

Published online: 2024-10-12
