Abstract: This article addresses a shortcoming of OpenPose, a bottom-up human pose estimation network: its large number of parameters. We modify the feature extraction network and the prediction network of the OpenPose model to make the model lightweight. In the feature extraction network, we replace the VGG19 backbone of the original model with ResNet18, which has fewer parameters and higher accuracy. To further reduce the parameter count, we replace part of the convolution kernels in the prediction network with depthwise separable convolutions, without losing too much recognition accuracy. Human actions are then classified by an artificial neural network in which a linear module is added to the traditional nonlinear network to improve its memorization and generalization ability. The results show that the frame rate (FPS) of the lightweight OpenPose model is 9% to 16% higher than that of the original model. After 3000 training iterations, the recognition accuracies for standing, sitting, walking, sitting down, and standing up reach 0.877, 0.835, 0.793, 0.815, and 0.808, respectively. Finally, the recognition network is applied to a real scene, where the proposed method runs normally on embedded devices and performs well.
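To illustrate why depthwise separable convolution reduces the parameter count, the following minimal sketch compares the weights of a standard convolution layer with those of its depthwise separable counterpart. The kernel size and channel counts (3x3, 128 input and 128 output channels) are assumptions chosen for illustration and are not taken from the network described here; biases are ignored.

```python
def standard_conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a standard convolution: one k x k filter per (input, output) channel pair."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a depthwise separable convolution.

    Depthwise step: one k x k filter per input channel.
    Pointwise step: a 1 x 1 convolution that mixes the input channels into the output channels.
    """
    depthwise = kernel_size * kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# Hypothetical layer shape, assumed for illustration only.
k, c_in, c_out = 3, 128, 128
standard = standard_conv_params(k, c_in, c_out)          # 147456
separable = depthwise_separable_params(k, c_in, c_out)   # 17536
ratio = separable / standard                             # equals 1/c_out + 1/k**2
print(f"standard={standard}, separable={separable}, ratio={ratio:.3f}")
```

Under these assumed dimensions the separable layer needs roughly 12% of the weights of the standard layer, since the ratio is 1/c_out + 1/k^2.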