Abstract: Traffic police gesture recognition suffers from low accuracy and poor real-time performance under uneven illumination and complex backgrounds. To address these problems, this paper proposes an improved algorithm based on the YOLOv5 network model. Some of the convolution layers are replaced by self-calibrated convolutions to enlarge the receptive field, and a Shuffle Attention module is introduced to improve the feature extraction ability of the algorithm. To cope with the complex and changeable environments in which traffic police work, the Focal Loss function is replaced by the Generalized Focal Loss function to improve the representation of the target bounding box in complex environments. Experimental results show that, while maintaining real-time performance, the improved algorithm reaches an average accuracy of 98.54% for traffic police gesture detection, which is 3.39% higher than that of the original algorithm, and achieves a lower loss value.
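To illustrate the loss replacement mentioned above, the following is a minimal sketch comparing binary Focal Loss with the Quality Focal Loss term of Generalized Focal Loss. It is not the paper's implementation: which components of Generalized Focal Loss are used, the exponent values, and the function names `focal_loss` and `quality_focal_loss` are all assumptions made for illustration. The sketch only shows that the generalized form accepts a soft label in [0, 1] and reduces to Focal Loss (without the alpha weighting) when the label is exactly 0 or 1.

```python
import math

EPS = 1e-12  # assumed clamp to avoid log(0); not taken from the paper


def _clamp(p):
    """Keep a probability strictly inside (0, 1)."""
    return min(max(p, EPS), 1.0 - EPS)


def focal_loss(p, y, gamma=2.0):
    """Binary Focal Loss (alpha weighting omitted) for a hard label y in {0, 1}."""
    p = _clamp(p)
    p_t = p if y == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def quality_focal_loss(p, y, beta=2.0):
    """Quality Focal Loss for a soft label y in [0, 1].

    The cross-entropy term is scaled by |y - p|**beta, so the loss
    vanishes when the prediction matches the soft label.
    """
    p = _clamp(p)
    cross_entropy = -((1.0 - y) * math.log(1.0 - p) + y * math.log(p))
    return abs(y - p) ** beta * cross_entropy


if __name__ == "__main__":
    # Hard labels: both losses agree when beta == gamma.
    print(focal_loss(0.7, 1), quality_focal_loss(0.7, 1.0))
    # Soft label (e.g. an IoU-like quality score): only the generalized form applies.
    print(quality_focal_loss(0.7, 0.6))
```

In this sketch the soft label could stand for a localization quality score such as IoU, which is how the generalized loss ties the classification score to box quality; whether the paper uses it in this way is not stated in the abstract.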