Abstract: Autonomous driving in cities requires efficient detection of the on-site command gestures of traffic police. To address the low recognition accuracy, slow detection speed, and poor robustness to complex road environments of existing gesture recognition algorithms, an improved YOLOX-tiny traffic police gesture recognition algorithm is proposed. First, an improved GhostNet replaces the original backbone network, and a Coordinate Attention mechanism is inserted to extract input image features more comprehensively, improving detection accuracy, particularly for small and medium-sized targets. Second, the decoupled head is redesigned as the SCDE Head structure, which reduces computational complexity while filtering redundant information and also integrates multi-scale features to improve detection accuracy. Finally, SIoU is used as the localization loss to accelerate network convergence and improve regression accuracy. On a self-built traffic police command gesture dataset, the improved algorithm, compared with the YOLOX-tiny model, reduces the number of parameters by 27.97% and the computational complexity by 33.31%, while increasing average detection accuracy by 2.31% and detection speed by 45%, making it better suited to the practical needs of autonomous driving and hardware deployment.
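
To give a rough sense of why a GhostNet-style backbone lowers the parameter count, the sketch below compares the weights in a standard convolution with those in a generic Ghost module, which runs a primary convolution to produce a fraction of the output channels and derives the rest with cheap depthwise operations. It is an illustration only, not the improved GhostNet used in this work: the function names, layer sizes, `ratio`, and `dw_k` are assumptions chosen for the example, and biases and batch normalization are ignored.

```python
import math


def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k=1, ratio=2, dw_k=3):
    """Weight count of a generic Ghost module (bias ignored).

    Assumed layout: a primary k x k convolution produces
    ceil(c_out / ratio) intrinsic channels, then a dw_k x dw_k
    depthwise convolution produces (ratio - 1) extra feature maps
    per intrinsic channel.
    """
    intrinsic = math.ceil(c_out / ratio)
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * dw_k * dw_k
    return primary + cheap


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    standard = conv_params(64, 128, 3)
    ghost = ghost_params(64, 128, k=3, ratio=2, dw_k=3)
    print(f"standard conv weights: {standard}")
    print(f"ghost module weights:  {ghost}")
    print(f"reduction factor:      {standard / ghost:.2f}")
```

With `ratio=2`, the Ghost module needs roughly half the weights of the standard convolution for this layer. The whole-model reduction reported above also depends on the redesigned head and other changes, so it cannot be derived from this per-layer estimate.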