2023, 46(12):134-142.
Abstract: In motorcycle road-traffic accidents, failure to wear a helmet is the leading cause of fatal injuries to riders. Current helmet detectors suffer from false and missed detections because black hair, hats, and helmets are similar in color and shape. To address this, a motorcycle helmet detection algorithm combining a triplet attention mechanism with bidirectional cross-scale feature fusion is proposed. First, a triplet attention mechanism is introduced into the YOLOv5s backbone network; it captures semantic dependencies between different dimensions, eliminates the indirect correspondence between channels and weights, and improves detection accuracy by attending to the differences between similar samples. Second, the EIoU bounding-box loss function is used to improve detection of occluded and overlapping objects. Finally, a weighted bidirectional feature pyramid network (BiFPN) structure is adopted in the feature pyramid to achieve efficient bidirectional cross-scale connections and weighted feature fusion, which enhances the network's feature extraction capability. Experimental results show that the improved algorithm achieves 98.7% mAP@0.5 and 94.0% mAP@0.5:0.95. Compared with the original algorithm, mAP@0.5 increases by 3.9% and mAP@0.5:0.95 by 7.6%, giving higher accuracy and stronger generalization ability.
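The EIoU loss named in the abstract can be illustrated for a single pair of boxes. The sketch below follows the commonly published EIoU formulation (IoU term plus center-distance, width, and height penalties normalized by the smallest enclosing box). The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for illustration; the abstract does not give the authors' implementation.

```python
def eiou_loss(box_a, box_b, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch only: box format and eps are assumptions,
    not details taken from the paper.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Widths and heights of the two boxes.
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Intersection over union.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)

    # Squared distance between the two box centers.
    center_dist2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
                 + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Penalties: center distance, width difference, height difference.
    dist_term = center_dist2 / (cw ** 2 + ch ** 2 + eps)
    width_term = (aw - bw) ** 2 / (cw ** 2 + eps)
    height_term = (ah - bh) ** 2 / (ch ** 2 + eps)

    return 1.0 - iou + dist_term + width_term + height_term
```

Because the penalty terms stay non-zero when the boxes do not overlap, the loss still distinguishes near misses from far misses, which is one plausible reason it helps with occluded and overlapping objects.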
2022, 45(4):91-98.
Abstract: Deep learning-based gesture recognition models have large parameter counts, train slowly, and demand high-end equipment, all of which raise cost. To address these problems, a gesture recognition and detection algorithm based on a lightweight convolutional neural network is proposed. First, the Ghost module is used to design a lightweight backbone feature extraction network, reducing the network's parameter count and computation. Second, the feature fusion network is improved by introducing a weighted bidirectional feature pyramid network, which raises detection accuracy. Finally, the CIoU loss is used as the bounding-box regression loss, and Mosaic data augmentation is added to speed up model convergence and improve the robustness of the network. Experimental results show that the improved model is only 17.9M in size, 92.4% smaller than the original YOLOv3 model, while the average accuracy increases by 0.6%. The new detection method therefore reduces the number of model parameters while maintaining accuracy and efficiency, providing a theoretical reference for gesture recognition and detection.
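To give a rough sense of why a Ghost module reduces parameters, the sketch below compares the weight count of an ordinary convolution with that of a Ghost module producing the same number of output channels. It follows the usual Ghost module description (a primary convolution producing a fraction of the channels, plus cheap depthwise operations generating the rest). The ratio `s`, the cheap-operation kernel size `d`, the layer sizes, and the function names are all assumptions for illustration; the abstract does not report these settings, and biases are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, s=2, d=3):
    """Weights in a Ghost module with ratio s and d x d cheap ops.

    Assumes c_out is divisible by s. The values of s and d are
    illustrative defaults, not settings taken from the paper.
    """
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by s")
    intrinsic = c_out // s                   # channels from the primary conv
    primary = c_in * intrinsic * k * k       # ordinary convolution weights
    cheap = (s - 1) * intrinsic * d * d      # depthwise 'cheap' operations
    return primary + cheap


if __name__ == "__main__":
    standard = conv_params(64, 128, 3)
    ghost = ghost_params(64, 128, 3)
    print(standard, ghost, round(standard / ghost, 2))
```

For this hypothetical 64-to-128-channel layer the reduction is close to the ratio `s`, which is the general behavior expected from the design; the 92.4% reduction reported in the abstract refers to the whole model and cannot be reproduced from this single-layer calculation.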