Abstract: Efficient and accurate detection of small targets in dense scenes is a key problem in target detection. To address the difficult feature extraction and low detection accuracy caused by diverse environments and complex small targets, a small target detection method for dense scenes based on TCYOLOX is proposed. First, a Transformer Encoder module is introduced into CSPNet so that target weights are continuously updated, which enhances target feature information and improves the network's feature extraction capability. Second, a convolutional attention mechanism module is added to the feature pyramid network to focus on important features and suppress unnecessary ones, improving detection accuracy for targets of different scales. Then, CIoU replaces IoU as the regression loss function, so that the network converges faster and performs better during training. Finally, the method is validated on the PASCAL VOC 2007 dataset. The experimental results show that the TCYOLOX model can effectively detect small targets under normal, dense, sparse, and dark conditions in diverse scenes. The mAP and detection speed reach 94.6% and 38 fps, which are 10.9% and 1 fps higher than those of the original model. The method is well suited to small target detection tasks in a variety of dense scenes.
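
To make the role of the CIoU regression loss concrete, the following is a minimal sketch of the commonly used CIoU formulation for a single pair of boxes. It is an illustration only, not the authors' implementation: the function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions introduced here.

```python
import math


def ciou_loss(box_pred, box_gt, eps=1e-9):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; box format and eps are assumptions.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU: intersection area over union area.
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - (iou - rho2 / c2 - alpha * v)


if __name__ == "__main__":
    # Non-overlapping boxes: a plain IoU loss would saturate at 1 here,
    # while the center-distance penalty still separates near from far boxes.
    print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
    print(ciou_loss((0, 0, 1, 1), (5, 5, 6, 6)))
```

In this sketch the center-distance and aspect-ratio penalties keep the loss informative even when the predicted and ground-truth boxes do not overlap, which is one commonly cited reason CIoU tends to converge faster than a plain IoU loss.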