Abstract: To address the problems that small-target features in remote sensing images are easily lost, easily disturbed by background noise, and difficult to localize, this paper improves the YOLOX-S object detection model. First, the two-dimensional discrete cosine transform is used to improve CBAM (Convolutional Block Attention Module), and the improved CBAM is added to the backbone network to strengthen the network's attention to small targets. Second, a weighted multi-receptive-field spatial pyramid pooling module is proposed to improve the perception of multi-scale targets, especially small-scale targets. Third, a cross-layer attention fusion module is proposed, based on the idea of cross-layer feature fusion, so that more small-target features are preserved in the deep layers of the network. Finally, the EIoU (Efficient Intersection over Union) loss is used to strengthen the localization of small targets. Extensive experiments show that, compared with the baseline model, the small-target average precision (APs) of the improved model increases by 5.1% on the RSOD dataset and by 2.4% on the DIOR dataset, while the number of parameters increases by only 1.01M and the detection speed reaches 93.6 frames·s⁻¹, which meets real-time detection requirements. In addition, the improved model also shows certain advantages over recent object detection models.
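For readers unfamiliar with the EIoU loss mentioned above, the following is a minimal sketch of its commonly cited form for a single pair of boxes: 1 - IoU, plus a normalized center-distance penalty, plus separate width and height penalties normalized by the smallest enclosing box. The function name `eiou_loss`, the corner-format box layout `(x1, y1, x2, y2)`, and the `eps` stabilizer are assumptions made for illustration; this is not the paper's implementation and may differ from it in detail.

```python
def eiou_loss(pred, target, eps=1e-9):
    """Illustrative EIoU loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper: the name, box layout, and eps are assumptions.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and target boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection area (zero when the boxes do not overlap).
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih

    # Union area and IoU.
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the
    # squared diagonal of the enclosing box.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, each normalized by the
    # corresponding side of the enclosing box.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    # A small box that is slightly offset from its target.
    print(eiou_loss((10, 10, 18, 18), (12, 11, 20, 19)))
```

Because the width and height terms penalize side-length errors directly, the loss keeps giving a useful gradient signal when the box is only a few pixels wide, which is one plausible reason it suits small targets.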