Abstract: Small targets in remote sensing images are difficult to detect because their features are easily lost, they are easily disturbed by background noise, and they are hard to localize. To address these problems, this paper improves the YOLOX-S target detection model. First, the convolutional block attention module (CBAM) is improved with the two-dimensional discrete cosine transform and added to the backbone network to strengthen the network's attention to small targets. Second, a weighted multi-receptive spatial pyramid pooling module is proposed to improve the model's perception of multi-scale targets, especially small-scale ones. Third, following the idea of cross-layer feature fusion, a cross-layer attention fusion module is proposed to retain as many small-target features as possible in the deep layers of the network. Finally, the EIoU loss is used to improve the localization of small targets. Extensive experiments show that, relative to the baseline model, the APs value of the improved model increases by 51% on the RSOD dataset and by 24% on the DIOR dataset, while the number of parameters increases by only 1.01 M. The detection speed reaches 93.6 fps, which meets the requirements of real-time detection. The improved model also shows certain advantages over current state-of-the-art target detection models.
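As a point of reference for the localization term named above, the following is a minimal sketch of the EIoU loss for a single pair of axis-aligned boxes, in its commonly published form: one minus IoU, plus a center-distance penalty, plus separate width and height penalties, each normalized by the smallest enclosing box. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for illustration; the exact variant and weighting used in this paper may differ.

```python
def eiou_loss(box, gt, eps=1e-9):
    """Illustrative EIoU loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper, not the paper's implementation.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt

    # Widths and heights of the predicted and ground-truth boxes.
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared distance between box centers, normalized by the
    # squared diagonal of the enclosing box.
    dx = (x1 + x2) / 2.0 - (gx1 + gx2) / 2.0
    dy = (y1 + y2) / 2.0 - (gy1 + gy2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Separate width and height penalties, normalized by the
    # enclosing box's width and height respectively.
    width_term = (w - gw) ** 2 / (cw * cw + eps)
    height_term = (h - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term
```

Because width and height are penalized directly rather than through an aspect-ratio term, the loss still provides a useful signal when a predicted box has the right shape but the wrong size, which is one reason it is often considered for small targets.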