Abstract: Due to the diverse heights and angles of drone shots, aerial images often have complex backgrounds and mainly contain small targets, so detection algorithms tend to perform poorly on them. To address this issue, this paper presents a vehicle detection method for aerial images based on an adaptive perception network. The goal is to improve small-target detection in two respects: enhancing the saliency of vehicle features and better preserving feature information. First, an adaptive perception feature extraction module is proposed to extract a more efficient feature representation; it captures long-range dependencies and stronger geometric feature representations to adaptively model object shapes. Second, a dual-branch spatial perception downsampling module is introduced to mitigate the information loss caused by downsampling and repeated pooling; it combines feature maps from different channels to maximally retain small-target feature information. Next, the feature fusion network incorporates shallow feature maps rich in spatial information and adds detection heads to strengthen small-target detection. Finally, a new dynamic regression loss function, DEIoU, is designed; it includes a penalty term measuring the correlation between the aspect ratios of the ground-truth and predicted boxes, further improving the prediction accuracy of the network. Experimental results on the VisDrone dataset show that the proposed method achieves a mean average precision (mAP) of 69.9% at an inference speed of 99.26 FPS, striking a good balance between speed and accuracy. Moreover, the proposed method achieves the best detection accuracy on the UCAS-AOD dataset, demonstrating strong generalization ability.