    Abstract:

    To address weak feature extraction and the wide range of object scales in UAV aerial images, this paper proposes an improved YOLOv8 object detection algorithm. First, a P2 detection layer is added to strengthen small-object detection. Second, a bidirectional feature alignment fusion method is designed to improve the neck: by combining a feature alignment module with a bidirectional feature pyramid, it improves multi-scale fusion and yields more complete feature fusion. Third, a bi-level routing-spatial attention module, which connects a bi-level routing attention module to a spatial attention module, is added to the backbone to strengthen the capture of target features. Finally, a Focaler-XIoU loss function is designed to reduce the influence of the sample difficulty distribution on bounding-box regression and to improve the model's stability and detection performance. Experiments show that the improved model raises mAP50 on the VisDrone dataset by 9.2%, outperforms current mainstream object detection algorithms, and handles the UAV aerial image detection task well.
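
The abstract does not define Focaler-XIoU, so the following is only a minimal sketch of the interval remapping used in the published Focaler-IoU formulation, which the loss name suggests it builds on. The XIoU term and the way the two are combined are not stated here and are deliberately left out. The function names and the default thresholds `d` and `u` are illustrative assumptions, not the authors' settings.

```python
def box_iou(a, b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def focaler_iou(iou, d=0.0, u=0.95):
    """Linearly remap IoU from the interval [d, u] onto [0, 1].

    Values below d become 0 and values above u become 1, so the
    regression signal concentrates on samples whose IoU lies in [d, u].
    The defaults for d and u are assumed values for illustration.
    """
    if u <= d:
        raise ValueError("u must be greater than d")
    return min(1.0, max(0.0, (iou - d) / (u - d)))


def focaler_iou_loss(pred, target, d=0.0, u=0.95):
    """Loss = 1 - remapped IoU (Focaler remapping only, no XIoU term)."""
    return 1.0 - focaler_iou(box_iou(pred, target), d, u)
```

Shifting the interval changes which samples drive the gradient: a low `[d, u]` range emphasizes hard, poorly overlapping boxes, while a high range emphasizes easy ones. This is the sense in which such a loss can respond to the sample difficulty distribution.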

    About the authors: Cheng Huanxin (程换新), male, Ph.D., professor; main research interests are artificial intelligence, advanced control, and machine vision. Lü Yukai (吕玉凯), male, master's student; main research interest is computer vision. Luo Xiaoling (骆晓玲), female, Ph.D., professor; main research interests are the optimization design and study of process equipment automation. Chi Ronghu (池荣虎), male, Ph.D., professor; main research interests are artificial intelligence and learning control.
History
  • Received: April 22, 2024
  • Revised: July 11, 2024
  • Accepted: July 11, 2024