Abstract: To address the low accuracy and slow speed of obstacle detection against complex rail transit backgrounds, an improved object detection network based on YOLOv5 is proposed. First, EMO, a lightweight attention-based Transformer backbone, replaces some modules in the original YOLOv5 backbone, which keeps the model lightweight while improving its accuracy and stability. Second, Focal-EIoU replaces the CIoU loss function in YOLOv5 to address the low training efficiency and slow convergence caused by CIoU. Finally, the lightweight upsampling operator CARAFE replaces the original upsampling layer in YOLOv5; it provides a larger receptive field without adding many parameters or much computational cost, and improves both detection accuracy and detection speed. Experimental results show that, compared with the original YOLOv5 network model, the proposed method improves mean average precision by 11.1%, precision by 13%, and recall by 11.4%, and reaches a detection speed of 60.7 frames per second. The proposed method performs well on the object detection task and effectively enhances detection performance in rail transit scenes.
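To make the loss replacement concrete, the following is a minimal plain-Python sketch of Focal-EIoU for a single pair of axis-aligned boxes. It assumes the commonly published formulation, in which the EIoU loss adds a center-distance penalty and separate width and height penalties (each normalized by the smallest enclosing box) to 1 - IoU, and the focal variant weights that loss by IoU raised to a power gamma. The function name `focal_eiou_loss`, the `(x1, y1, x2, y2)` box layout, the default `gamma=0.5`, and the `eps` stabilizer are illustrative assumptions, not details taken from this work.

```python
def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-9):
    """Focal-EIoU loss for one box pair (hypothetical helper).

    Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
    gamma and eps are assumed defaults, not values from the source.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the
    # squared diagonal of the enclosing box.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, penalized separately.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    eiou = 1.0 - iou + center_term + width_term + height_term

    # Focal weighting: scale the EIoU loss by IoU ** gamma.
    return (iou ** gamma) * eiou
```

Under this formulation the loss is close to zero for identical boxes, and the IoU weighting also drives it to zero for boxes that do not overlap at all, so a training pipeline would typically apply it only to predictions already matched to a ground-truth box.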