Abstract: To address the confusion between left-right and front-back categories caused by similar visual features in current vehicle orientation scene recognition tasks, we propose a vehicle orientation scene recognition method that integrates global and local attention. We introduce the concept of multi-view vehicle scenes, use OSMNet for feature extraction and scene classification, and develop a global-local attention module (D-CBAM) that focuses on key areas across different orientation scenes to learn spatial orientation effectively. Additionally, we design a global-local positional attention module (HGLP) to address overlapping class distances between certain vehicle orientation scenes. In ablation studies on an 8-class scene dataset, the D-CBAM and HGLP modules effectively enhanced the capture of global and local information in feature maps, improving recognition accuracy by 3.54% and 4.22%, respectively. In comparative experiments, our model achieved an accuracy of 95.49%, which is 5.46% higher than the baseline model, and it outperformed other classification models in recognizing most orientations. These results demonstrate that the improved classification model effectively learns vehicle orientation information, bridging the gap in matching images across distant, intermediate, and near perspectives, and laying a foundation for tasks such as multi-part vehicle detection and segmentation.
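The abstract describes the D-CBAM global-local attention module only at a high level. As a concrete reference point, the sketch below shows the standard CBAM pattern that such a module builds on: channel (global) attention followed by spatial (local) attention over a backbone feature map. This is a minimal illustration under assumed defaults, not the paper's implementation; all class names and hyperparameters are hypothetical.

```python
# Minimal PyTorch sketch of a CBAM-style global-local attention block.
# Hypothetical names; the paper's D-CBAM details are not reproduced here.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Global branch: pool over spatial dims, then reweight channels."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling
        scale = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * scale


class SpatialAttention(nn.Module):
    """Local branch: highlight informative spatial locations."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)    # per-pixel channel average
        mx = x.amax(dim=1, keepdim=True)     # per-pixel channel max
        scale = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * scale


class GlobalLocalAttention(nn.Module):
    """Channel (global) attention followed by spatial (local) attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.spatial(self.channel(x))


# Usage: refine a backbone feature map before the classification head.
feats = torch.randn(2, 64, 32, 32)
print(GlobalLocalAttention(64)(feats).shape)  # torch.Size([2, 64, 32, 32])
```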