Abstract: In vehicle type recognition, vehicles of different models often look highly similar, while vehicles of the same model can look quite different, which poses a great challenge to the feature extraction network. Existing vehicle type classification schemes rely only on vehicle appearance features, and their overall recognition accuracy is limited. To address this, this paper first designs a multi-level attention mechanism in the backbone network to improve its ability to extract and discriminate vehicle features. Second, because a vehicle's appearance changes with its position in the image in the traffic checkpoint (bayonet) environment, a feature fusion structure combining vehicle location and appearance features is proposed; it extracts composite image features that incorporate location, reduces the intra-class feature distance, and enhances the expressiveness and robustness of the features extracted by the backbone network. Finally, based on an analysis of the attention heat maps of difficult samples, the attention regions for these samples are adjusted so that the network focuses on the local regions where vehicles differ only slightly. Experimental results show that the proposed vehicle type recognition method significantly outperforms existing schemes in overall performance.
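
The idea of fusing vehicle location with appearance features can be illustrated with a minimal sketch. It is not the fusion structure proposed in the paper, whose details are not given in the abstract. The sketch assumes that location is encoded as a normalized bounding box and that fusion is plain concatenation with an L2-normalized appearance vector. The function names, the `location_weight` parameter, and the box format are hypothetical.

```python
import math


def location_vector(box, image_size):
    """Encode a box (x, y, w, h in pixels) as normalized center and size.

    Assumption: the box format and this encoding are illustrative only.
    """
    x, y, w, h = box
    img_w, img_h = image_size
    return [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]


def l2_normalize(vec):
    """Scale a vector to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse_location_and_appearance(appearance, box, image_size, location_weight=1.0):
    """Concatenate a normalized appearance vector with a weighted location vector.

    Assumption: concatenation stands in for the paper's fusion structure.
    """
    loc = [location_weight * v for v in location_vector(box, image_size)]
    return l2_normalize(appearance) + loc


# Usage: a toy 2-D appearance feature and a vehicle box in an 800x400 frame.
fused = fuse_location_and_appearance([3.0, 4.0], (100, 50, 200, 100), (800, 400))
print(len(fused))
```

In a real network the appearance vector would come from the backbone and the fused vector would feed the classifier. A learned fusion layer would likely replace the fixed `location_weight` used here.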