Abstract: Cracks are among the most common forms of road distress, and identifying and localizing them promptly and accurately is of great significance for road maintenance and continued healthy operation. However, pavement crack detection is easily affected by complex factors such as illumination, shadows, and the surrounding road environment, which leads to low segmentation accuracy and to segmented cracks that are prone to discontinuities. To achieve fast and accurate semantic segmentation of pavement crack images, this paper proposes a pavement crack segmentation model based on the pixel intensity order transform (PIOT) and UNetFormer. First, the PIOT algorithm is used to preprocess the crack images: according to the intensity order between each pixel and its neighboring pixels, the image is converted into a four-channel image with higher contrast along the four diagonal directions, which retains the intrinsic features of the curvilinear crack structure and effectively enhances the contrast between crack and background pixels. Then, drawing on the structural characteristics of the UNet and Transformer networks, a UNetFormer segmentation model is constructed to perform high-precision segmentation of pavement cracks, in which a global-local attention mechanism is designed and applied to fully capture crack feature information. Finally, three open-source datasets, CFD, Crack200, and Crack500, are used for validation. The experimental results show that the proposed crack segmentation model reaches F1-scores of 83.4%, 82.6%, and 81.9% on these datasets, respectively, while its parameter count is only 37.7% of that of the UNet model. Compared with existing crack segmentation models, the proposed model provides higher segmentation accuracy and stronger generalization ability.
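For reference, the F1-score reported above is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The abstract does not state how predictions are matched to the ground truth (some crack benchmarks allow a small pixel tolerance), so the following is only a minimal sketch that assumes strict pixel-wise matching on binary masks. The function name `pixel_f1` and the nested-list mask format are illustrative assumptions, not part of the proposed method.

```python
def pixel_f1(pred, target):
    """Pixel-wise F1-score between two binary masks (hypothetical helper).

    Both masks are nested sequences of 0/1 values with identical shape;
    1 marks a crack pixel. Strict pixel matching is an assumption here.
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    tp = fp = fn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p, t = bool(p), bool(t)
            if p and t:
                tp += 1  # crack pixel correctly predicted
            elif p:
                fp += 1  # background predicted as crack
            elif t:
                fn += 1  # crack pixel missed

    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    target = [[1, 0, 0], [0, 1, 1]]
    print(f"F1 = {pixel_f1(pred, target):.3f}")
```

Returning 0.0 when there are no true positives is a convention chosen for this sketch; an evaluation pipeline may instead skip such images or aggregate the counts over a whole dataset before computing the score.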