Abstract: An improved U-Net semantic segmentation method for street images, based on a dual attention mechanism, is proposed to improve the segmentation of multi-scale targets and to strengthen feature extraction. After the fifth convolutional block of the U-Net encoder, a feature pyramid attention module is added to extract multi-scale features, fuse contextual information, and enhance target semantic features. In the decoding stage, instead of U-Net's feature concatenation, a joint spatial-channel attention module is designed to receive the low-level feature maps from the skip connections and the high-level feature maps from the preceding attention module. Experimental results on the Cityscapes dataset show that the introduced attention modules effectively improve street-view segmentation accuracy, with the mIoU metric improving by 2.0 to 9.6 percentage points over methods such as PSPNet and FCN.
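As a rough illustration of the joint spatial-channel attention fusion described above, the following NumPy sketch gates a high-level feature map with squeeze-and-excitation-style channel weights and a per-pixel spatial mask before combining it with the low-level map from the skip connection. All function names and the fusion-by-addition choice are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def channel_attention(x):
    # x: (C, H, W). Global average pooling gives one statistic per
    # channel; a sigmoid turns it into a gate in (0, 1).
    w = x.mean(axis=(1, 2))                # (C,)
    w = 1.0 / (1.0 + np.exp(-w))           # sigmoid gate
    return x * w[:, None, None]

def spatial_attention(x):
    # Mean over channels gives a per-pixel statistic; sigmoid again
    # maps it to a spatial mask in (0, 1).
    m = x.mean(axis=0)                     # (H, W)
    m = 1.0 / (1.0 + np.exp(-m))
    return x * m[None, :, :]

def fuse(low, high):
    # Hypothetical fusion: the high-level map (from the previous
    # attention module) is re-weighted in both the channel and spatial
    # domains, then merged with the low-level skip-connection map.
    gated = spatial_attention(channel_attention(high))
    return low + gated

low = np.random.rand(4, 8, 8)    # low-level features from the skip connection
high = np.random.rand(4, 8, 8)   # high-level features from the decoder path
out = fuse(low, high)
print(out.shape)                 # (4, 8, 8)
```

In a real network the gates would be learned (e.g. small convolutions or fully connected layers rather than plain means), but the sketch shows how the two attention domains compose before fusion.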