Abstract: As a pre-processing step for OCR, layout segmentation is receiving increasing attention from both academia and industry. To address common problems in layout segmentation, namely slow detection, inaccurate boundary detection of target regions, and missed small targets, the YOLOv7-MSY model is proposed. First, a Multi-WHFPN network structure is built on the idea of residual connections; it introduces trainable weights that reflect the importance of each feature and adds a small-target detection head to improve small-target detection. Second, the SimAM attention mechanism is introduced to estimate three-dimensional feature weights without adding extra parameters, enhancing important features and suppressing uninformative ones. Finally, YEIOU replaces the original model's localization loss function, improving both the convergence speed and the regression accuracy of the model. Experiments on a dataset provided by the Jiangsu Provincial Archives show that YOLOv7-MSY is more sensitive to the boundaries of target regions and performs better on small targets. The mAP@.5 of YOLOv7-MSY reaches 0.871, which is 7.84% higher than that of the original YOLOv7 model. The model outperforms the other layout segmentation algorithms compared, generalizes well, and segments layouts at a relatively high speed.
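To make the parameter-free weighting concrete, the following is a minimal sketch of SimAM-style attention in plain Python. It follows the commonly published SimAM formulation (an inverse-energy term per activation, passed through a sigmoid) and is not taken from the YOLOv7-MSY implementation. The function names `simam_channel` and `simam`, the nested-list representation of a feature map, and the default `lam=1e-4` are assumptions made for illustration.

```python
import math


def simam_channel(channel, lam=1e-4):
    """Reweight one channel (a 2D list of floats) with SimAM-style attention.

    Illustrative sketch: `lam` is an assumed regularization constant,
    and no trainable parameters are involved.
    """
    values = [v for row in channel for v in row]
    n = len(values) - 1  # number of "other" activations in the channel
    if n < 1:
        raise ValueError("channel needs at least two elements")
    mean = sum(values) / len(values)
    # Squared deviation of every activation from the channel mean.
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    variance = sum(d for row in sq_dev for d in row) / n
    out = []
    for row, dev_row in zip(channel, sq_dev):
        out_row = []
        for v, d in zip(row, dev_row):
            # Inverse energy: larger for activations that stand out.
            inv_energy = d / (4.0 * (variance + lam)) + 0.5
            weight = 1.0 / (1.0 + math.exp(-inv_energy))  # sigmoid
            out_row.append(v * weight)
        out.append(out_row)
    return out


def simam(feature_map, lam=1e-4):
    """Apply the reweighting independently to each channel of a C x H x W map."""
    return [simam_channel(channel, lam) for channel in feature_map]


if __name__ == "__main__":
    # One 2 x 2 channel whose bottom-right activation stands out.
    print(simam([[[1.0, 1.0], [1.0, 5.0]]]))
```

Because each weight is computed from per-channel statistics alone, the module adds no learnable parameters; activations that deviate most from the channel mean receive the largest weights.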