Abstract:Aiming at the problems of less training data and high labeling cost of named entity recognition in the military aircraft maintenance field. The paper proposed an improved named entity recognition method based on pre-training BERT. Firstly, learn from the idea of remote supervision, we fuse the boundary features of remote Tag word on token to get the feature fusion vector. Then the vector is sent to BERT to generate a dynamic word vector representation. Finally, the CRF model is connected to get the global optimal result of the sequence. Experiments are carried out on the self-built dataset, and the F1 value reaches 0.861. In order to compress the model parameters, the trained BERT-CRF model is used to generate pseudo label data, and the student model BiGRU-CRF with less parameters is trained in combination with knowledge distillation technology. The experimental results show that compared with the teacher model, the student model reduces 95.2% of the parameters and 47% of the reasoning time at the cost of losing 2% of the F1 value.