Abstract: Data samples from oil-immersed transformers are difficult to label, labeled samples are scarce, and traditional fault diagnosis methods have low accuracy. To address these problems, a two-layer fault diagnosis model for oil-immersed transformers with few labels is proposed, based on GBDT and K-means gain clustering. First, a stacked autoencoder reduces the dimension of the high-dimensional characteristic gas features that describe the transformer state, removing redundant information and yielding a low-dimensional feature vector that captures the transformer operating state and serves as the input to the subsequent classifier. Second, a two-layer fault diagnosis model is constructed. The GBDT method is introduced as the first layer of the model to obtain pseudo-labels for the unlabeled samples. To further improve diagnosis accuracy, a K-means clustering gain based on the pseudo-labels of the unlabeled samples is proposed as a new feature vector, which is fed into the second-layer K-means model to perform a secondary diagnosis. Experimental analysis shows that the proposed method effectively improves the accuracy of transformer fault diagnosis when few labels are available, with diagnosis accuracy at least 6% higher than that of the compared methods. The approach provides a new idea for fault diagnosis of oil-immersed transformers with few labels.
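The two-layer idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the inputs are taken to be already reduced feature vectors (the stacked autoencoder step is omitted); a nearest-class-centroid rule stands in for GBDT in the first layer so that the example needs only the Python standard library; and the "clustering gain" feature is assumed to be a weighted one-hot encoding of the first-layer label appended to each vector. All function names and the `weight` parameter are hypothetical.

```python
from statistics import fmean


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [fmean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def first_layer_labels(labeled_x, labeled_y, unlabeled_x, n_classes):
    """Layer 1 stand-in: nearest class centroid (the abstract uses GBDT here).

    Assumes every class has at least one labeled sample.
    """
    centers = [
        centroid([x for x, y in zip(labeled_x, labeled_y) if y == c])
        for c in range(n_classes)
    ]
    return [
        min(range(n_classes), key=lambda c: sq_dist(x, centers[c]))
        for x in unlabeled_x
    ]


def augment(x, label, n_classes, weight=1.0):
    """Append a weighted one-hot label to x (assumed form of the gain feature)."""
    return list(x) + [weight if c == label else 0.0 for c in range(n_classes)]


def kmeans_refine(points, init_labels, n_classes, n_iter=20):
    """Layer 2: K-means whose clusters are seeded by the first-layer labels."""
    labels = list(init_labels)
    for _ in range(n_iter):
        # Recompute one center per non-empty cluster.
        centers = {}
        for c in range(n_classes):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = centroid(members)
        # Reassign every point to its closest center.
        new_labels = [
            min(centers, key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


def diagnose(labeled_x, labeled_y, unlabeled_x, n_classes, weight=1.0):
    """Run both layers and return one class index per unlabeled sample."""
    first = first_layer_labels(labeled_x, labeled_y, unlabeled_x, n_classes)
    points = [augment(x, lab, n_classes, weight) for x, lab in zip(unlabeled_x, first)]
    return kmeans_refine(points, first, n_classes)


if __name__ == "__main__":
    labeled_x = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [9.0, 10.0]]
    labeled_y = [0, 0, 1, 1]
    unlabeled_x = [[0.0, 1.0], [1.0, 1.0], [9.0, 9.0], [10.0, 9.0]]
    print(diagnose(labeled_x, labeled_y, unlabeled_x, n_classes=2))  # [0, 0, 1, 1]
```

Because the clusters are seeded from the first-layer labels, each cluster index corresponds directly to a fault class, so no separate cluster-to-class matching is needed. The `weight` argument controls how strongly the first-layer decision pulls a sample toward its initial cluster; a weight of 0 reduces the second layer to plain K-means on the original features.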