This study takes 14 months of SCADA data from a wind turbine unit as its research subject. Because traditional Transformer models are structurally complex and require many parameters to be set, a Transformer model with a linear decoder structure is constructed and used to study fault prediction for wind turbine units based on SCADA data. The results show that the constructed model is stable over the long term, eliminates false predictions, and predicts faults six days in advance, which helps avoid sudden shutdowns caused by fault deterioration.
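
To make the idea of a linear decoder concrete, the following is a minimal sketch of a forward pass only, not the model used in the study. It assumes a single self-attention encoder layer whose output is flattened and mapped by one linear layer to a predicted row of SCADA values. The class name `LinearDecoderTransformer`, the parameters `n_features`, `window`, `d_model`, and `d_ff`, the random untrained weights, and the choice of prediction target are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    # Normalize each time step's feature vector to zero mean, unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


class LinearDecoderTransformer:
    """Hypothetical sketch: one encoder layer followed by a linear decoder.

    Weights are random and untrained; positional encoding is omitted for
    brevity. All sizes are assumptions, not values from the study.
    """

    def __init__(self, n_features, window, d_model=16, d_ff=32, seed=0):
        rng = np.random.default_rng(seed)

        def init(*shape):
            return rng.normal(0.0, 0.1, size=shape)

        self.w_in = init(n_features, d_model)   # input projection
        self.w_q = init(d_model, d_model)       # attention query weights
        self.w_k = init(d_model, d_model)       # attention key weights
        self.w_v = init(d_model, d_model)       # attention value weights
        self.w_ff1 = init(d_model, d_ff)        # feed-forward layer 1
        self.w_ff2 = init(d_ff, d_model)        # feed-forward layer 2
        # Linear decoder: a single affine map replaces the usual
        # attention-based Transformer decoder stack.
        self.w_dec = init(window * d_model, n_features)
        self.b_dec = np.zeros(n_features)

    def encode(self, x):
        # x has shape (batch, window, n_features).
        h = x @ self.w_in
        q, k, v = h @ self.w_q, h @ self.w_k, h @ self.w_v
        scores = q @ k.transpose(0, 2, 1) / np.sqrt(h.shape[-1])
        h = layer_norm(h + softmax(scores) @ v)          # attention + residual
        ff = np.maximum(h @ self.w_ff1, 0.0) @ self.w_ff2
        return layer_norm(h + ff)                        # feed-forward + residual

    def forward(self, x):
        h = self.encode(x)
        flat = h.reshape(h.shape[0], -1)                 # (batch, window * d_model)
        return flat @ self.w_dec + self.b_dec            # (batch, n_features)


# Usage on synthetic data: 4 windows of 24 time steps with 8 channels each.
model = LinearDecoderTransformer(n_features=8, window=24)
batch = np.random.default_rng(1).normal(size=(4, 24, 8))
pred = model.forward(batch)
print(pred.shape)
```

In a sketch like this, the only decoder parameters are `w_dec` and `b_dec`, which illustrates how a linear decoder can reduce the number of settings compared with a full attention-based decoder.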