Abstract: Traditional speech enhancement networks perform poorly on unseen noise. To address this problem, this paper proposes improvements in three areas: spectrogram enhancement, network structure, and the feature fusion mechanism. First, to extract deep feature information from the spectrogram, a VGG19 structure replaces the encoder of the UNet, and a residual network is added to the decoder to deepen the network while preventing training degradation. Second, to better combine the feature information in the spectrogram, an adaptive feature fusion mechanism is added to the skip connections of the UNet to fuse deep and shallow features. In addition, to enhance speaker information, a histogram equalization algorithm is applied to the spectrogram, yielding a histogram-equalized enhanced spectrogram. Across different noise environments, the proposed method outperforms other enhancement methods in terms of speech quality and intelligibility.
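To make the histogram equalization step concrete, the following is a minimal sketch of global histogram equalization applied to a 2-D spectrogram using NumPy. The abstract does not specify the exact variant used, so this is an illustration under assumptions rather than the paper's implementation: the function name `equalize_spectrogram`, the `n_bins` parameter, and the choice to rescale the output to the range 0 to 1 are all hypothetical.

```python
import numpy as np


def equalize_spectrogram(spec, n_bins=256):
    """Globally histogram-equalize a 2-D spectrogram (hypothetical helper).

    Each value is mapped through the empirical CDF of all values,
    so the output lies between 0 and 1.
    """
    spec = np.asarray(spec, dtype=np.float64)
    lo, hi = spec.min(), spec.max()
    if hi == lo:
        # A constant spectrogram carries no contrast to redistribute.
        return np.zeros_like(spec)

    # Histogram of all time-frequency values over the observed range.
    hist, bin_edges = np.histogram(spec, bins=n_bins, range=(lo, hi))

    # Normalized cumulative distribution, ending at exactly 1.0.
    cdf = np.cumsum(hist).astype(np.float64)
    cdf /= cdf[-1]

    # Map every value through the CDF, interpolating between the
    # right-hand bin edges so the mapping is monotonic.
    equalized = np.interp(spec.ravel(), bin_edges[1:], cdf)
    return equalized.reshape(spec.shape)


if __name__ == "__main__":
    # Synthetic stand-in for a log-magnitude spectrogram (assumed shape:
    # frequency bins x time frames); real input would come from an STFT.
    rng = np.random.default_rng(0)
    fake_spec = rng.normal(loc=-40.0, scale=10.0, size=(257, 100))
    out = equalize_spectrogram(fake_spec)
    print(out.shape, out.min(), out.max())
```

Because the mapping is monotonic, the relative ordering of time-frequency bins is preserved while low-contrast regions are stretched, which is the usual motivation for equalizing a spectrogram before feeding it to a network.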