Three channel pathological speech recognition based on LMD improved feature extraction
Affiliation:

School of Information and Communication Engineering, North University of China, Taiyuan 030024, China

CLC number:

TN912.34; R741

Fund project:

Supported by the Fundamental Research Program of Shanxi Province (202203021221103)

    Abstract:

    Patients with dysphonia articulate less clearly and accurately, which lowers the pathological speech recognition rate. To address this problem, an improved Gammatone filter bank map feature extraction algorithm based on local mean decomposition (LMD) is proposed for three-channel pathological speech recognition. First, the algorithm decomposes the speech signal with LMD, applies a short-time Fourier transform to each decomposed component, and performs frequency synthesis. It then extracts the filter bank features together with their first-order and second-order difference features, forming LMD-GFbank map features that capture effective local features of pathological speech. Second, to reduce the loss of effective feature information by the network model during training, a three-channel pathological speech recognition model is proposed. Finally, the recognition model is trained and tested on the extracted speech features. The experimental results show that LMD-GFbank map features reach a recognition rate of 93.36% on the three-channel pathological speech recognition model, outperforming the traditional MFCC, GFCC, and Fbank features. This verifies that the proposed algorithm and recognition model improve the accuracy of pathological speech recognition.
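
The difference-feature step described in the abstract can be illustrated with a minimal sketch. It assumes the regression-based delta formula commonly used for speech features, and it assumes that the static, first-order, and second-order features are stacked as three channels; the abstract states neither detail. The function names, the window width of 2, and the random 64 x 100 placeholder matrix (standing in for real filter bank features) are hypothetical.

```python
import numpy as np


def delta(features: np.ndarray, width: int = 2) -> np.ndarray:
    """Regression-based difference along the frame axis (axis 1).

    features: array of shape (n_bands, n_frames).
    width: number of frames on each side used in the regression (assumed value).
    """
    n_frames = features.shape[1]
    # Repeat the edge frames so that every frame has `width` neighbours.
    padded = np.pad(features, ((0, 0), (width, width)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[:, width + n : width + n + n_frames]
        behind = padded[:, width - n : width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom


def stack_with_deltas(static: np.ndarray) -> np.ndarray:
    """Stack static, first-order, and second-order features as 3 channels."""
    d1 = delta(static)
    d2 = delta(d1)  # second-order difference = delta of the delta
    return np.stack([static, d1, d2], axis=0)


# Placeholder input: 64 bands x 100 frames of random values, NOT real features.
rng = np.random.default_rng(0)
static = rng.random((64, 100))
feature_map = stack_with_deltas(static)
print(feature_map.shape)  # (3, 64, 100)
```

In this sketch the output has one channel per feature order, so it can be passed to an image-style network input; whether the paper arranges its LMD-GFbank features this way is not confirmed by the abstract.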

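Cite this article

Zhang Nan, Chen Yuanyuan, Chen Xinyu, Hou Yitao. Three channel pathological speech recognition based on LMD improved feature extraction [J]. Electronic Measurement Technology, 2024, 47(12): 140-147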

History
  • Published online: 2024-11-04