Power dispatching speech recognition based on double dictionary class label language model
Affiliation:

1. Beijing FibrLink Communications Co., Ltd., Beijing 100070, China; 2. Department of Electronic and Communication Engineering, North China Electric Power University, Baoding 071000, China

CLC number:

TP391

Fund Project:

National Natural Science Foundation of China (61771195)

    Abstract:

    The quality of the language model directly affects the recognition accuracy of a power dispatching speech recognition system. To improve that accuracy, this paper proposes a class label language model based on double dictionaries (a general dictionary and a power dispatching domain dictionary). The model builds on the n-gram language model and adds class label information. A joint word segmentation and part-of-speech tagging system based on the same double dictionaries is also proposed; it segments the corpus and annotates it with class labels, which improves the adaptability of the class label language model to power dispatching language. The proposed language model is compared with commonly used statistical language models on a collected set of power dispatching commands, and the joint system is compared with other word segmentation and part-of-speech tagging systems. The simulation results show that the joint system segments and tags more efficiently. The class label language model, which takes semantic information, dictionaries, and the word segmentation and part-of-speech tagging system into account, achieves higher accuracy in power dispatching speech recognition, with a word error rate of only 4.14%.
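    The abstract reports its headline result as a word error rate. As a reminder of how that metric is conventionally computed, here is a minimal sketch: the token-level edit distance between the reference transcript and the recognizer output, divided by the reference length. This is the standard definition, not code from the paper; the function name `word_error_rate` and the example command tokens are hypothetical and are not taken from the authors' dispatching command set.

```python
def word_error_rate(reference, hypothesis):
    """Token-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hypothesis[:j].
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from output
                curr[j - 1] + 1,     # insertion: extra token in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical, already segmented command (illustration only).
ref = ["open", "breaker", "number", "two", "zero", "one"]
hyp = ["open", "breaker", "number", "two", "one"]
wer = word_error_rate(ref, hyp)  # one deletion over six reference words
```

    For Chinese transcripts the token lists would come from a word segmenter, so the score depends on the segmentation used, which is one reason the segmentation system matters for evaluation as well as for training.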

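Cite this article

Zhao Qing, Li Tingrui, Luo Rui, Li Rui, Han Tianyu, Han Dongsheng. Power dispatching speech recognition based on double dictionary class label language model [J]. Electronic Measurement Technology, 2021, 44(13): 121-126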

History
  • Online publication date: 2024-09-05