Stock text sentiment analysis based on emotion dictionary and LDA model

Affiliation: School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China

CLC number: TN9

    Abstract:

    This paper builds a model that analyzes the sentiment orientation of stock texts using a stock sentiment dictionary and LDA (latent Dirichlet allocation). To address two problems in stock text sentiment analysis, incomplete sentiment dictionaries and one-sided sentence analysis, it constructs a relatively comprehensive stock sentiment dictionary and analyzes sentiment from three aspects of each sentence: tendency, degree, and relevance. The dictionary is built by introducing stock-specific words, degree words, and turning (adversative) words. Sentiment words are then extracted from the sentences of the preprocessed stock texts. A sentence-level algorithm computes tendency and degree vectors for each sentence, and these sentence vectors are classified with a support vector machine (SVM) and the K-means algorithm. LDA is applied to the sentiment words to compute the document-topic and topic-word probability distributions, from which the relevance of each sentence is obtained. The tendency, degree, and relevance of each sentence are combined to compute its sentiment, and the sentence sentiments are then used to obtain the proportion of each sentiment orientation in the stock text. In experiments on stock texts collected from the economy section of Baidu News, with comparison against other algorithms, the model raises classification accuracy to 82.78% for sentences and 84.14% for texts.
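The dictionary-based part of the pipeline can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy English lexicons, the weights, the rule that a degree word scales the next sentiment word, and the rule that words after a turning word count double. The abstract does not give the paper's actual dictionary or sentence algorithm. The sketch also swaps in a plain sign rule where the paper uses SVM and K-means classification, and it leaves out the LDA-based relevance term.

```python
from collections import Counter

# Hypothetical toy lexicons (not the paper's dictionary).
POLARITY = {"rise": 1.0, "gain": 1.0, "rally": 1.0,
            "fall": -1.0, "loss": -1.0, "plunge": -1.0}
DEGREE = {"slightly": 0.5, "very": 1.5, "sharply": 2.0}
TURNING = {"but", "however"}
TURN_WEIGHT = 2.0  # assumed emphasis on the clause after a turning word


def score_sentence(tokens):
    """Return (tendency, degree) for one tokenized sentence."""
    tendency = 0.0
    degree = 0.0
    clause_weight = 1.0  # becomes TURN_WEIGHT after a turning word
    modifier = 1.0       # product of degree words seen since the last sentiment word
    for tok in tokens:
        if tok in TURNING:
            clause_weight = TURN_WEIGHT
            modifier = 1.0
        elif tok in DEGREE:
            modifier *= DEGREE[tok]
        elif tok in POLARITY:
            strength = modifier * clause_weight
            tendency += POLARITY[tok] * strength
            degree += strength
            modifier = 1.0
    return tendency, degree


def label(tendency):
    """Sign rule standing in for the SVM / K-means classification step."""
    if tendency > 0:
        return "positive"
    if tendency < 0:
        return "negative"
    return "neutral"


def text_proportions(sentences):
    """Share of positive, negative, and neutral sentences in one text."""
    counts = Counter(label(score_sentence(s)[0]) for s in sentences)
    total = len(sentences)
    if total == 0:
        return {"positive": 0.0, "negative": 0.0, "neutral": 0.0}
    return {k: counts[k] / total for k in ("positive", "negative", "neutral")}


text = [
    ["shares", "sharply", "rise"],
    ["profits", "gain", "but", "shares", "sharply", "fall"],
    ["market", "opens", "today"],
]
print(text_proportions(text))
```

In a fuller version, each sentence's score would also be weighted by a relevance value derived from the LDA distributions before the proportions are computed; how the paper combines the three quantities is not stated in the abstract.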

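Cite this article

Yan Feng, Du Tengfei, Mao Jianhua, Liu Xuefeng (延丰, 杜腾飞, 毛建华, 刘学锋). Stock text sentiment analysis based on emotion dictionary and LDA model [J]. Electronic Measurement Technology (电子测量技术), 2017, 40(12): 82-87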

History
  • Published online: 2018-01-30