Extreme learning machine based traffic classification with significant improvement on accuracy and robustness

Affiliation:

Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Shanghai University, Shanghai 200072, China

CLC number:

TN915

Fund Project:

Key Program of the National Natural Science Foundation of China (61420106011); Shanghai Key Discipline project (15511105400)

    Abstract:

    Traffic classification is an important tool that lets network administrators monitor traffic flows and manage the network effectively, so classifying traffic accurately is of great significance. Two key criteria for judging a traffic classifier are its accuracy and its efficiency. This paper proposes an accurate and robust traffic classification scheme that, for the first time, introduces Extreme Learning Machine (ELM), a machine learning algorithm proposed in recent years, into network traffic classification and optimizes it for this task. The paper also proposes a way to generate the classifier's training dataset automatically, which makes the system more adaptable and scalable to new network applications. VoIP and WWW traffic are used as the two classification categories. Experimental results show that the scheme is more accurate, stable and robust than C4.5, Random Forest, Naïve Bayes and KNN, which are proposed in other literature. When the test dataset is collected on the same day as the training dataset, after it, the proposed classifier reaches 93% accuracy and the other algorithms achieve similar accuracy. When the test dataset is collected one month and two months after the training dataset, the proposed classifier still maintains 85% accuracy, whereas the other algorithms reach only about 60%, a deviation too large for use in a practical traffic classification system. The proposed scheme is therefore accurate, robust and scalable, and can be applied in traffic classification practice.
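For readers unfamiliar with ELM, the general technique can be sketched as follows: the hidden-layer weights are drawn at random and left untrained, and only the output weights are fitted, in closed form, by least squares. This is a minimal sketch of a standard ELM, not the authors' optimized classifier. The class name `ELMClassifier`, the helper `make_toy_flows`, the two flow features, the hidden-layer size and the synthetic data are all assumptions made for illustration; the abstract does not specify the paper's features or parameters.

```python
import numpy as np


class ELMClassifier:
    """Basic single-hidden-layer ELM (illustrative, not the paper's tuned variant)."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the fixed random projection.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot targets, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Hidden-layer parameters are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: minimum-norm least-squares solution beta = pinv(H) @ T.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return self.classes_[np.argmax(H @ self.beta, axis=1)]


def make_toy_flows(n_per_class=200, seed=1):
    """Synthetic stand-in for flow features; NOT the paper's data or feature set."""
    rng = np.random.default_rng(seed)
    # Hypothetical standardized features: [mean packet size, mean inter-arrival time].
    voip = rng.normal(loc=[-1.0, -1.0], scale=0.5, size=(n_per_class, 2))
    www = rng.normal(loc=[1.0, 1.0], scale=0.5, size=(n_per_class, 2))
    X = np.vstack([voip, www])
    y = np.array(["VoIP"] * n_per_class + ["WWW"] * n_per_class)
    return X, y


# Train on one synthetic sample and evaluate on an independently drawn one.
X_train, y_train = make_toy_flows(seed=1)
X_test, y_test = make_toy_flows(seed=2)
clf = ELMClassifier(n_hidden=50, seed=0).fit(X_train, y_train)
accuracy = float(np.mean(clf.predict(X_test) == y_test))
print(f"toy accuracy: {accuracy:.2f}")
```

Because training reduces to one pseudoinverse, fitting is fast compared with iteratively trained networks. The accuracy printed here reflects only the well-separated synthetic data and says nothing about the 93% and 85% figures reported in the abstract.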

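Cite this article

Shi Yan, Chen Rongrong, Liu Yafan, Dun Han. Extreme learning machine based traffic classification with significant improvement on accuracy and robustness [J]. Electronic Measurement Technology, 2016, 39(8): 53-57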

  • Published online: 2016-09-23