A Lossless Data Compression Method Based on Deep Learning and Its Application to Large-Scale Well Logging Data Storage

Affiliation:

1. CNPC Logging Co., Ltd., Changqing Branch, Xi'an, Shaanxi 710201, China 2. CNPC Logging Co., Ltd., Liaohe Branch, Panjin, Liaoning 124000, China

CLC number:

TP391.41

Abstract:

Archiving massive volumes of well logging data runs into limits on storage hardware and database capacity. To address this, this paper proposes a lossless data compression method based on deep learning. A recurrent neural network (RNN) serves as a probability predictor that outputs the conditional probability distribution of the data stream; an adaptive arithmetic encoder combines this distribution with the current byte value to compress the stream. For decompression, the saved RNN weights and an arithmetic decoder reconstruct the data stream. In practical compression tests, the method improves the compression ratio over traditional lossless compression methods by about 23% on average for one-dimensional logging data and by about 21% on average for two-dimensional array logging data. Building on the compression method, the paper also proposes a way to construct a large-scale well logging storage database based on a multidimensional feature index query tree structure; for multi-condition combined queries, retrieval efficiency improves by about 45% on average over traditional database query methods. The results show that the method effectively reduces the storage space required for logging data and shortens retrieval time, providing a technical basis for storing and using large-scale logging data and saving hardware and labor costs in data archiving.
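The predictor-plus-arithmetic-coder pipeline described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the RNN is replaced by a simple adaptive byte-count model (an assumption made to keep the example self-contained), exact rational arithmetic stands in for a practical fixed-precision coder, and the names `AdaptiveByteModel`, `encode`, and `decode` are hypothetical. What it does show is the key property the method relies on: the decoder rebuilds the same probability model step by step, so the byte stream is recovered exactly.

```python
import math
from fractions import Fraction


class AdaptiveByteModel:
    """Stand-in predictor (NOT the paper's RNN): adaptive byte counts, add-one smoothed."""

    def __init__(self):
        self.counts = [1] * 256   # every byte value starts with count 1
        self.total = 256

    def interval(self, symbol):
        """Return (cumulative probability below symbol, probability of symbol)."""
        below = sum(self.counts[:symbol])
        return Fraction(below, self.total), Fraction(self.counts[symbol], self.total)

    def update(self, symbol):
        """Adapt the model after a symbol has been coded."""
        self.counts[symbol] += 1
        self.total += 1


def encode(data: bytes):
    """Narrow [low, low + width) once per byte; return (k, nbits) with k / 2**nbits inside it."""
    model = AdaptiveByteModel()
    low, width = Fraction(0), Fraction(1)
    for b in data:
        cum, p = model.interval(b)
        low += width * cum
        width *= p
        model.update(b)
    nbits = 0
    while True:                   # search for the shortest dyadic fraction in the interval
        scale = 1 << nbits
        k = math.ceil(low * scale)
        if Fraction(k, scale) < low + width:
            return k, nbits
        nbits += 1


def decode(k: int, nbits: int, n: int) -> bytes:
    """Replay the same model updates to recover n bytes from the code value."""
    model = AdaptiveByteModel()
    value = Fraction(k, 1 << nbits)
    low, width = Fraction(0), Fraction(1)
    out = bytearray()
    for _ in range(n):
        scaled = (value - low) / width * model.total   # position within the count table
        cum = 0
        for s in range(256):      # find s with cum <= scaled < cum + counts[s]
            if cum + model.counts[s] > scaled:
                break
            cum += model.counts[s]
        low += width * Fraction(cum, model.total)
        width *= Fraction(model.counts[s], model.total)
        model.update(s)
        out.append(s)
    return bytes(out)


sample = b"GR,DEN,CNL;" * 20      # hypothetical repetitive byte stream
code, bits = encode(sample)
print(len(sample) * 8, "input bits ->", bits, "coded bits")
print("round trip ok:", decode(code, bits, len(sample)) == sample)
```

In this sketch the original length `n` is passed to the decoder separately; a real container format would store it alongside the code. A stronger predictor, such as the RNN the paper uses, would assign higher probability to the byte that actually occurs, which shrinks the interval more slowly and therefore yields fewer output bits.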

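Cite this article

Chen Jianhua, Gao Hu, Su Zhijian, Fan Ju. A Lossless Data Compression Method Based on Deep Learning and Its Application to Large-Scale Well Logging Data Storage [J]. Electronic Measurement Technology, 2021, 44(5): 87-93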

History
  • Published online: 2024-10-24