VVC fast intra mode decision based on deep learning

Affiliation: College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China

CLC number: TP919.81

Fund project: National Natural Science Foundation of China (61902239)

    Abstract:

    The new-generation video compression standard (H.266/VVC) provides 67 intra prediction modes, which greatly improves coding efficiency but also brings very high computational complexity. This paper proposes a fast intra mode decision algorithm based on deep learning. First, because coding blocks differ in size and shape after partitioning, the extracted luma blocks are preprocessed, and random cropping, resampling, and convolutional neural network (CNN) upsampling are used to ensure block size and quality. A CNN architecture is then designed to reduce intra prediction complexity: it takes the current coding block, the adjacent reference blocks, and the residual block as network inputs and converts the rate-distortion decision process into a classification problem, which reduces unnecessary mode traversal. To train the proposed network, a mode decision dataset is built according to the characteristics of H.266. Experimental results show that, compared with VTM10.0, the proposed algorithm reduces encoding time by 39.56%~43.45% on average, effectively lowering the computational complexity of encoding while keeping rate-distortion performance essentially unchanged, and that its overall performance also improves on that of recent reference methods.
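
    The classification idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-neighbour resampling, the `top_k` cutoff, and the function names `resize_nearest` and `candidate_modes` are all assumptions made for illustration, and the 67 logits stand in for the output of a trained CNN that is not shown here. The sketch only shows how a classifier's output could replace a full traversal of the 67 intra modes with a short candidate list.

```python
import math

# VVC intra modes: planar (0), DC (1), and 65 angular modes (2-66).
NUM_INTRA_MODES = 67


def resize_nearest(block, size):
    """Resample a rectangular luma block (list of rows) to size x size.

    Nearest-neighbour sampling is an assumption for illustration; the
    paper combines random cropping, resampling, and CNN upsampling.
    """
    h, w = len(block), len(block[0])
    return [[block[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def candidate_modes(logits, top_k=3):
    """Return the top_k most probable intra modes, most probable first.

    `logits` is a hypothetical classifier output with one score per mode.
    Only the returned modes would then go through rate-distortion checks.
    """
    if len(logits) != NUM_INTRA_MODES:
        raise ValueError("expected one logit per intra mode")
    probs = softmax(logits)
    ranked = sorted(range(NUM_INTRA_MODES), key=lambda m: probs[m],
                    reverse=True)
    return ranked[:top_k]
```

    With `top_k=3`, an encoder following this sketch would evaluate the rate-distortion cost of three modes instead of all 67. The abstract does not state how many candidates the proposed method keeps, so the cutoff here is purely illustrative.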

Cite this article

施金诚, 杨静. VVC fast intra mode decision based on deep learning [J]. 电子测量技术 (Electronic Measurement Technology), 2022, 45(3): 104-111

History
  • Online publication date: 2024-06-14