Semantic segmentation of landslide remote sensing images based on an attention-improved RTformer
Affiliations: 1. College of Mining, Guizhou University; 2. College of Agriculture, Guizhou University
CLC number: TP751; TN20
Funding: Guizhou Provincial Science and Technology Program (Qiankehe Support [2022] General 204; Qiankehe Foundation-ZK [2024] General 093)

Abstract:

Existing semantic segmentation networks for landslides in remote sensing imagery suffer from large parameter counts, slow training, blurred recognition of landslide boundary regions, and inconsistent classification of multi-scale semantic information. To address these problems, this paper proposes an improved lightweight semantic segmentation model based on RTformer. An atrous (dilated) convolution ASPP module and an SE channel attention module are embedded between modules at different levels of the model. The ASPP module captures semantic information at different scales, and the SE module strengthens feature representation by computing the relationships between channels. Together they improve the model's feature extraction and make it better suited to landslide recognition in remote sensing imagery. Comparative experiments on the Cityscapes dataset were used to select the dilation rates of the atrous convolutions and the batch size. A self-supervised training task was designed with the Bijie landslide disaster dataset as the pre-training dataset, and the same dataset was used to fine-tune the model and to test its segmentation performance on landslide remote sensing images. The resulting model achieved the best performance on both the Cityscapes dataset and the Bijie landslide disaster dataset: compared with the original RTformer, the mean intersection over union (mIoU) increased by 2.26% and 4.34%, respectively. Compared with classical semantic segmentation models such as FCN, U-Net, DeeplabV3, and SegFormer, the improved model completed the recognition task with the fewest parameters and the fastest inference speed while achieving the best segmentation results.
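The results above are reported as mean intersection over union (mIoU). As a reminder of what that metric measures, the following is a minimal sketch in plain Python, not the authors' evaluation code. The function name `mean_iou`, the two-class labeling (0 for background, 1 for landslide), and the choice to skip classes absent from both maps are assumptions made for illustration; the abstract does not state the exact evaluation protocol.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Average the per-class IoU over classes present in pred or target.

    pred and target are 2D label maps of the same shape (hypothetical
    inputs); each entry is a class index in [0, num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel counts toward both intersection and union of its class.
                intersection[p] += 1
                union[p] += 1
            else:
                # A mismatch enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    # Assumption: classes that never occur in either map are left out.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x2 example: 0 = background, 1 = landslide (assumed labeling).
pred = [[0, 1], [1, 1]]
target = [[0, 1], [0, 1]]
score = mean_iou(pred, target, num_classes=2)
```

In this toy case the background IoU is 1/2 and the landslide IoU is 2/3, so `score` is their mean, 7/12. Averaging per class rather than per pixel keeps a small landslide region from being swamped by a large background.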

History

Received: 2024-08-17
Last revised: 2024-09-28
Accepted: 2024-10-16
