Resource allocation based on deep reinforcement learning in D2D communication

Authors: Shen Guoli, Li Jun, Li Zhengquan
Affiliations: 1. Nanjing University of Information Science & Technology, Nanjing 210044, China; 2. Wuxi University, Wuxi 214105, China; 3. Key Laboratory of Advanced Control of Light Industry Process, Ministry of Education, Jiangnan University, Wuxi 214122, China; 4. State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
CLC number: TN929.5
Funding: National Natural Science Foundation of China (61571108); Open Project of the State Key Laboratory of Network and Switching Technology (Beijing University of Posts and Telecommunications) (SKLNST-2020-1-13)


Abstract:

Device-to-device (D2D) communication builds on cellular infrastructure to improve resource utilization and user throughput and to save battery energy. In a D2D network, mode selection and resource allocation are the key issues. To improve the sum rate and spectrum efficiency of D2D communication, a scheme for joint mode selection, power allocation, and resource block allocation is proposed. First, a mode selection criterion is set according to the users' geographical locations to help each user choose the corresponding communication mode. Then, for the reuse mode, the asynchronous advantage actor-critic (A3C) algorithm, a deep reinforcement learning method, is used to allocate resource blocks and power to the different D2D users. Simulation results show that the proposed A3C-based joint optimization scheme converges quickly and performs better than the other algorithms compared.
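The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the mode selection criterion, the channel model, or the reward definition, so the distance threshold in `select_mode`, the Shannon-rate form of `sum_rate`, and every function and parameter name here are assumptions made for illustration. The only part taken from the standard A3C formulation is the advantage estimate, the discounted n-step return minus the critic's value.

```python
import math


def select_mode(pair_distance_m, reuse_threshold_m=50.0):
    """Hypothetical location-based rule (assumed, not from the paper):
    a D2D pair that is close enough reuses a cellular resource block,
    otherwise it falls back to communicating through the base station."""
    return "reuse" if pair_distance_m <= reuse_threshold_m else "cellular"


def sinr(tx_power_w, link_gain, interference_w, noise_w):
    """Signal-to-interference-plus-noise ratio of one link (linear scale)."""
    return tx_power_w * link_gain / (interference_w + noise_w)


def sum_rate(bandwidth_hz, sinrs):
    """Sum of Shannon rates B * log2(1 + SINR) over all links, in bit/s.
    Using this as the optimization target is an assumption."""
    return sum(bandwidth_hz * math.log2(1.0 + s) for s in sinrs)


def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """A3C-style advantage estimates A_t = R_t - V(s_t), where R_t is the
    discounted n-step return bootstrapped from the critic's last value."""
    advantages = []
    ret = bootstrap_value
    # Walk the rollout backwards so each return reuses the next one.
    for reward, value in zip(reversed(rewards), reversed(values)):
        ret = reward + gamma * ret
        advantages.append(ret - value)
    advantages.reverse()
    return advantages
```

In a full agent, each action would pick a resource block and a power level for a D2D pair in reuse mode, and the advantages would weight the policy-gradient update of the actor while the critic is regressed toward the n-step returns.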


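Cite this article

Shen Guoli, Li Jun, Li Zhengquan. Resource allocation based on deep reinforcement learning in D2D communication[J]. Electronic Measurement Technology, 2022, 45(24): 76-84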

Published online: 2024-03-08