Trajectory planning of multi-stage continuum robot based on reinforcement learning

2024, Vol. 47, No. 5, pp. 61-69

Authors: Liu Yicheng 1, Yang Jialing 1, Liang Bin 2, Chen Zhang 2
Affiliations: 1. College of Electrical Engineering, Sichuan University, Chengdu 610065, China; 2. Department of Automation, Tsinghua University, Beijing 100084, China
CLC number: TP242; TP399
Fund project: Supported by the Tsinghua University Horizontal Cooperation Project (HG2020153)



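Abstract:

To address trajectory planning for multi-segment continuum robots, a trajectory planning algorithm based on deep deterministic policy gradient (DDPG) reinforcement learning is proposed. First, under the piecewise constant curvature assumption, a forward kinematic model relating the joint angular velocities of the continuum robot to its end pose is established. Then, a reinforcement learning algorithm is applied: the current pose and target pose of the manipulator, among other information, form the state input, the joint angular velocities of the manipulator are the action output by the agent, and a suitably designed reward function guides the robot from its initial pose toward the target pose. Finally, a simulation system is built in MATLAB; the simulation results show that the reinforcement learning algorithm successfully plans trajectories for the multi-segment continuum robot and drives its end smoothly to the target pose.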
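As an illustration of the piecewise constant curvature idea, the following is a minimal Python sketch (the paper itself reports a MATLAB simulation, and its exact model is not given on this page). It assumes each segment is a circular arc described by a bending angle `theta`, a bending-plane angle `phi`, and a fixed arc length; the function names and this parameterization are assumptions made for the example, not the authors' formulation.

```python
import numpy as np


def rot_z(a):
    """Rotation about the z-axis by angle a (rad)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rot_y(a):
    """Rotation about the y-axis by angle a (rad)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def segment_transform(theta, phi, length):
    """Homogeneous transform of one constant-curvature arc (assumed model).

    theta: total bending angle, phi: direction of the bending plane,
    length: arc length of the segment.
    """
    if abs(theta) < 1e-9:
        # Straight segment: the arc degenerates to a line along z.
        local = np.array([0.0, 0.0, length])
    else:
        r = length / theta  # radius of curvature
        local = np.array([r * (1.0 - np.cos(theta)), 0.0, r * np.sin(theta)])
    T = np.eye(4)
    T[:3, :3] = rot_z(phi) @ rot_y(theta) @ rot_z(-phi)
    T[:3, 3] = rot_z(phi) @ local
    return T


def forward_kinematics(q, lengths):
    """Chain the segment transforms; q is a sequence of (theta, phi) pairs."""
    T = np.eye(4)
    for (theta, phi), length in zip(q, lengths):
        T = T @ segment_transform(theta, phi, length)
    return T


if __name__ == "__main__":
    # Two segments: the first bends 90 degrees, the second stays straight.
    T_end = forward_kinematics([(np.pi / 2, 0.0), (0.0, 0.0)], [np.pi / 2, 1.0])
    print("end position:", T_end[:3, 3])
```

The returned 4x4 matrix holds the end orientation in its upper-left 3x3 block and the end position in its last column, which is the kind of end pose the abstract feeds to the learning agent.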

Key words: continuum robot; trajectory planning; reinforcement learning; pose control; reward guidance
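The "reward guidance" idea can be sketched as follows, in Python rather than the MATLAB environment used in the paper. The abstract does not state the reward function, so everything here is an assumption for illustration: the weights `w_pos`, `w_ori`, `w_act`, the tolerances, the terminal `bonus`, and the helper names are all hypothetical. The sketch penalizes position and orientation error and large joint angular velocities, and integrates a velocity action into joint angles.

```python
import numpy as np


def orientation_error(R_current, R_target):
    """Geodesic angle (rad) between two rotation matrices."""
    cos_angle = (np.trace(R_current.T @ R_target) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))


def shaped_reward(p, R, p_goal, R_goal, action,
                  w_pos=1.0, w_ori=0.5, w_act=0.01,
                  pos_tol=0.01, ori_tol=0.05, bonus=10.0):
    """Hypothetical reward: negative weighted pose error plus a goal bonus.

    p, p_goal: end positions (3,); R, R_goal: end orientations (3, 3);
    action: joint angular velocities chosen by the agent.
    Returns (reward, done).
    """
    pos_err = float(np.linalg.norm(np.asarray(p_goal) - np.asarray(p)))
    ori_err = orientation_error(R, R_goal)
    action = np.asarray(action, dtype=float)
    reward = -w_pos * pos_err - w_ori * ori_err - w_act * float(action @ action)
    done = pos_err < pos_tol and ori_err < ori_tol
    if done:
        reward += bonus  # extra reward once the target pose is reached
    return reward, done


def apply_action(q, q_dot, dt, q_dot_max):
    """Integrate a joint angular velocity action for one time step.

    The velocity is clipped to +/- q_dot_max before integration.
    """
    q_dot = np.clip(np.asarray(q_dot, dtype=float), -q_dot_max, q_dot_max)
    return np.asarray(q, dtype=float) + q_dot * dt


if __name__ == "__main__":
    q_next = apply_action([0.0, 0.0], [0.3, -0.2], dt=0.1, q_dot_max=1.0)
    r, done = shaped_reward(np.zeros(3), np.eye(3),
                            np.array([0.1, 0.0, 0.0]), np.eye(3),
                            action=[0.3, -0.2])
    print(q_next, r, done)
```

In a DDPG setup, the actor network would output `action`, and `shaped_reward` would be evaluated on the pose obtained after `apply_action`; the penalty on the action magnitude is one common way to encourage smooth motion, not necessarily the one the authors chose.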


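Cite this article

Liu Yicheng, Yang Jialing, Liang Bin, Chen Zhang. Trajectory planning of multi-stage continuum robot based on reinforcement learning[J]. Electronic Measurement Technology, 2024, 47(5): 61-69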


History

Published online: 2024-06-05
