Computer Integrated Manufacturing Systems ›› 2024, Vol. 30 ›› Issue (12): 4282-4291. DOI: 10.13196/j.cims.2023.0369


Collaborative obstacle avoidance trajectory planning for mobile robotic arms based on artificial potential field DDPG algorithm

LI Yong,ZHANG Chaoxing+,CHAI Liaoning   

  1. Key Laboratory of Industrial Internet of Things and Network Control,Ministry of Education,Chongqing University of Posts and Telecommunications
  • Online:2024-12-31 Published:2025-01-06
  • Supported by:
    Project supported by the National Key R&D Program of China (No. 2020YFB1708800).

  • About the authors:
    LI Yong (1976-), male, born in Chongzhou, Sichuan, associate professor, Ph.D. His research interests include intelligent manufacturing systems (industrial robots) and deep learning. E-mail: liyong@cqupt.edu.cn;

    +ZHANG Chaoxing (1998-), male, born in Chengdu, Sichuan, master's student and corresponding author. His research interests include deep reinforcement learning and robotics. E-mail: 847256511@qq.com;

    CHAI Liaoning (1997-), male, born in Gongyi, Henan, master's student. His research interest is mobile robot technology. E-mail: 1594841412@qq.com.

Abstract: To improve the obstacle avoidance trajectory planning capability of mobile robotic arms in narrow channels and under obstacle constraints, an improved algorithm named APF-DDPG was proposed by combining the Artificial Potential Field (APF) method with the Deep Deterministic Policy Gradient (DDPG) algorithm. APF planning was first designed for the robotic arm to obtain an approximate pose, and the planning problem was formulated as a Markov decision process. The state space, action space, and reward-penalty functions were designed, and the planning process was analyzed and handled in phases. A guidance mechanism was designed to transition between the control phases: the obstacle avoidance phase of training was dominated by DDPG, while in the goal planning phase the approximate pose guided the DDPG training, and the policy model used for planning was obtained from this training. Finally, simulation experiments with fixed and random state scenarios were designed to verify the effectiveness of the proposed algorithm. The experimental results showed that, compared with the traditional DDPG algorithm, the APF-DDPG algorithm could be trained with higher convergence efficiency to obtain a policy model with more efficient control performance.
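The paper itself does not publish its implementation; as a rough illustration of the phase-guided idea described above, the sketch below combines a classic APF force with a placeholder policy action. All names, gains (`k_att`, `k_rep`, `rho0`), the 0.5/0.5 blend, and the 2-D configuration space are assumptions for illustration, not the authors' actual design.

```python
import numpy as np

def apf_force(q, q_goal, obstacles, k_att=1.0, k_rep=0.5, rho0=0.4):
    """Classic artificial potential field: attractive pull toward the
    goal plus repulsive push away from obstacles closer than rho0."""
    f = k_att * (q_goal - q)                          # attractive term
    for q_obs in obstacles:
        d = np.linalg.norm(q - q_obs)
        if 0.0 < d < rho0:                            # repulsion only near obstacles
            f += k_rep * (1.0 / d - 1.0 / rho0) / d**2 * (q - q_obs) / d
    return f

def guided_action(q, q_goal, obstacles, ddpg_action, rho0=0.4):
    """Phase-dependent guidance: near obstacles the learned DDPG action
    dominates; otherwise the APF direction (the 'approximate pose')
    guides the action toward the goal."""
    near_obstacle = any(np.linalg.norm(q - o) < rho0 for o in obstacles)
    if near_obstacle:
        return ddpg_action                            # obstacle avoidance phase
    f = apf_force(q, q_goal, obstacles)
    apf_action = f / (np.linalg.norm(f) + 1e-8)       # unit step toward goal
    return 0.5 * apf_action + 0.5 * ddpg_action       # goal planning phase

# Toy 2-D configuration-space example (hypothetical values)
q = np.array([0.0, 0.0])
goal = np.array([1.0, 1.0])
obstacles = [np.array([0.5, 0.55])]
action = guided_action(q, goal, obstacles, ddpg_action=np.zeros(2))
```

In a full implementation the `ddpg_action` would come from the trained actor network, and the phase switch and blending weights would follow the reward-penalty design described in the paper rather than the fixed values used here.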

Key words: mobile robotic arm, obstacle avoidance trajectory planning, artificial potential field, deep deterministic policy gradient, guided training

