Computer Integrated Manufacturing System, 2025, Vol. 31, Issue (3): 998-1013. DOI: 10.13196/j.cims.2022.0745


Visual servo intelligent control method for robot arms based on deep reinforcement learning

YUAN Qingni1, QI Jianyou1+, YU Hongjian2

  1. Key Laboratory of Modern Manufacturing Technology, Ministry of Education, Guizhou University
  2. Longrise Technology Co., Ltd., Innovation Division
  • Online: 2025-03-31 Published: 2025-04-03
  • Supported by:
    Project supported by the National Natural Science Foundation, China (No. 52165063, 52065010), and the Science Foundation of Guizhou Province, China (No. [2022]K024, [2023]G094, [2023]G125).

Intelligent visual servo control of robot arms based on deep reinforcement learning

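YUAN Qingni (袁庆霓)1, QI Jianyou (齐建友)1+, YU Hongjian (虞宏建)2

  1. Key Laboratory of Modern Manufacturing Technology, Ministry of Education, Guizhou University
  2. Innovation Division, Longrise Technology Co., Ltd.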
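  • About the authors:
    YUAN Qingni (1976-), female, native of Guiyang, Guizhou; professor, Ph.D., doctoral supervisor; research interests: intelligent robots, evolutionary algorithms and digital manufacturing; E-mail: qnyuan@gzu.edu.cn;

    +QI Jianyou (1995-), male, native of Shangrao, Jiangxi; master's student; research interests: robot motion planning and intelligent control; corresponding author; E-mail: scienceqjy@163.com;

    YU Hongjian (1996-), male, native of Shangrao, Jiangxi; software engineer, bachelor's degree; research interests: artificial intelligence research and software design; E-mail: 15180401120@163.com.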
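  • Funding:
    Projects supported by the National Natural Science Foundation of China (52165063, 52065010) and by the Guizhou Provincial Department of Science and Technology ([2022] Key 024, [2022] General 140, [2023] General 094, [2023] General 125).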

Abstract: To address the low servo accuracy, poor stability and lack of visibility constraints of visual servo control systems, a multi-strategy fusion visual servo control method with adaptive gain based on deep reinforcement learning was proposed. A robot arm visual servo system in an Eye-in-Hand (EIH) configuration was built, and an Image-based Visual Servo (IBVS) controller integrating Sliding Mode Control (SMC) and classical proportional control, named SMCC-IBVS, was designed. To address feature loss caused by the limited field of view, the servo gain selection process was formulated as a Markov Decision Process (MDP) model, and on this basis an adaptive servo gain algorithm based on Deep Deterministic Policy Gradient (DDPG) was designed. Simulation and physical experiments were conducted on a robotic arm. The results showed that the proposed method achieved fast positioning without feature loss. Compared with Dyna-Q learning IBVS, positioning accuracy and stability were greatly improved, and the servo control time was within 5 seconds. To verify the practicability of the method, an assembly experiment was carried out on shaft-hole parts with a minimum clearance of 0.2 mm, and the assembly success rate reached 99%.

Key words: visual servo, deep deterministic policy gradient learning strategy, adaptive gain, robot arm, hybrid sliding mode control, visibility constraints

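Abstract (Chinese version): To address the low servo accuracy, slow convergence and lack of visibility constraints in visual servo control systems, a method that uses deep reinforcement learning to adaptively tune the servo gain of a multi-strategy controller is proposed for intelligent control of robot arms. First, a robot arm visual servo system in an Eye-in-Hand (EIH) configuration is built. Then, proportional control and Sliding Mode Control (SMC) are combined to design an image-based visual servo controller (SMCC-IBVS). To address feature loss in the control system, the servo gain selection process is formulated as a Markov Decision Process (MDP) model; on this basis, an adaptive servo gain algorithm based on Deep Deterministic Policy Gradient (DDPG) is designed, so that deep reinforcement learning adaptively tunes the servo gain of the SMCC-IBVS controller, reducing servo error and improving efficiency and stability. Finally, simulation and physical experiments show that the SMCC-IBVS controller with DDPG-tuned gain is strongly robust, converges quickly, and largely avoids feature loss. Shaft-hole assembly experiments on the robot arm further show that the proposed visual servo system is practical: for clearance-fit shaft-hole assembly with a minimum clearance of 0.2 mm, the success rate reaches 99%.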

Key words (Chinese version): visual servo, DDPG learning strategy, adaptive gain, robot arm, hybrid sliding mode control, visibility constraints
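
The abstract names the controller's ingredients (image-based visual servoing, a proportional term, a sliding-mode term, and a learned gain) but does not give the control law. As a rough illustration only, the sketch below combines the textbook IBVS proportional law v = -λ L⁺ e for point features with a saturated switching term. The switching form, the function names, and the parameters `k_smc` and `phi` are assumptions made for this example and are not taken from the paper; `lam` is passed in by hand here, whereas the paper reports that a DDPG agent selects the gain.

```python
import numpy as np

def interaction_matrix(points, depths):
    """Stack the standard 2x6 interaction matrix of each normalized image point."""
    rows = []
    for (x, y), Z in zip(points, depths):
        rows.append([-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y])
        rows.append([0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x])
    return np.array(rows)

def hybrid_ibvs_velocity(s, s_star, depths, lam, k_smc=0.0, phi=0.05):
    """Camera velocity (vx, vy, vz, wx, wy, wz) from current and desired features.

    s, s_star: (N, 2) arrays of normalized image coordinates.
    lam:       proportional servo gain (hand-picked here, learned in the paper).
    k_smc/phi: hypothetical switching gain and boundary-layer width (assumed).
    """
    s = np.asarray(s, dtype=float)
    s_star = np.asarray(s_star, dtype=float)
    e = (s - s_star).reshape(-1)           # feature error [x1, y1, x2, y2, ...]
    L = interaction_matrix(s, depths)      # shape (2N, 6)
    # Saturated sign function: a common way to soften sliding-mode chattering.
    switching = k_smc * np.clip(e / phi, -1.0, 1.0)
    return -np.linalg.pinv(L) @ (lam * e + switching)

# Four corner features, all shifted slightly from their desired positions.
s_star = np.array([[-0.1, -0.1], [0.1, -0.1], [0.1, 0.1], [-0.1, 0.1]])
s = s_star + 0.05
v = hybrid_ibvs_velocity(s, s_star, depths=[1.0] * 4, lam=0.5, k_smc=0.1)
print(v)
```

A larger `lam` speeds up convergence but produces larger camera motions that can push features out of the field of view, which is the trade-off an adaptive gain is meant to manage.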
