Computer Integrated Manufacturing System, 2025, Vol. 31, Issue (11): 4071-4084. DOI: 10.13196/j.cims.2024.0365

Multi-robot conflict-free path planning method for intelligent manufacturing-oriented part-picking

YUAN Ruiping1,2, FU Zhijia1,2, LI Juntao1,2+, WANG Wei3, JIANG Yingfan1,2

    1. School of Computer Science and Artificial Intelligence, Beijing Wuzi University
    2. Beijing Municipal Key Laboratory of Intelligent Logistics System
    3. Xi'an Branch of Shentong Express Co., Ltd.
  • Online: 2025-11-30 Published: 2025-12-04
  • Supported by:
    Project supported by the National Natural Science Foundation, China (No. 72101033), the Key Project of Science and Technology Plan of Beijing Municipal Education Commission, China (No. KZ202210037046), and the Excellent Science and Technology Innovation Team Project of Tongzhou District, China (No. CXTD2023010).

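  • About the authors:
    YUAN Ruiping (1982-), female, from Heze, Shandong; professor, Ph.D., doctoral supervisor; research interest: intelligent logistics systems; E-mail: angelholyping@163.com;

    FU Zhijia (1998-), male, from Chongqing; master's degree; research interest: deep reinforcement learning algorithms; E-mail: fuzj023@163.com;

    +LI Juntao (1978-), male, from Kaifeng, Henan; professor, Ph.D.; research interest: robotics and intelligent manufacturing; corresponding author; E-mail: ljtletter@126.com;

    WANG Wei (1996-), male, from Xi'an, Shaanxi; master's degree; research interest: intelligent scheduling; E-mail: wletterw@163.com;

    JIANG Yingfan (1999-), female, from Fuxin, Liaoning; Ph.D. candidate; research interests: intelligent scheduling and optimization algorithms; E-mail: jyfnano666@163.com.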

Abstract: In part-picking operations for intelligent manufacturing, deploying multiple robots in complex, dynamic environments significantly increases the likelihood of congestion and conflicts. Traditional path planning methods are inefficient at perceiving dynamic environmental changes and resolving path conflicts. To address this problem, a conflict-free multi-robot path planning method based on Deep Reinforcement Learning (DRL) was proposed. First, the robot path planning process was modeled as a partially observable Markov decision process, and potential energy guidance and area density rewards were incorporated into the reward function to reduce congestion and conflicts. Then, an improved Multi-Agent Advantage Actor-Critic (MAA2C) path planning algorithm was proposed, which introduced an attention mechanism to strengthen information transfer and sharing among robots and thereby improve algorithmic efficiency. Simulation experiments on part picking in an intelligent automobile assembly scenario showed that, compared with other path planning algorithms, the proposed algorithm converged fastest and required the shortest total time to complete all picking orders, which verified the superiority of the proposed model and algorithm for multi-robot path planning in intelligent manufacturing part-picking scenarios.

Key words: intelligent manufacturing, part-picking, deep reinforcement learning, multi-robot path planning, attention mechanism
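
Illustrative note: the abstract names two reward terms, potential energy guidance and an area density reward, but does not give their formulas. The sketch below is therefore only one plausible reading, not the authors' reward function. It assumes a grid map, a potential equal to the negative Manhattan distance to the goal, and a density measured over a square window around the robot. All names and parameters (`potential`, `area_density`, `step_reward`, `radius`, `density_weight`, `step_cost`) are hypothetical.

```python
from typing import Iterable, Tuple

Cell = Tuple[int, int]  # (x, y) grid coordinates; the grid map is an assumption


def potential(pos: Cell, goal: Cell) -> float:
    """Assumed potential: negative Manhattan distance, so closer means higher."""
    return -float(abs(pos[0] - goal[0]) + abs(pos[1] - goal[1]))


def area_density(pos: Cell, others: Iterable[Cell], radius: int = 1) -> float:
    """Fraction of cells in the square window around pos occupied by other robots."""
    window = (2 * radius + 1) ** 2 - 1  # neighboring cells, excluding pos itself
    occupied = sum(
        1
        for o in set(others)
        if o != pos and max(abs(o[0] - pos[0]), abs(o[1] - pos[1])) <= radius
    )
    return occupied / window


def step_reward(
    prev: Cell,
    new: Cell,
    goal: Cell,
    others: Iterable[Cell],
    gamma: float = 0.99,
    density_weight: float = 0.5,
    step_cost: float = 0.01,
) -> float:
    """Potential-based shaping term minus a crowding penalty and a per-step cost."""
    shaping = gamma * potential(new, goal) - potential(prev, goal)
    return shaping - density_weight * area_density(new, others) - step_cost
```

Under these assumptions, a move toward the goal earns a positive shaping term, and the same move into a crowded neighborhood earns less, which is the qualitative effect the abstract attributes to the two reward components. The weights would need tuning for any real picking layout.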

