Computer Integrated Manufacturing System ›› 2024, Vol. 30 ›› Issue (2): 708-716.DOI: 10.13196/j.cims.2021.0607


Multi-AGV motion planning based on deep reinforcement learning

SUN Hui,YUAN Wei   

  1. School of Mechanical Engineering,Southeast University
  • Online:2024-02-29 Published:2024-03-08
  • Supported by:
    2016 Intelligent Manufacturing Comprehensive Standardization Project of the Ministry of Industry and Information Technology, China (No. [2016]213).


Abstract: To solve the conflict-free motion planning problem for multiple Automated Guided Vehicles (AGVs) in mobile robot fulfillment systems, a Markov Decision Process (MDP) model was constructed, and a novel planning approach based on a Deep Q-Network (DQN) was proposed. With the AGVs' positions as inputs, the DQN was trained with the classical deep Q-learning algorithm and used to estimate the maximum expected cumulative reward obtainable from each action. Computational results on problem instances showed that the proposed approach effectively avoided potential collisions within the moving AGV fleet, enabling the fleet to accomplish all rack transportation tasks without conflicts. Furthermore, compared with an existing planning heuristic in the literature, the AGV motion plans generated by the proposed approach required shorter average makespans.
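The abstract does not specify the grid layout, reward values, or network architecture used in the paper. The following is only an illustrative sketch of the general technique it names (deep Q-learning over joint AGV positions): two AGVs on a small hypothetical grid, a joint action space, and a tiny numpy Q-network trained with an experience replay buffer and a periodically copied target network. It is not the authors' implementation.

```python
import random
import numpy as np

GRID, N_ACT = 4, 5                                  # 4x4 grid, 5 moves per AGV
MOVES = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]  # stay / right / left / down / up
STARTS, GOALS = [(0, 0), (3, 0)], [(3, 3), (0, 3)]  # crossing paths force conflicts

def reset():
    return list(STARTS)

def step(pos, joint_a):
    # Decode one joint action index into one move per AGV.
    acts = [joint_a // N_ACT, joint_a % N_ACT]
    new = [(min(max(p[0] + MOVES[a][0], 0), GRID - 1),
            min(max(p[1] + MOVES[a][1], 0), GRID - 1))
           for p, a in zip(pos, acts)]
    # Conflict: both AGVs in the same cell, or swapping cells.
    if new[0] == new[1] or (new[0] == pos[1] and new[1] == pos[0]):
        return pos, -5.0, False                     # bounce back, collision penalty
    done = all(n == g for n, g in zip(new, GOALS))
    return new, (20.0 if done else -1.0), done

def encode(pos):                                    # positions -> normalized input vector
    return np.array([c / (GRID - 1) for p in pos for c in p])

class QNet:
    """Two-layer MLP with hand-written gradients; maps state to 25 joint-action values."""
    def __init__(self, n_in, n_hid, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.5, (n_hid, n_in)); self.b1 = np.zeros(n_hid)
        self.W2 = rng.normal(0, 0.5, (n_out, n_hid)); self.b2 = np.zeros(n_out)
    def forward(self, s):
        h = np.maximum(self.W1 @ s + self.b1, 0.0)
        return self.W2 @ h + self.b2, h
    def update(self, s, a, target, lr=1e-2):
        q, h = self.forward(s)
        dq = np.zeros_like(q); dq[a] = 2.0 * (q[a] - target)   # grad of squared TD error
        dh = (self.W2.T @ dq) * (h > 0)
        self.W2 -= lr * np.outer(dq, h); self.b2 -= lr * dq
        self.W1 -= lr * np.outer(dh, s); self.b1 -= lr * dh
    def copy_from(self, other):
        for k in ("W1", "b1", "W2", "b2"):
            setattr(self, k, getattr(other, k).copy())

net, tgt = QNet(4, 64, 25), QNet(4, 64, 25)
tgt.copy_from(net)
buf, eps, gamma = [], 1.0, 0.95
random.seed(0)
for ep in range(300):
    pos, t = reset(), 0
    while t < 40:
        s = encode(pos)
        a = random.randrange(25) if random.random() < eps \
            else int(np.argmax(net.forward(s)[0]))
        new, r, done = step(pos, a)
        buf.append((s, a, r, encode(new), done)); buf = buf[-2000:]
        pos, t = new, t + 1
        # One-sample replay update toward the target network's bootstrap estimate.
        s_, a_, r_, s2_, d_ = random.choice(buf)
        target = r_ if d_ else r_ + gamma * float(np.max(tgt.forward(s2_)[0]))
        net.update(s_, a_, target)
        if done:
            break
    eps = max(0.05, eps * 0.99)                     # decay exploration
    if ep % 20 == 0:
        tgt.copy_from(net)                          # refresh target network
```

The bounce-back-with-penalty step function is one simple way to make collisions both illegal and costly, so the learned value function steers the fleet around the conflict the crossing start/goal pairs create.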

Key words: multi-automated guided vehicle, motion planning, Markov decision process, deep Q-network, deep Q-learning

