Computer Integrated Manufacturing System, 2023, Vol. 29, Issue 3: 789-800. DOI: 10.13196/j.cims.2023.03.009

Assembly task allocation of human-robot collaboration based on deep reinforcement learning

XIONG Zhihua1, CHEN Hao2, WANG Changsheng1, YUE Ming1, HOU Wenbin1,3+, XU Bin2

  1. School of Automotive Engineering, Dalian University of Technology
  2. BMW Brilliance Automotive Ltd.
  3. Ningbo Institute of Dalian University of Technology
  • Online: 2023-03-31 Published: 2023-04-18
  • Supported by:
    Project supported by the National Natural Science Foundation of China (No. 52072057).

Abstract: To cope with the increasingly complex task structures and high-dimensional task state spaces encountered in Human-Robot Collaboration (HRC) assembly task allocation, a task allocation method based on deep reinforcement learning is proposed. First, HRC assembly task allocation is formalized as a reinforcement learning problem: a 4-channel frame image is designed to represent the state of the task allocation environment, and a generalized simulation environment is constructed in the form of an assembly level-clearing (breakthrough) game. Second, to address the low exploration efficiency caused by the frequent episode restarts of the Deep Q-Network (DQN) algorithm, an archive mechanism and the improved algorithm built on it, Archive Double DQN (Archive DDQN), are proposed, and the procedure for allocating HRC assembly tasks through the interaction between Archive DDQN and the simulation environment is described. Finally, the effectiveness of the proposed method is verified through comparison experiments in two assembly simulation environments of different difficulty.

Key words: deep reinforcement learning, archive mechanism, human-robot collaboration, task allocation, production assembly
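
The abstract names two ingredients, Double DQN and an archive that avoids restarting every episode from the beginning, without giving their details. The following is only a minimal sketch under stated assumptions: `double_dqn_targets` implements the standard Double DQN target (the online network selects the next action, the target network evaluates it), while `StateArchive` is a hypothetical illustration of the archive idea (saving intermediate environment snapshots so that a failed episode can resume from one of them). The class name, its methods, the capacity policy, and the uniform sampling rule are all assumptions and are not taken from the paper.

```python
import random

import numpy as np


def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Standard Double DQN targets for a batch of transitions.

    q_online_next / q_target_next hold Q-values of the next states,
    shape (batch, n_actions), from the online and target networks.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)
    q_online_next = np.asarray(q_online_next, dtype=np.float64)
    q_target_next = np.asarray(q_target_next, dtype=np.float64)

    # Action selection uses the online network ...
    best_actions = np.argmax(q_online_next, axis=1)
    # ... while action evaluation uses the target network.
    next_values = q_target_next[np.arange(len(rewards)), best_actions]
    # Terminal transitions (done == 1) do not bootstrap.
    return rewards + gamma * (1.0 - dones) * next_values


class StateArchive:
    """Hypothetical archive of environment snapshots (an assumption,
    not the paper's design): a failed episode may resume from a stored
    snapshot instead of restarting from the initial state."""

    def __init__(self, capacity=100, seed=None):
        self.capacity = capacity
        self._snapshots = []
        self._rng = random.Random(seed)

    def save(self, snapshot):
        # Drop the oldest snapshot once the archive is full.
        if len(self._snapshots) >= self.capacity:
            self._snapshots.pop(0)
        self._snapshots.append(snapshot)

    def sample(self, default=None):
        # Fall back to `default` (e.g. the initial state) when empty.
        if not self._snapshots:
            return default
        return self._rng.choice(self._snapshots)

    def __len__(self):
        return len(self._snapshots)
```

In a training loop built on this sketch, the agent would call `save` at progress milestones and `sample` when an episode fails; how the actual Archive DDQN chooses what to store and where to resume is not specified in the abstract.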
