Computer Integrated Manufacturing Systems ›› 2025, Vol. 31 ›› Issue (12): 4608-4620. DOI: 10.13196/j.cims.2024.0602


Solving the dynamic scheduling problem in distributed heterogeneous job shops with multi-agent deep reinforcement learning

WANG Lijun1+, WANG Chengguang2, LI Xiangyang2, CHENG Ruixue3, WEN Xiaoyu4

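  1. School of Mechanical Engineering, North China University of Water Resources and Electric Power
  2. School of Management and Economics, North China University of Water Resources and Electric Power
  3. Centre for Sustainable Engineering, Teesside University
  4. Henan Provincial Key Laboratory of Intelligent Manufacturing of Mechanical Equipment, Zhengzhou University of Light Industry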
  • Publication date: 2025-12-31 Release date: 2026-01-08
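  • About the authors:
    +WANG Lijun (1971-), female, from Taiyuan, Shanxi; second-grade professor and doctoral supervisor; research interests: intelligent manufacturing, information management and system simulation; corresponding author; E-mail: wljmb@163.com;

    WANG Chengguang (2000-), male, from Zhumadian, Henan; PhD candidate; research interests: intelligent manufacturing, intelligent optimization and scheduling; E-mail: 201709502@stu.ncwu.edu.cn;

    LI Xiangyang (1996-), male, from Jiaozuo, Henan; PhD candidate; research interests: intelligent optimization algorithms, fault diagnosis; E-mail: lxylxy1231@163.com;

    CHENG Ruixue (1961-), female, British; lecturer, PhD; research interests: intelligent manufacturing systems, machine learning; E-mail: r.cheng@tees.ac.uk;

    WEN Xiaoyu (1988-), female, from Tanghe, Henan; associate professor, PhD, master's supervisor; research interests: shop scheduling, intelligent optimization algorithms, industrial digital twins; E-mail: xiaoyuup@gmail.com.
  • About the corresponding author: WANG Lijun (1971-), female, from Taiyuan, Shanxi; second-grade professor and doctoral supervisor; research interests: intelligent manufacturing, information management and system simulation; corresponding author; E-mail: wljmb@163.com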
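  • Funding:
    General Program of the National Natural Science Foundation of China (52475543); Foreign Expert Project of the Ministry of Science and Technology (G2023026004L); Henan Provincial Universities Science and Technology Innovation Talent Support Plan (24HASTIT048).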

Dynamic scheduling problem in distributed heterogeneous job shops based on multi-agent deep reinforcement learning

WANG Lijun1+, WANG Chengguang2, LI Xiangyang2, CHENG Ruixue3, WEN Xiaoyu4

  1. School of Mechanical Engineering, North China University of Water Resources and Electric Power
  2. School of Management and Economics, North China University of Water Resources and Electric Power
  3. Centre for Sustainable Engineering, Teesside University
  4. Henan Provincial Key Laboratory of Intelligent Manufacturing of Mechanical Equipment, Zhengzhou University of Light Industry
  • Online: 2025-12-31 Published: 2026-01-08
  • Supported by:
    Project supported by the National Natural Science Foundation, China (No. 52475543), the Foreign Expert Project of the Ministry of Science and Technology, China (No. G2023026004L), and the Henan Provincial Universities Science and Technology Innovation Talent Support Plan, China (No. 24HASTIT048).

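Abstract: For the distributed heterogeneous job shop scheduling problem with dynamic job insertion and transfer times, a multi-agent deep reinforcement learning method based on the dueling double deep Q-network (MAD3QN) is proposed, with the objective of minimizing total tardiness. The problem involves two coupled decision processes, job selection and machine assignment, so two types of agents are created and a separate Markov decision process is formulated for each. The state representation, action space and reward design of the job selection agent and the machine assignment agent are described in detail to support more efficient decision making. Finally, to verify the effectiveness of the proposed method on instances of different scales, it is compared with composite dispatching rules; comparisons with heuristic scheduling algorithms and other deep reinforcement learning methods further confirm its superiority across scales.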
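The objective named in the abstract, total tardiness, is conventionally defined as the sum over jobs of max(0, C_j - d_j), where C_j is the completion time and d_j the due date of job j. The following is a minimal sketch of that standard definition only; the function name and the sample numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def total_tardiness(completion_times: Sequence[float],
                    due_dates: Sequence[float]) -> float:
    """Sum of max(0, C_j - d_j) over all jobs (standard definition)."""
    if len(completion_times) != len(due_dates):
        raise ValueError("completion_times and due_dates must have equal length")
    # A job finished before its due date contributes zero, not a negative value.
    return sum(max(0.0, c - d) for c, d in zip(completion_times, due_dates))


# Hypothetical three-job schedule: only the second job is late (by 5 time units).
completion = [12.0, 30.0, 18.0]
due = [15.0, 25.0, 18.0]
print(total_tardiness(completion, due))  # 5.0
```

Early completion does not offset lateness elsewhere, which is why the per-job term is clipped at zero before summing.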

Keywords: distributed shop, multi-agent, dynamic scheduling, deep reinforcement learning

Abstract: To address the distributed heterogeneous job shop scheduling problem with dynamic workpiece insertion and transfer times, a Multi-Agent Deep reinforcement learning method based on the Dueling Double Deep Q-Network (MAD3QN) was proposed with the objective of minimizing total tardiness. The problem involved two interdependent decision processes, workpiece selection and machine allocation, so two types of agents were created, each governed by its own Markov decision process. The state representations, action spaces and reward structures for both the workpiece selection agent and the machine allocation agent were defined in detail to facilitate more efficient decision making. Finally, to validate the effectiveness of the proposed method on instances of different scales, it was compared with composite scheduling rules; further comparisons with heuristic scheduling algorithms and other deep reinforcement learning methods demonstrated the superiority of the proposed approach across scales.
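The abstract names a dueling double deep Q-network but gives no architectural detail. As a hedged illustration, the sketch below shows the two textbook ingredients that name refers to: the dueling aggregation Q(s, a) = V(s) + A(s, a) - mean_a A(s, a), and the double DQN target, in which the online network picks the next action and the target network evaluates it. All function names, the reward, the discount factor and the numbers are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def dueling_q(value: float, advantages) -> np.ndarray:
    """Combine a state value and per-action advantages into Q-values.

    Subtracting the mean advantage is the usual identifiability fix
    in the dueling architecture.
    """
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean(axis=-1, keepdims=True)


def double_dqn_target(reward: float, gamma: float,
                      q_online_next: np.ndarray,
                      q_target_next: np.ndarray,
                      done: bool) -> float:
    """Double DQN target: select with the online net, evaluate with the target net."""
    best_action = int(np.argmax(q_online_next))
    # No bootstrapping from a terminal state.
    bootstrap = 0.0 if done else float(q_target_next[best_action])
    return reward + gamma * bootstrap


# Hypothetical next-state outputs for three candidate actions.
q_online_next = dueling_q(2.0, [1.0, 3.0, 2.0])   # -> [1.0, 3.0, 2.0]
q_target_next = np.array([0.5, 1.5, 4.0])
y = double_dqn_target(reward=1.0, gamma=0.5,
                      q_online_next=q_online_next,
                      q_target_next=q_target_next,
                      done=False)
print(y)  # 1.75
```

Note that the target uses 1.5, the target network's value for the action the online network prefers, rather than the target network's own maximum of 4.0; decoupling selection from evaluation in this way is what reduces overestimation in double DQN.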

Key words: distributed job shop, multi-agent, dynamic scheduling, deep reinforcement learning
