Computer Integrated Manufacturing System, 2025, Vol. 31, Issue (8): 2857-2869. DOI: 10.13196/j.cims.2024.0187


Collaborative manufacturing scheduling based on deep reinforcement learning

TANG Liang, KUANG Lilin

  1. College of Transportation Engineering, Dalian Maritime University
  • Online: 2025-08-31 Published: 2025-09-04
  • Supported by:
    Project supported by the National Natural Science Foundation, China (No. 72372015), and the National Social Science Foundation, China (No. 24BGL026).

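Collaborative manufacturing scheduling optimization based on a deep reinforcement learning algorithm

TANG Liang, KUANG Lilin

  1. College of Transportation Engineering, Dalian Maritime University
  • About the authors:
    TANG Liang (born 1980), male, from Yixing, Jiangsu; professor, Ph.D.; research interests: supply chain scheduling and optimization, intelligent manufacturing; E-mail: tangericliang@dlmu.edu.cn;

    KUANG Lilin (born 1998), male, from Neijiang, Sichuan; master's student; research interests: collaborative manufacturing, deep reinforcement learning; E-mail: kllddu88@dlmu.edu.cn.
  • Funding:
    Project supported by the National Natural Science Foundation of China (72372015); project supported by the National Social Science Foundation of China (24BGL026).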

Abstract: To solve a collaborative manufacturing scheduling problem in which a Dominant Manufacturer (DM) outsources several processes to Collaborative Manufacturers (CMs) with the objective of minimizing the overall cost of different orders, a Deep Reinforcement Learning (DRL) framework was proposed that integrates disjunctive graph analysis to handle the intricate scheduling dynamics of collaborative manufacturing networks. In this framework, the agent learns an action strategy from the input order status. The scheduling problem was transformed into a sequential decision-making task by employing a two-dimensional action space derived from the disjunctive graph structure. By taking the manufacturing state as the input of a deep neural network model, the collaborative scheduling problem was formulated as a Markov Decision Process (MDP). In addition, a cost-oriented reward function was formulated to guide the agent's exploration toward the optimal action. The experimental results showed that the DRL algorithm outperformed any single scheduling rule and, on average, achieved better solution quality than genetic algorithms.

Key words: collaborative manufacturing, disjunctive graph, deep reinforcement learning, scheduling, Markov decision process

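Abstract (Chinese version): For the collaborative scheduling problem in which a dominant manufacturer outsources multiple processes to collaborative manufacturers, the allocation and scheduling of different orders of the same product type across the collaborative manufacturing network is considered, and a deep reinforcement learning framework combined with a disjunctive graph is proposed with the objective of minimizing the sum of several costs. The agent learns an action policy from the input order state, which turns scheduling on the disjunctive graph into a multi-stage sequential decision process. Using a state space based on the disjunctive graph, the state of the collaborative manufacturing orders is fed to the network as a multi-channel image, and a two-dimensional action space comprising order selection rules and collaborative manufacturer assignment rules is designed according to the characteristics of the state transitions. A cost-based reward function is constructed from the problem objective to guide the agent's interaction with the environment and to obtain the best policy at each decision step. Experimental results show that the deep reinforcement learning algorithm outperforms single scheduling rules and, compared with genetic algorithms, achieves better average solution quality.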

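Key words (Chinese version): collaborative manufacturing, disjunctive graph, deep reinforcement learning, scheduling, Markov decision process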
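
The two-dimensional action space and cost-based reward described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the rule names, the `Order` and `CM` fields, the `step` function, and the cost model (processing cost plus a linear tardiness penalty). Only the general idea follows the abstract: each action pairs one order selection rule with one collaborative manufacturer assignment rule, and the reward is the negative of the cost incurred by that decision.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical rule names; the abstract does not list the actual rules.
ORDER_RULES = ("earliest_due_date", "shortest_processing_time")
CM_RULES = ("lowest_unit_cost", "earliest_available")

# Two-dimensional action space: every (order rule, CM rule) pair.
ACTIONS = list(product(ORDER_RULES, CM_RULES))


@dataclass(frozen=True)
class Order:
    name: str
    due: float        # due date (assumed field)
    proc_time: float  # processing time of the outsourced process (assumed)


@dataclass(frozen=True)
class CM:
    name: str
    unit_cost: float     # cost per unit of processing time (assumed)
    available_at: float  # time at which the CM becomes free (assumed)


def pick_order(rule, orders):
    """Apply an order selection rule to the waiting orders."""
    if rule == "earliest_due_date":
        return min(orders, key=lambda o: o.due)
    return min(orders, key=lambda o: o.proc_time)


def pick_cm(rule, cms):
    """Apply a collaborative manufacturer assignment rule."""
    if rule == "lowest_unit_cost":
        return min(cms, key=lambda c: c.unit_cost)
    return min(cms, key=lambda c: c.available_at)


def step(action_index, orders, cms, tardiness_penalty=1.0):
    """Decode a flat action index and return (order, CM, reward).

    The reward is the negative of an assumed cost: processing cost
    plus a linear penalty for finishing after the due date.
    """
    order_rule, cm_rule = ACTIONS[action_index]
    order = pick_order(order_rule, orders)
    cm = pick_cm(cm_rule, cms)
    finish = cm.available_at + order.proc_time
    tardiness = max(0.0, finish - order.due)
    cost = cm.unit_cost * order.proc_time + tardiness_penalty * tardiness
    return order.name, cm.name, -cost
```

Flattening the rule pairs into a single index keeps the sketch compatible with a network that outputs one value per discrete action; an agent would call `step` once per decision step and accumulate the negative costs as its return.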
