Improved proximal policy optimization algorithm for solving flexible job shop scheduling problem

doi:10.13196/j.cims.2023.0138

Computer Integrated Manufacturing System, 2025, Vol. 31, Issue 8: 2894-2904. DOI: 10.13196/j.cims.2023.0138

WU Haoze, LI Yanwu⁺, XIE Hui

College of Electronic and Information Engineering, Chongqing Three Gorges University

Online: 2025-08-31 Published: 2025-09-04
Supported by:
Project supported by the Science and Technology Program of Chongqing Municipal Education Commission, China (No. KJQN202001224).

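About the authors:
WU Haoze (born 1997), male, from Zibo, Shandong; master's student; research interest: intelligent optimization algorithms; E-mail: 247460692@qq.com.

⁺LI Yanwu (born 1985), male, from Tianmen, Hubei; senior engineer, Ph.D., master's supervisor; research interests: flow shop scheduling problems, intelligent optimization algorithms, deep learning, etc.; corresponding author; E-mail: liyanwu2022@sina.com.

XIE Hui (born 1969), female, from Wanzhou, Chongqing; professor, master's supervisor; research interest: intelligent optimization algorithms; E-mail: 20030041@sanxiau.edu.cn.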

Abstract

Flexible job-shop scheduling aims to improve scheduling efficiency and shorten the production cycle. To minimize the maximum completion time (makespan), a mixed-integer programming model is established around two scheduling decisions, machine selection and job operation adjustment, and a policy-based Deep Reinforcement Learning (DRL) algorithm built on a Graph Neural Network (GNN) is proposed to solve the Flexible Job-shop Scheduling Problem (FJSP). The GNN extracts and analyzes information from the disjunctive graph, which provides the decision basis for reinforcement learning. Multiple Proximal Policy Optimization (Multi-PPO) and a Multi-Pointer Graph Network (MPGN) are used to learn the operation action policy and the machine action policy. Two encoder-decoders are designed to define the two action policies, and the GNN is embedded into the local state to enhance local search ability. Experimental results show that the proposed algorithm significantly outperforms the comparison algorithms in both solution performance and generalization ability.

Key words: deep reinforcement learning, flexible job-shop scheduling problem, disjunctive graph, graph neural network, multiple proximal policy optimization
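
To make the objective in the abstract concrete, the following minimal Python sketch computes the maximum completion time of one candidate FJSP schedule. It is not the authors' model or code. The function name `makespan`, the `proc_time` layout, and the toy instance are all assumptions made for illustration. Each decision pairs an operation choice (which job advances next) with a machine choice, mirroring the two action policies described above, and each operation starts as soon as both its job and its machine are free, with no gap insertion.

```python
from typing import Dict, List, Tuple

# Hypothetical data layout: proc_time[(job, op_index)][machine] = processing time.
# A machine missing from the inner dict is not eligible for that operation.
ProcTime = Dict[Tuple[int, int], Dict[int, int]]


def makespan(schedule: List[Tuple[int, int]], proc_time: ProcTime) -> int:
    """Return the maximum completion time of a dispatch sequence.

    schedule is a list of (job, machine) decisions in dispatch order;
    the k-th appearance of a job refers to that job's k-th operation.
    """
    job_ready: Dict[int, int] = {}      # time at which each job's last operation finishes
    machine_ready: Dict[int, int] = {}  # time at which each machine becomes free
    next_op: Dict[int, int] = {}        # index of the next unscheduled operation per job

    for job, machine in schedule:
        op = next_op.get(job, 0)
        duration = proc_time[(job, op)][machine]  # KeyError if the machine is ineligible
        start = max(job_ready.get(job, 0), machine_ready.get(machine, 0))
        finish = start + duration
        job_ready[job] = finish
        machine_ready[machine] = finish
        next_op[job] = op + 1

    return max(job_ready.values(), default=0)


# Toy instance (assumed, not from the paper): 2 jobs, 2 operations each, 2 machines.
toy_times: ProcTime = {
    (0, 0): {0: 3, 1: 5},
    (0, 1): {1: 2},
    (1, 0): {0: 2, 1: 4},
    (1, 1): {0: 4, 1: 3},
}
schedule_a = [(0, 0), (1, 1), (0, 1), (1, 0)]
schedule_b = [(1, 0), (0, 1), (0, 1), (1, 1)]
print(makespan(schedule_a, toy_times), makespan(schedule_b, toy_times))
```

In a reinforcement learning setting, a quantity of this kind would typically serve as the basis of the reward signal, for example by rewarding reductions in the running completion time. The reward design actually used in the paper is not stated in the abstract.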

WU Haoze, LI Yanwu, XIE Hui. Improved proximal policy optimization algorithm for solving flexible job shop scheduling problem[J]. Computer Integrated Manufacturing System, 2025, 31(8): 2894-2904.
