Computer Integrated Manufacturing System, 2025, Vol. 31, Issue (8): 2894-2904. DOI: 10.13196/j.cims.2023.0138

Improved proximal policy optimization algorithm for solving flexible job shop scheduling problem

WU Haoze, LI Yanwu+, XIE Hui

  1. College of Electronic and Information Engineering, Chongqing Three Gorges University
  • Online: 2025-08-31 Published: 2025-09-04
  • Supported by:
    Project supported by the Science and Technology Program of Chongqing Municipal Education Commission, China (No. KJQN202001224).

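  • About the authors:
    WU Haoze (1997-), male, born in Zibo, Shandong; master's student; research interest: intelligent optimization algorithms. E-mail: 247460692@qq.com;

    +LI Yanwu (1985-), male, born in Tianmen, Hubei; senior engineer, Ph.D., master's supervisor; research interests: flow shop scheduling problems, intelligent optimization algorithms, deep learning, etc.; corresponding author. E-mail: liyanwu2022@sina.com;

    XIE Hui (1969-), female, born in Wanzhou, Chongqing; professor, master's supervisor; research interest: intelligent optimization algorithms. E-mail: 20030041@sanxiau.edu.cn.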

Abstract: Flexible job shops need to improve scheduling efficiency and shorten production cycle time. To minimize the maximum completion time (makespan) in the Flexible Job-shop Scheduling Problem (FJSP), a mixed integer programming model was established using two scheduling strategies, machine selection and job operation adjustment, and a policy-based Deep Reinforcement Learning (DRL) algorithm built on a Graph Neural Network (GNN) was proposed to solve it. The algorithm used the GNN to extract and analyze information from the disjunctive graph, providing the decision basis for reinforcement learning. Multiple Proximal Policy Optimization (Multi-PPO) and a Multi-Pointer Graph Network (MPGN) were used to learn the operation action policy and the machine action policy. Two encoder-decoders were designed to define the two action policies, and the GNN was embedded into the local state to enhance local search ability. Experimental results showed that the proposed algorithm was significantly better than the comparison algorithms in both solution performance and generalization ability.

Key words: deep reinforcement learning, flexible job-shop scheduling problem, disjunctive graph, graph neural network, multiple proximal policy optimization
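
To make the objective in the abstract concrete, the following is a minimal sketch of how the maximum completion time (makespan) follows from the two kinds of decisions the abstract names: which job's next operation to dispatch, and which machine to run it on. The toy instance `proc_time`, the function name `makespan`, and the simple append-only dispatching rule are illustrative assumptions, not the authors' model or implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical toy FJSP instance (not from the paper):
# proc_time[(job, op_index)][machine] = processing time on an eligible machine.
proc_time: Dict[Tuple[int, int], Dict[int, int]] = {
    (0, 0): {0: 3, 1: 5},
    (0, 1): {1: 2},
    (1, 0): {0: 4, 1: 2},
    (1, 1): {0: 3, 1: 6},
}


def makespan(schedule: List[Tuple[int, int]]) -> int:
    """Return the makespan of a sequence of (job, machine) decisions.

    Each decision dispatches the job's next unscheduled operation to the
    chosen machine. The operation starts as soon as both the job's previous
    operation and the machine's previous operation have finished
    (assumption: no insertion into earlier idle gaps).
    """
    job_ready: Dict[int, int] = {}      # finish time of each job's last operation
    machine_ready: Dict[int, int] = {}  # finish time of each machine's last operation
    next_op: Dict[int, int] = {}        # index of each job's next operation

    for job, machine in schedule:
        op = next_op.get(job, 0)
        # Raises KeyError if the machine is not eligible for this operation.
        duration = proc_time[(job, op)][machine]
        start = max(job_ready.get(job, 0), machine_ready.get(machine, 0))
        end = start + duration
        job_ready[job] = end
        machine_ready[machine] = end
        next_op[job] = op + 1

    return max(job_ready.values())


good = [(0, 0), (1, 1), (0, 1), (1, 0)]
poor = [(0, 1), (1, 0), (0, 1), (1, 1)]
print(makespan(good), makespan(poor))
```

In this sketch the two decision sequences schedule the same four operations but differ in machine choice, which is why a learned policy over both operation and machine actions can change the makespan.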

