Computer Integrated Manufacturing System, 2025, Vol. 31, Issue (11): 3979-3989. DOI: 10.13196/j.cims.2023.0474


Flexible job shop scheduling based on deep self-learning tabu search algorithm

ZENG Lingming, DING Linshan, GUAN Zailin+

  1. School of Mechanical Science and Engineering, Huazhong University of Science and Technology
  • Online: 2025-11-30 Published: 2025-12-04
  • Supported by:
    Project supported by the National Natural Science Foundation, China (No. 51905196).

Chinese title: 基于深度自学习禁忌搜索的柔性作业车间调度

Chinese author names: 曾令铭, 丁林山, 管在林+
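  • About the authors:
    ZENG Lingming (1999-), native of Shangrao, Jiangxi; master's student; research interests: shop scheduling and intelligent optimization algorithms. E-mail: zenglingming@hust.edu.cn

    DING Linshan (1995-), native of Puyang, Henan; Ph.D. candidate; research interests: deep reinforcement learning, intelligent optimization algorithms, and scheduling optimization. E-mail: linshan_ding@hust.edu.cn

    +GUAN Zailin (1966-), native of Yangzhou, Jiangsu; professor; research interests: advanced planning and scheduling and production operations optimization; manufacturing system modeling, simulation, and logistics analysis; digital factories. Corresponding author. E-mail: zlguan@hust.edu.cn
  • Funding:
    Young Scientists Fund of the National Natural Science Foundation of China (No. 51905196).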

Abstract: To address the dynamic adjustment of key parameters in algorithms for the flexible job-shop scheduling problem (FJSP), a Deep Self-Learning Tabu Search (DSLTS) algorithm based on deep reinforcement learning is proposed. The algorithm takes Tabu Search (TS) as its base optimization method and uses a Double Deep Q-Network (DDQN) to adjust the key parameters of TS intelligently. First, the self-learning model of DSLTS is analyzed and established: a Long Short-Term Memory (LSTM) network fits multiple TS feature vectors, and its output is fed into the DDQN for iterative learning. Next, the state feature vector and the reward function for reinforcement learning in the TS environment are designed. Finally, DSLTS is compared with other common FJSP algorithms in terms of solution quality and performance, verifying the effectiveness of the proposed model and method.

Key words: flexible job-shop scheduling problem, deep reinforcement learning model, self-learning model, tabu search algorithm

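Abstract (translated from the Chinese): To solve the problem of dynamically adjusting the key parameters of solution algorithms for flexible job-shop scheduling, a self-learning tabu search algorithm based on deep reinforcement learning (DSLTS) is proposed. The algorithm uses tabu search (TS) as the base optimization method and a double deep Q-network (DDQN) to intelligently adjust the key parameters of TS. First, the self-learning model in DSLTS is analyzed and established; a long short-term memory (LSTM) network is used to fit multiple TS feature vectors, and the results are fed into the DDQN for learning iterations. Then, the state feature vector and reward function for reinforcement learning in the TS environment are designed. Finally, other common algorithms for the FJSP are compared with DSLTS in terms of solution quality and performance on FJSP instances, verifying the effectiveness of the proposed model and method.

Key words (translated from the Chinese): flexible shop scheduling problem, deep reinforcement learning, self-learning model, tabu search algorithm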
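
The abstract does not specify the state features, the action set, or the reward function, so the following is only a minimal sketch of the standard Double DQN target computation that a DDQN-based parameter controller would rely on. The function name `ddqn_target`, the candidate tenure list `TABU_TENURES`, and all numeric values are hypothetical illustrations, not taken from the paper.

```python
from typing import Sequence

# Hypothetical action set (an assumption): candidate tabu tenures the agent
# could choose between at each decision point.
TABU_TENURES = [5, 10, 20]


def ddqn_target(reward: float,
                next_q_online: Sequence[float],
                next_q_target: Sequence[float],
                gamma: float = 0.9,
                done: bool = False) -> float:
    """Standard Double DQN target.

    The online network selects the best next action; the target network
    evaluates that action. Decoupling selection from evaluation reduces the
    overestimation bias of plain DQN.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # Action selection with the online network's estimates.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # Action evaluation with the target network's estimates.
    return reward + gamma * next_q_target[best_action]


# Made-up Q-values for the next state, one entry per candidate tenure.
next_q_online = [0.2, 0.8, 0.5]  # online network prefers index 1 (tenure 10)
next_q_target = [0.3, 0.6, 0.9]  # target network's value estimates

y = ddqn_target(reward=1.0,
                next_q_online=next_q_online,
                next_q_target=next_q_target,
                gamma=0.9)
```

Here the online network picks index 1, so the target uses the target network's value for that same index rather than its own maximum. In a full system the two Q-value lists would come from neural networks fed with LSTM-encoded TS features, as the abstract describes.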
