Computer Integrated Manufacturing Systems ›› 2014, Vol. 20 ›› Issue (12): 3000-3010. DOI: 10.13196/j.cims.2014.12.010

• Product Innovation and Development Technology •

Adaptive assembly scheduling of aero-engine based on double-layer Q-learning in knowledgeable manufacturing

WANG Haoxiang1,2, YAN Hongsen1,3, WANG Zheng1,3

  1. School of Automation, Southeast University
  2. College of Engineering, Nanjing Agricultural University
  3. Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University
  • Online: 2014-12-31 Published: 2014-12-31
  • Supported by:
    Project supported by the Key Program of the National Natural Science Foundation, China (No. 60934008), the National Natural Science Foundation, China (No. 71401076, 71101072), and the Scientific Research Foundation of Graduate School of Southeast University, China (No. YBJJ1215).

Abstract: To address the adaptive scheduling of aero-engine assembly in an uncertain production environment, a double-layer Q-learning (D-Q) method is proposed that combines the real-time capability of reinforcement learning with the self-adaptive characteristics of a knowledgeable manufacturing system. The upper layer focuses on the local objective: it learns a suitable dispatching rule for allocating jobs to parallel machines, so as to minimize machine idleness and balance machine loads. The lower layer focuses on the global objective: it learns the optimal scheduling policy for sequencing the operations allocated to each machine, so as to minimize the overall earliness of all jobs. A Q(λ) learning method based on function approximation is used to update the value function. By suitably defining the three key elements of the reinforcement learning problem (action, state, and reward function), the adaptive assembly scheduling problem of aero-engines is transformed into a reinforcement learning problem. Simulation experiments indicate that, by selecting scheduling rules in both layers at the appropriate time, the D-Q method achieves better overall results than single-layer Q-learning (S-Q). Its scheduling results are far better than those of any single rule, and it exhibits good adaptive performance.

Key words: knowledgeable manufacturing, aero-engine assembly, adaptive scheduling, Q-learning
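
The abstract names Q(λ) learning with function approximation as the value-update mechanism but gives no formulas. The following is a minimal Python sketch of one common variant (Watkins-style Q(λ) with a linear approximator and accumulating eligibility traces), not the authors' implementation. The class name `LinearQLambda`, the feature vectors, the hyperparameter values, and the two rule lists are all assumptions introduced for illustration; the abstract does not state which dispatching or sequencing rules the two layers choose from.

```python
import random


class LinearQLambda:
    """Watkins-style Q(lambda) with one linear weight vector per action (illustrative)."""

    def __init__(self, n_features, n_actions, alpha=0.1, gamma=0.9,
                 lam=0.8, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.lam, self.epsilon = alpha, gamma, lam, epsilon
        self.w = [[0.0] * n_features for _ in range(n_actions)]  # weights
        self.e = [[0.0] * n_features for _ in range(n_actions)]  # eligibility traces
        self.rng = random.Random(seed)

    def q(self, phi, a):
        """Approximate Q(s, a) as the dot product of weights and state features."""
        return sum(wi * xi for wi, xi in zip(self.w[a], phi))

    def greedy(self, phi):
        """Index of the action with the highest estimated value (first one on ties)."""
        return max(range(self.n_actions), key=lambda a: self.q(phi, a))

    def select(self, phi):
        """Epsilon-greedy choice of an action (here: a scheduling rule index)."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.greedy(phi)

    def update(self, phi, a, reward, phi_next, a_next, done=False):
        """One learning step; a_next is the action actually chosen for the next state."""
        a_star = self.greedy(phi_next)
        target = reward if done else reward + self.gamma * self.q(phi_next, a_star)
        delta = target - self.q(phi, a)  # temporal-difference error
        for i, x in enumerate(phi):      # accumulate the trace of the taken action
            self.e[a][i] += x
        for b in range(self.n_actions):  # move every weight along its trace
            for i in range(len(self.w[b])):
                self.w[b][i] += self.alpha * delta * self.e[b][i]
        # Decay traces after a greedy next action; cut them after exploration or at the end.
        decay = self.gamma * self.lam if (not done and a_next == a_star) else 0.0
        for b in range(self.n_actions):
            self.e[b] = [decay * v for v in self.e[b]]
        return delta


# Hypothetical rule sets; the abstract does not list the rules actually used.
ASSIGN_RULES = ["least_loaded_machine", "earliest_idle_machine"]
SEQUENCE_RULES = ["EDD", "SPT", "FIFO"]

upper = LinearQLambda(n_features=3, n_actions=len(ASSIGN_RULES))    # job-to-machine layer
lower = LinearQLambda(n_features=3, n_actions=len(SEQUENCE_RULES))  # sequencing layer
```

In a two-layer arrangement of this kind, `upper.select(...)` would pick the rule used to assign an arriving job to a parallel machine and `lower.select(...)` the rule used to sequence the operations queued at that machine, with each learner receiving its own reward signal (for example, one based on idleness and load balance and one based on earliness). How the paper actually encodes states and rewards is not given in the abstract.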
