Transfer ants reinforcement learning algorithm and its application on rectangular packing problem

Computer Integrated Manufacturing Systems ›› 2020, Vol. 26 ›› Issue (12): 3236-3247. DOI: 10.13196/j.cims.2020.12.006

XU Xiaofei¹, CHEN Jing², RAO Yunqing¹, MENG Ronghua¹,⁴⁺, YUAN Bo³, LUO Qiang¹

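1. School of Mechanical Science and Engineering, Huazhong University of Science and Technology
2. Department of Automotive Engineering, Guizhou Vocational and Technical College of Communications (贵州交通职业技术学院)
3. School of Mechanical and Electronic Engineering, Wuhan University of Technology
4. College of Mechanical and Power Engineering, China Three Gorges University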

Online: 2020-12-31 Published: 2020-12-31
Supported by:
Project supported by the National Natural Science Foundation of China (No. 51975231) and the Fundamental Research Funds for the Central Universities, China (No. 2019kfyXKJC043).

Abstract

Abstract: Rectangular packing is a typical NP-hard problem: the solution time grows exponentially as the number of parts increases. To reduce the computing time of similar packing tasks and to improve optimization performance and material utilization, an ant colony reinforcement learning algorithm based on knowledge transfer is proposed for the rectangular packing problem, combined with a lowest horizontal line (skyline) algorithm based on matching-degree evaluation. For the high-dimensional knowledge space, the algorithm constructs a high-dimensional space combination matrix based on knowledge extension. Using the trial-and-error learning mode of reinforcement learning, an ant colony with self-learning ability acquires and updates knowledge in the knowledge matrix. The knowledge obtained by pre-learning is then transferred to the target task through a linear transfer strategy, which guides the target task to make decisions quickly online. Simulations on test instances show that the algorithm obtains high-quality solutions while searching 2 to 6 times faster than other intelligent algorithms, which makes it practical for medium- and large-scale rectangular packing problems.

Key words: rectangular packing problem, ant colony algorithm, reinforcement learning, knowledge transfer
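
The pipeline summarized in the abstract (pre-learn on a source task, transfer the knowledge matrix linearly, then solve the target task online) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the matrix layout, reward, update rule, or transfer weights, so everything below is an assumption made for illustration: a knowledge matrix indexed by (previous part, next part), an epsilon-greedy ant, a Q-learning style update, a utilization reward, and a plain first-fit lowest-horizontal-line decoder instead of the matching-degree variant. The names `merge`, `pack_height`, `learn`, and `linear_transfer` are hypothetical, and the source and target tasks are assumed to have the same number of parts.

```python
import random

def merge(skyline):
    """Join neighbouring skyline segments that share the same height."""
    merged = [skyline[0]]
    for x, w, h in skyline[1:]:
        px, pw, ph = merged[-1]
        if ph == h:
            merged[-1] = (px, pw + w, h)
        else:
            merged.append((x, w, h))
    return merged

def pack_height(order, sizes, strip_width):
    """Decode a part order with a plain lowest-horizontal-line rule (no rotation)."""
    if any(w > strip_width for w, _ in sizes):
        raise ValueError("a part is wider than the strip")
    skyline = [(0, strip_width, 0)]  # segments (x, width, height), left to right
    remaining = list(order)
    while remaining:
        # Lowest segment, leftmost on ties.
        i = min(range(len(skyline)), key=lambda k: (skyline[k][2], skyline[k][0]))
        x, w, h = skyline[i]
        fit = next((p for p in remaining if sizes[p][0] <= w), None)
        if fit is None:
            # Nothing fits: raise the gap to its lowest neighbour (wasted area).
            neighbours = [skyline[k][2] for k in (i - 1, i + 1) if 0 <= k < len(skyline)]
            skyline[i] = (x, w, min(neighbours))
        else:
            pw, ph = sizes[fit]
            remaining.remove(fit)
            rest = [(x + pw, w - pw, h)] if pw < w else []
            skyline[i:i + 1] = [(x, pw, h + ph)] + rest
        skyline = merge(skyline)
    return max(h for _, _, h in skyline)

def learn(sizes, strip_width, episodes=50, ants=5, alpha=0.1, gamma=0.9,
          eps=0.2, q=None, seed=0):
    """Ants build part orders by trial and error and update the knowledge matrix."""
    rng = random.Random(seed)
    n = len(sizes)
    # Assumed layout: q[s][a] rates placing part a right after part s; row n is "start".
    q = [row[:] for row in q] if q is not None else [[0.0] * n for _ in range(n + 1)]
    area = sum(w * h for w, h in sizes)
    best_order, best_height = None, float("inf")
    for _ in range(episodes * ants):
        state, left, order = n, list(range(n)), []
        while left:
            if rng.random() < eps:
                a = rng.choice(left)                      # explore
            else:
                a = max(left, key=lambda p: q[state][p])  # exploit current knowledge
            order.append(a)
            left.remove(a)
            state = a
        height = pack_height(order, sizes, strip_width)
        reward = area / (strip_width * height)            # material utilization
        state = n
        for k, a in enumerate(order):
            future = max((q[a][p] for p in order[k + 1:]), default=0.0)
            q[state][a] += alpha * (reward + gamma * future - q[state][a])
            state = a
        if height < best_height:
            best_order, best_height = order, height
    return q, best_order, best_height

def linear_transfer(source_qs, weights):
    """Blend same-shaped source matrices: Q_target = sum_k weights[k] * Q_k."""
    rows, cols = len(source_qs[0]), len(source_qs[0][0])
    return [[sum(w * q[i][j] for w, q in zip(weights, source_qs))
             for j in range(cols)] for i in range(rows)]

# Pre-learn on a source task, then warm-start a similar target task.
source = [(4, 2), (3, 3), (2, 5), (5, 1), (1, 4), (3, 2)]
target = [(4, 3), (3, 3), (2, 4), (5, 1), (1, 5), (3, 2)]
q_src, _, _ = learn(source, strip_width=6)
q_init = linear_transfer([q_src], [1.0])
_, order, height = learn(target, strip_width=6, episodes=10, q=q_init)
print(order, height)
```

In this sketch the transferred matrix only serves as the starting point for the target task, so the target run can use fewer episodes. With several source tasks, the weights passed to `linear_transfer` would presumably reflect task similarity, but how the paper chooses them is not stated in the abstract.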

CLC number: TP391

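XU Xiaofei, CHEN Jing, RAO Yunqing, MENG Ronghua, YUAN Bo, LUO Qiang. Transfer ants reinforcement learning algorithm and its application on rectangular packing problem[J]. Computer Integrated Manufacturing Systems, 2020, 26(12): 3236-3247.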
