Joint decisions for perishable products with a Markov decision process

doi:10.13196/j.cims.2017.01.016

Computer Integrated Manufacturing Systems, 2017, Vol. 23, Issue 1: 144-153.

ZHENG Jiangbo, CHENG Fuyang, YANG Liu

School of Management, Jinan University

Online: 2017-01-31 Published: 2017-01-31
Supported by:
Natural Science Foundation of Guangdong Province, China (No. 2016Z00052).


Abstract

Abstract: To address a retailer's joint decisions on ordering, disposal of old products, and pricing when selling perishable products with a multi-period shelf life over an infinite horizon, a model is built as a Markov decision process and the optimal policy is computed with the Q-learning algorithm. The optimal policy specifies the action chosen in each state and maximizes the total expected discounted profit from the current period over the infinite horizon. The algorithm's iterative update rule refines the policy continually by interacting with the environment and using the feedback it receives. With finite, discrete state and action sets, the algorithm arrives at a stationary optimal policy after sufficiently long learning, even when the state transition probabilities and the expected per-period profit are unknown. The study finds that the parameters have different and significant effects on the characteristics of the individual decisions within the joint policy, a conclusion that offers some theoretical support and direction for research on heuristic policies.

Key words: perishable product, Markov decision process, Q-learning algorithm, ordering decisions, pricing decisions
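
The abstract describes tabular Q-learning on a finite state and action set, where the policy is learned from simulated feedback rather than from known transition probabilities. The sketch below illustrates that idea on a toy problem only. The abstract does not give the paper's state definition, demand model, cost parameters or learning settings, so every constant and helper here (`MAX_STOCK`, `PRICES`, `step`, the uniform demand, the two-period shelf life, the rule that old units sell first) is an assumption made for illustration.

```python
import random
from itertools import product

MAX_STOCK = 3          # cap on old units carried into a period (assumed)
PRICES = (4.0, 6.0)    # candidate selling prices (assumed)
UNIT_COST = 2.0        # purchase cost per ordered unit (assumed)
DISPOSAL_COST = 0.5    # cost per old unit discarded (assumed)
GAMMA = 0.9            # discount factor (assumed)
ALPHA = 0.1            # learning rate (assumed)
EPSILON = 0.1          # exploration probability (assumed)

STATES = range(MAX_STOCK + 1)  # state = number of old units on hand
# action = (order quantity, dispose of old stock?, selling price)
ACTIONS = list(product(range(MAX_STOCK + 1), (False, True), PRICES))


def step(old, action, rng):
    """Simulate one period of the toy model; return (profit, next_state)."""
    order, dispose, price = action
    profit = -UNIT_COST * order
    if dispose:
        profit -= DISPOSAL_COST * old
        old = 0
    # Hypothetical demand: the lower price attracts more customers.
    max_demand = 4 if price == PRICES[0] else 2
    demand = rng.randint(0, max_demand)
    sold_old = min(old, demand)               # old units sell first (assumed)
    sold_new = min(order, demand - sold_old)
    profit += price * (sold_old + sold_new)
    # Unsold old units expire; unsold new units become next period's old stock.
    return profit, order - sold_new


def q_learning(steps=20000, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = 0
    for _ in range(steps):
        if rng.random() < EPSILON:
            action = rng.choice(ACTIONS)      # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        reward, nxt = step(state, action, rng)
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        # Q-learning update: move the estimate toward reward + discounted best next value.
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = nxt
    # Greedy policy read off the learned Q-table.
    policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
    return q, policy
```

Calling `q_learning()` returns the Q-table and a greedy policy that maps each stock level to an (order, dispose, price) triple. Because the model is a stand-in, the resulting policy says nothing about the paper's numerical findings; it only shows how the three decisions can be packed into a single action and learned without transition probabilities.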

CLC number: F272, F275

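ZHENG Jiangbo, CHENG Fuyang, YANG Liu. Joint decisions for perishable products with a Markov decision process[J]. Computer Integrated Manufacturing Systems, 2017, 23(1): 144-153.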
