Optimal repositioning of driverless taxi under uncertain demand

doi:10.13196/j.cims.2022.11.009

Computer Integrated Manufacturing System, 2022, Vol. 28, Issue 11: 3433-3442. DOI: 10.13196/j.cims.2022.11.009

ZHOU Xiaoting¹, WU Lubin¹, ZHANG Yu², JIANG Shancheng¹⁺

1. School of Intelligent Systems Engineering, Sun Yat-sen University
2. School of Business Administration, Southwestern University of Finance and Economics

Online: 2022-11-30 Published: 2022-12-08
Supported by:
Project supported by the National Key Research and Development Program, China (No. 2020YFB1713800), the National Natural Science Foundation, China (No. 71901180, 71801031), and the Guangdong Provincial Basic and Applied Basic Research Foundation, China (No. 2019A1515011962).

Abstract

To reduce the number of empty taxis and make it easier for passengers to get a taxi during peak hours, a model-free deep reinforcement learning framework is proposed for dispatching driverless taxis under uncertain demand. The problem is modeled as a Markov decision process, and the framework jointly considers the revenue of the service provider and the waiting cost of customers. A policy-based deep reinforcement learning algorithm, Twin Delayed Deep Deterministic policy gradient (TD3), is used to dispatch the taxis so that idle vehicles are allocated sensibly. An environment simulator is built from real taxi trip data from New York City, and uncertain demand is added to the training process to improve the robustness of the algorithm. The experimental results show that the algorithm produces effective, non-myopic strategies under uncertain demand.

Key words: reinforcement learning, driverless taxi, vehicle repositioning, policy gradient
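
The abstract names TD3 as the optimizer without giving details. As a rough illustration of what separates TD3 from plain deterministic policy gradient, the sketch below computes the TD3 critic target for a single transition: clipped Gaussian noise is added to the target action (target policy smoothing), and the smaller of two target critic values is used (clipped double-Q learning). The function `td3_target`, its parameter names, the default hyperparameters, and the toy state, action and critics are assumptions made for illustration only; the paper's network design, state and action encoding, and reward definition are not stated on this page.

```python
import random
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]


def clip(x: float, low: float, high: float) -> float:
    """Clamp x to the closed interval [low, high]."""
    return max(low, min(high, x))


def td3_target(
    reward: float,
    next_state: Vector,
    done: bool,
    target_actor: Callable[[Vector], List[float]],
    target_q1: Callable[[Vector, Vector], float],
    target_q2: Callable[[Vector, Vector], float],
    gamma: float = 0.99,       # discount factor (assumed value)
    noise_std: float = 0.2,    # smoothing noise scale (assumed value)
    noise_clip: float = 0.5,   # smoothing noise bound (assumed value)
    action_low: float = 0.0,   # assumed action range
    action_high: float = 1.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Return the TD3 critic target for one transition."""
    rng = rng or random.Random()
    # Target policy smoothing: perturb the target action with clipped noise,
    # then clamp the result back into the valid action range.
    smoothed = []
    for a in target_actor(next_state):
        eps = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
        smoothed.append(clip(a + eps, action_low, action_high))
    # Clipped double-Q: take the more pessimistic of the two target critics.
    q_min = min(target_q1(next_state, smoothed), target_q2(next_state, smoothed))
    # No bootstrapping past a terminal state.
    return reward + (0.0 if done else gamma * q_min)


# Hypothetical stand-ins: the state is the idle-taxi count per zone and the
# action is the share of idle taxis sent to each of two zones.
toy_actor = lambda s: [0.5, 0.5]
toy_q1 = lambda s, a: 10.0 + sum(a)
toy_q2 = lambda s, a: 8.0 + sum(a)

y = td3_target(
    reward=1.0,
    next_state=[3.0, 1.0],
    done=False,
    target_actor=toy_actor,
    target_q1=toy_q1,
    target_q2=toy_q2,
    rng=random.Random(0),
)
print(y)
```

Taking the minimum of the two critics counteracts value overestimation, and the smoothing noise keeps the critic from rewarding narrow peaks in the action space. Both are standard TD3 components rather than anything specific to this paper.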

CLC Number: TP301; U9

ZHOU Xiaoting, WU Lubin, ZHANG Yu, JIANG Shancheng. Optimal repositioning of driverless taxi under uncertain demand[J]. Computer Integrated Manufacturing System, 2022, 28(11): 3433-3442.
