Computer Integrated Manufacturing Systems ›› 2018, Vol. 24 ›› Issue (7): 1698-1705. DOI: 10.13196/j.cims.2018.07.011

A scheduling method for real-time analytics workflows

YAO Yan, CAO Jian+, TIAN Xiaoliang

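  1. Department of Computer Science and Engineering, Shanghai Jiao Tong University
  • Publication date: 2018-07-31 Release date: 2018-07-31
  • Funding:
    National Key Research and Development Program of China (2018YFB1003800); National Natural Science Foundation of China (61472253, 61772334); Shanghai Jiao Tong University Medical-Engineering Cross Research Fund (YG2015MS61).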

Scheduling of real-time analytics workflows

  • Online: 2018-07-31 Published: 2018-07-31
  • Supported by:
    Project supported by the National Key Research and Development Plan, China (No. 2018YFB1003800), the National Natural Science Foundation, China (No. 61472253, 61772334), and the Cross Research Fund of Biomedical Engineering of Shanghai Jiao Tong University, China (No. YG2015MS61).

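Abstract: For the problem of scheduling real-time analytics workflows so as to minimize cost under a throughput constraint, a two-stage heuristic optimization algorithm is proposed. First, queuing theory is used to model the throughput of the analytics workflow, and a lower bound is derived on the number of deployed copies that each analytics task needs in order to satisfy the throughput condition. The problem is then reduced to a bin packing problem, and a heuristic algorithm is used to find a near-optimal solution. In the experiments, the algorithm was validated on the Alibaba Cloud platform, using a traffic-violation vehicle analysis workflow as an example. The results show that the proposed algorithm guarantees the required throughput while incurring a lower cost than the list scheduling algorithm.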

Keywords: big data analysis, cloud computing, analytics workflow, scheduling algorithm

Abstract: To address resource scheduling for real-time analytics workflows that minimizes monetary cost under a throughput constraint, a two-stage heuristic optimization algorithm is proposed. First, queuing theory is used to model the throughput of the analytics workflow, and for each analytic task a lower bound is obtained on the number of deployed instances needed to satisfy the throughput condition. The problem is then reduced to a bin packing problem, and a heuristic algorithm is applied to find a near-optimal solution. Taking a car plate recognition workflow as a case study, the proposed algorithm was evaluated on Alibaba Cloud. The results show that the proposed algorithm keeps monetary cost low while meeting the throughput requirement.
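The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract gives neither the queuing model nor the packing heuristic, so the sketch assumes (a) each task instance is a server with a fixed service rate, so that the queue is stable only if the combined service rate strictly exceeds the arrival rate, and (b) first-fit decreasing as a stand-in bin packing heuristic. The function names, task names, rates, and the single-resource VM capacity are all hypothetical.

```python
import math


def min_instances(arrival_rate, service_rate):
    """Stage 1 (assumed model): smallest n with n * service_rate > arrival_rate."""
    if service_rate <= 0:
        raise ValueError("service_rate must be positive")
    return math.floor(arrival_rate / service_rate) + 1


def first_fit_decreasing(demands, capacity):
    """Stage 2 (stand-in heuristic): pack demands into bins of equal capacity."""
    bins = []
    for size in sorted(demands, reverse=True):
        if size > capacity:
            raise ValueError("a single instance does not fit on one VM")
        for b in bins:
            if sum(b) + size <= capacity:
                b.append(size)  # fits in the first bin with enough room
                break
        else:
            bins.append([size])  # no existing bin fits: open a new VM
    return bins


def schedule(tasks, target_throughput, vm_capacity):
    """tasks maps a task name to (service_rate, resource_demand_per_instance)."""
    demands = []
    for service_rate, demand in tasks.values():
        n = min_instances(target_throughput, service_rate)
        demands.extend([demand] * n)
    return first_fit_decreasing(demands, vm_capacity)


# Hypothetical workflow: rates in items/s, demands in CPU cores.
tasks = {"detect": (40.0, 2), "recognize": (25.0, 1), "store": (100.0, 1)}
bins = schedule(tasks, target_throughput=100.0, vm_capacity=4)
print(len(bins), bins)
```

Under these assumptions, the number of bins is the number of VMs to rent, so with a uniform VM price the monetary cost is proportional to `len(bins)`. Heterogeneous VM types or multi-dimensional resources would need a different packing formulation.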

Key words: big data analysis, cloud computing, analytical workflow, scheduling algorithm
