Computer Integrated Manufacturing Systems ›› 2017, Vol. 23 ›› Issue (5): 1060-1068. DOI: 10.13196/j.cims.2017.05.017

• Product Innovation and Development Technology •

User behavior mining and similarity computation based on sequence mover's distance

Lin Zedong1, Lu Faming1, Duan Hua2+

  1. College of Information Science and Engineering, Shandong University of Science and Technology
  2. College of Mathematics and Systems Science, Shandong University of Science and Technology
  • Online: 2017-05-31 Published: 2017-05-31
  • Supported by:
    Project supported by the National Natural Science Foundation, China (No. 61602279, 61170079, 61202152, 61472229), the Science & Technology Development Fund of Shandong Province, China (No. 2014GGX101035, 2016ZDJS02A11), the Shandong Provincial Natural Science Foundation, China (No. BS2014DX013, ZR2015FM013), the Open Project Foundation of the Key Laboratory of Embedded System and Service Computing of the Ministry of Education at Tongji University, China (No. ESSCKF201403), the Shandong Provincial Postdoctoral Innovation Project, China (No. 201603056), and the SDUST Research Fund, China (No. 2015TDJH102).

Abstract: To measure the similarity of user behavior, similarity is computed from the behavior sequences generated by users' behavior processes. The Earth Mover's Distance (EMD) algorithm is applied to user behavior similarity computation, yielding the Sequence Mover's Distance (SMD) method. First, a distance measure between user behavior sequences is defined based on the longest common subsequence. Second, a distance measure between multisets of user behavior sequences is defined, and on this basis the SMD method for computing user behavior similarity is proposed. Finally, basic principles that a distance measure between multisets of behavior sequences should satisfy are proposed. Experiments on both artificial and real data sets demonstrate the effectiveness of the proposed method.

Key words: user behavior mining, user behavior similarity, similarity measure, earth mover's distance, sequence mover's distance
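
The two ingredients named in the abstract, a sequence distance based on the longest common subsequence and an earth-mover-style distance between multisets of sequences, can be illustrated with a minimal sketch. The paper's exact definitions are not given on this page, so the following are assumptions made only for illustration: the function names `lcs_length`, `seq_distance`, and `smd`, the normalization of the sequence distance by the longer sequence length, and the use of relative frequencies as the masses to be moved. The transport problem is solved with a small successive-shortest-path min-cost flow, which is a generic technique and not necessarily what the authors used.

```python
import math
from collections import Counter


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (row-wise DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def seq_distance(a, b):
    """ASSUMED ground distance: 1 - LCS / max length, a value in [0, 1]."""
    longest = max(len(a), len(b))
    return 0.0 if longest == 0 else 1.0 - lcs_length(a, b) / longest


def smd(bag_a, bag_b):
    """Earth-mover-style distance between two multisets of sequences.

    ASSUMPTION: each distinct sequence carries its relative frequency as mass.
    """
    a, b = Counter(bag_a), Counter(bag_b)
    total_a, total_b = sum(a.values()), sum(b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both multisets must be non-empty")
    src, dst = list(a), list(b)
    m, n = len(src), len(dst)
    total = total_a * total_b  # common integer mass after cross-scaling

    # Residual graph: node 0 = source, 1..m = supplies, m+1..m+n = demands.
    size = m + n + 2
    source, sink = 0, size - 1
    graph = [[] for _ in range(size)]
    edges = []  # each edge: [target, residual capacity, unit cost]

    def add_edge(u, v, cap, cost):
        graph[u].append(len(edges))
        edges.append([v, cap, cost])
        graph[v].append(len(edges))
        edges.append([u, 0, -cost])  # reverse edge sits at index ^ 1

    for i, s in enumerate(src):
        add_edge(source, 1 + i, a[s] * total_b, 0.0)
        for j, t in enumerate(dst):
            add_edge(1 + i, 1 + m + j, total, seq_distance(s, t))
    for j, t in enumerate(dst):
        add_edge(1 + m + j, sink, b[t] * total_a, 0.0)

    total_cost = 0.0
    while True:
        # Bellman-Ford shortest path in the residual graph.
        dist = [math.inf] * size
        parent = [-1] * size
        dist[source] = 0.0
        for _ in range(size - 1):
            changed = False
            for u in range(size):
                if dist[u] == math.inf:
                    continue
                for e in graph[u]:
                    v, cap, cost = edges[e]
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v], parent[v], changed = dist[u] + cost, e, True
            if not changed:
                break
        if dist[sink] == math.inf:
            break  # all mass has been transported
        # Find the bottleneck along the path, then push that much flow.
        push, v = math.inf, sink
        while v != source:
            push = min(push, edges[parent[v]][1])
            v = edges[parent[v] ^ 1][0]
        v = sink
        while v != source:
            e = parent[v]
            edges[e][1] -= push
            edges[e ^ 1][1] += push
            v = edges[e ^ 1][0]
        total_cost += push * dist[sink]
    return total_cost / total


if __name__ == "__main__":
    user_1 = ["abc", "abc", "abd"]  # hypothetical behavior sequences
    user_2 = ["abc", "abd", "abd"]
    print(round(smd(user_1, user_2), 4))  # 0.1111, i.e. 1/9
```

In the usage example, one third of the mass has to move from `"abc"` to `"abd"` at a ground distance of 1/3, which gives 1/9. A similarity score could then be derived from the distance, for instance as one minus the distance; that choice is also an assumption here.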
