Computer Integrated Manufacturing Systems ›› 2025, Vol. 31 ›› Issue (12): 4556-4565. DOI: 10.13196/j.cims.2024.Z04


Open set action recognition algorithm based on high-order spatial-temporal self-attention

ZHANG Zequn, LI Shuting, CAO Qili, TANG Dunbing+, JI Yuchen

  1. College of Mechanical & Electrical Engineering, Nanjing University of Aeronautics and Astronautics
  • Online: 2025-12-31 Published: 2026-01-08
  • About the authors:
    ZHANG Zequn (1991-), male, born in Anqing, Anhui; associate research fellow, Ph.D., master's supervisor; research interests: intelligent manufacturing systems, human-robot collaboration; E-mail: zhjj370@nuaa.edu.cn;

    LI Shuting (2002-), female, born in Lu'an, Anhui; master's student; research interest: computer vision; E-mail: lshuting@nuaa.edu.cn;

    CAO Qili (2000-), male, born in Taizhou, Zhejiang; master's student; research interest: computer vision; E-mail: caoqili@nuaa.edu.cn;

    +TANG Dunbing (1972-), male, born in Xiantao, Hubei; professor, Ph.D., doctoral supervisor; research interest: intelligent manufacturing systems; corresponding author; E-mail: d.tang@nuaa.edu.cn;

    JI Yuchen (1999-), male, born in Yancheng, Jiangsu; master's student; research interests: computer vision, human-robot collaboration; E-mail: jiyuchen@nuaa.edu.cn.
  • Supported by:
    Project supported by the National Natural Science Foundation, China (No. 52305539, 92267109), the Natural Science Foundation of Jiangsu Province, China (No. BK20230880), and the Fundamental Research Funds for the Central Universities, China (No. NS2024033).

Abstract: With the rapid development of human-robot collaboration technology, human action recognition faces many challenges in complex environments. Existing action recognition methods are mainly optimized for closed set scenarios and cannot effectively recognize unknown categories in open set environments. To address this challenge, an open set action recognition algorithm based on a high-order spatial-temporal self-attention mechanism was proposed. The algorithm integrated Spatial Multi-Head Self-Attention (S-MHSA) and Temporal Multi-Head Self-Attention (T-MHSA) modules to strengthen the model's ability to capture long-term dependencies and global contextual information. In addition, an OpenMax layer was introduced to compute the distribution of distances between a sample and the mean activation vector (MAV), which allowed the algorithm to distinguish known categories from unknown ones and improved recognition accuracy in open environments. Extensive experiments were conducted on public and custom datasets. The results showed that the improved algorithm achieved higher accuracy and better open set recognition capability than the original algorithm. Its practical application value was also validated in real-world scenarios.

Key words: action recognition, open environment, spatiotemporal graph convolutional network, skeleton sequence, deep learning
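
The MAV-distance idea mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: all function names and the toy data are hypothetical, and the Weibull tail fit used by the original OpenMax method is replaced here by a simple empirical fraction over the largest training distances, purely to keep the example dependency-free. Activation vectors are assumed to be non-negative class scores with one entry per known class.

```python
import math

def mean_activation_vectors(activations, labels, num_classes):
    """Average the activation vectors of each known class (one MAV per class)."""
    dim = len(activations[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for vec, label in zip(activations, labels):
        counts[label] += 1
        for i, value in enumerate(vec):
            sums[label][i] += value
    return [[s / counts[c] for s in sums[c]] for c in range(num_classes)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def tail_distances(activations, labels, mavs, tail_size):
    """Keep, per class, the largest training distances to that class's MAV."""
    per_class = [[] for _ in mavs]
    for vec, label in zip(activations, labels):
        per_class[label].append(euclidean(vec, mavs[label]))
    return [sorted(dists)[-tail_size:] for dists in per_class]

def outlier_weight(distance, tail):
    """Assumption: empirical stand-in for the Weibull CDF used by OpenMax.
    Returns the fraction of tail distances that are smaller than `distance`."""
    return sum(1 for t in tail if t < distance) / len(tail)

def open_set_predict(vec, mavs, tails, threshold=0.5):
    """Return (label, probs); label is -1 when the sample is treated as unknown.
    probs[0] is the unknown class, probs[c + 1] is known class c."""
    revised, unknown_score = [], 0.0
    for c, score in enumerate(vec):
        w = outlier_weight(euclidean(vec, mavs[c]), tails[c])
        revised.append(score * (1.0 - w))   # shrink scores far from the MAV
        unknown_score += score * w          # move the removed mass to "unknown"
    scores = [unknown_score] + revised
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    if best == 0 or probs[best] < threshold:
        return -1, probs
    return best - 1, probs

# Toy training activations for two known classes (hypothetical values).
train_x = [[5.0, 0.5], [6.0, 0.4], [5.5, 0.6], [5.2, 0.5],
           [0.5, 5.0], [0.4, 6.0], [0.6, 5.5], [0.5, 5.2]]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
mavs = mean_activation_vectors(train_x, train_y, num_classes=2)
tails = tail_distances(train_x, train_y, mavs, tail_size=3)

known_label, _ = open_set_predict([5.4, 0.5], mavs, tails)
unknown_label, _ = open_set_predict([3.0, 3.0], mavs, tails)
```

In this sketch a sample close to a class MAV keeps its score for that class, while a sample far from every MAV has its scores shifted to the extra "unknown" slot before the softmax. A faithful OpenMax layer would fit a Weibull distribution to the tail distances and revise only the top-ranked classes.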
