2019, Vol. 25, Issue (8): 2000-2006. DOI: 10.13196/j.cims.2019.08.014
张小俊,李辰政,孙凌宇,张明路
Abstract: Video-based human behavior recognition involves very large video-stream data and a large number of 3D convolution kernel parameters, which lead to long training times and difficult parameter tuning. To address these problems, a neural network structure based on the 3D convolutional neural network was proposed in which the 3D convolution kernel is split into two kinds of kernels, one for the spatial domain and one for the temporal domain. The two kinds of kernels form two data streams that interact with each other, and a residual network was introduced to optimize the network structure and reduce the number of parameters to be set. The method was trained and validated on two behavior recognition datasets, KTH and UCF101, achieving recognition accuracies of 96.2% and 90.7% respectively. The results showed that, compared with the network framework before the improvement, the proposed method increased training speed by 7.5%~7.8% while maintaining recognition accuracy. The method can therefore effectively reduce the hardware requirements of deep learning for behavior recognition and improve model training efficiency, and it can be widely applied in the field of intelligent robots.
Key words: behavior recognition, 3D convolutional neural network, residual network, dual data stream, deep learning theory
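To make the parameter-reduction idea concrete, the following minimal sketch compares the weight count of one full 3D convolution layer with that of a spatial-only kernel plus a temporal-only kernel. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give kernel sizes or channel widths, so the 3×3×3 kernel, the 64 channels, the parallel two-branch layout, and the function names `full_3d_params` and `split_params` are all hypothetical.

```python
def conv_params(c_in, c_out, kt, kh, kw, bias=True):
    """Number of weights (plus optional biases) in one 3D convolution layer."""
    return c_in * c_out * kt * kh * kw + (c_out if bias else 0)


def full_3d_params(c_in, c_out, k, bias=True):
    """A single k x k x k kernel covering time, height and width together."""
    return conv_params(c_in, c_out, k, k, k, bias)


def split_params(c_in, c_out, k, bias=True):
    """Two parallel branches (an assumed layout, not taken from the paper):
    a spatial branch with 1 x k x k kernels and a temporal branch with
    k x 1 x 1 kernels."""
    spatial = conv_params(c_in, c_out, 1, k, k, bias)
    temporal = conv_params(c_in, c_out, k, 1, 1, bias)
    return spatial + temporal


if __name__ == "__main__":
    # Hypothetical layer size chosen only for illustration.
    c_in, c_out, k = 64, 64, 3
    full = full_3d_params(c_in, c_out, k)
    split = split_params(c_in, c_out, k)
    print(f"full 3D kernel : {full} parameters")
    print(f"split kernels  : {split} parameters")
    print(f"ratio          : {split / full:.3f}")
```

The counts here are illustrative only and are not results from the paper. A lower parameter count also does not translate directly into the reported 7.5%~7.8% training speed-up, which depends on the whole network and the hardware used.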
CLC Number: TP242.6
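张小俊, 李辰政, 孙凌宇, 张明路. Behavior recognition based on improved 3D convolutional neural network [J]. Computer Integrated Manufacturing Systems (计算机集成制造系统), 2019, 25(8): 2000-2006.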
URL: http://www.cims-journal.cn/EN/10.13196/j.cims.2019.08.014
http://www.cims-journal.cn/EN/Y2019/V25/I第8/2000