Local Spatiotemporal Convolutional Network for Robust Gait Recognition

📄 Abstract - Local Spatiotemporal Convolutional Network for Robust Gait Recognition

Gait recognition, as a promising biometric technology, identifies individuals through their unique walking patterns and offers distinctive advantages including non-invasiveness, long-range applicability, and resistance to deliberate disguise. Despite these merits, capturing the intrinsic motion patterns concealed within consecutive video frames remains challenging due to the complexity of video data and the interference of external covariates such as viewpoint changes, clothing variations, and carrying conditions. Existing approaches predominantly rely on either static appearance features extracted from individual silhouette frames or employ complex sequential models (\eg, LSTM, 3D convolutions) that demand substantial computational resources and sophisticated training strategies. To address these limitations, we propose a Local Spatiotemporal Convolutional Network (LSTCN), a structurally simple yet highly effective dual-branch architecture that endows standard two-dimensional convolutional networks with the capacity to extract temporal information. Specifically, we introduce a Global Bidirectional Spatial Pooling (GBSP) mechanism that reduces the dimensionality of gait tensors by decomposing spatial features into horizontal and vertical strip-based local representations, enabling the temporal dimension to participate in standard 2D convolution operations. Building upon this, we design a Local Spatiotemporal Convolutional (LSTC) layer that jointly processes temporal and spatial dimensions, allowing the network to adaptively learn strip-based gait motion patterns. We further extend this formulation with asymmetric convolution kernels that independently attend to the temporal, spatial, and joint spatiotemporal domains, thereby enriching the extracted feature representations.

局部时空卷积网络：实现鲁棒的步态识别 / Local Spatiotemporal Convolutional Network for Robust Gait Recognition

1️⃣ 一句话总结

本文提出一种结构简单但高效的局部时空卷积网络（LSTCN），通过将步态视频中的时空特征分解为水平和垂直条带，并设计专门的局部时空卷积层，使普通二维卷积网络也能有效学习到时间维度的运动模式，从而在复杂场景下实现更鲁棒、计算成本更低的步态识别。

← 返回列表

菜单

AI 帮我研读全文

1️⃣ 一句话总结

密码管理

设置密码

修改密码

移除密码

菜单

AI 帮我研读全文

1️⃣ 一句话总结

获取最新论文摘要