Learning Quantised Structure-Preserving Motion Representations for Dance Fingerprinting
1️⃣ One-Sentence Summary
This paper proposes DANCEMATCH, an end-to-end framework that converts the movements in dance videos into compact, interpretable "digital fingerprints", enabling fast retrieval of clips with similar motion from massive video collections.
We present DANCEMATCH, an end-to-end framework for motion-based dance retrieval — the task of identifying semantically similar choreographies directly from raw video, which we define as DANCE FINGERPRINTING. While existing motion analysis and retrieval methods can compare pose sequences, they rely on continuous embeddings that are difficult to index, interpret, or scale. In contrast, DANCEMATCH constructs compact, discrete motion signatures that capture the spatio-temporal structure of dance while enabling efficient large-scale retrieval. Our system integrates Skeleton Motion Quantisation (SMQ) with Spatio-Temporal Transformers (STT) to encode human poses, extracted via Apple CoMotion, into a structured motion vocabulary. We further design the DANCE RETRIEVAL ENGINE (DRE), which performs sub-linear retrieval using a histogram-based index followed by re-ranking for refined matching. To facilitate reproducible research, we release DANCETYPESBENCHMARK, a pose-aligned dataset annotated with quantised motion tokens. Experiments demonstrate robust retrieval across diverse dance styles and strong generalisation to unseen choreographies, establishing a foundation for scalable motion fingerprinting and quantitative choreographic analysis.
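The abstract's retrieval pipeline — a coarse histogram-based index over quantised motion tokens followed by a re-ranking stage — can be sketched in a few lines. The paper does not specify its scoring functions, so everything below is a hedged illustration: the vocabulary size, the cosine similarity over token histograms, and the longest-common-subsequence re-ranking score are all assumptions standing in for whatever DRE actually uses (and the linear scan here only mimics, not achieves, sub-linear lookup).

```python
import numpy as np

VOCAB_SIZE = 64  # assumed size of the quantised motion vocabulary


def motion_histogram(tokens: np.ndarray) -> np.ndarray:
    """L1-normalised bag-of-tokens histogram of a quantised motion sequence."""
    hist = np.bincount(tokens, minlength=VOCAB_SIZE).astype(float)
    return hist / max(hist.sum(), 1.0)


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two histograms (0.0 when either is empty)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def lcs_ratio(a: np.ndarray, b: np.ndarray) -> float:
    """Longest-common-subsequence length, normalised by mean sequence length.
    A stand-in temporal score for the unspecified re-ranking stage."""
    m, n = len(a), len(b)
    dp = np.zeros((m + 1, n + 1), dtype=int)
    for i in range(m):
        for j in range(n):
            dp[i + 1, j + 1] = (dp[i, j] + 1 if a[i] == b[j]
                                else max(dp[i, j + 1], dp[i + 1, j]))
    return 2.0 * dp[m, n] / (m + n)


def retrieve(query: np.ndarray, database: dict, shortlist: int = 3) -> list:
    """Coarse histogram match to build a shortlist, then temporal re-ranking."""
    qh = motion_histogram(query)
    coarse = sorted(database,
                    key=lambda k: -cosine(qh, motion_histogram(database[k])))
    return sorted(coarse[:shortlist], key=lambda k: -lcs_ratio(query, database[k]))
```

On a toy database of token sequences, a query sharing tokens with two entries shortlists both via the histogram stage, while the sequence-level score decides their final order — the same coarse-to-fine division of labour the abstract describes.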
Source: arXiv:2604.00927