arXiv submission date: 2026-06-24
📄 Abstract - KidRisk: Benchmark Dataset for Children Dangerous Action Recognition

Children are naturally energetic, and during their spontaneous activities they often encounter potentially dangerous situations, especially when parental supervision is lacking. Identifying actions that pose risks plays a crucial role in ensuring their safety. This paper builds a novel and challenging dataset, KidRisk, comprising 2,500 short videos of children's actions and 10,000 images of children's dangerous actions. We also introduce a benchmark on our newly constructed dataset and find that traditional deep learning models demonstrate limited effectiveness on these tasks. Therefore, we develop vision-language-based baselines with exceptional contextual understanding of visual information. Our proposed methods achieve an accuracy of 83.53% in classifying children's actions and 96.14% in recognizing children's dangerous actions, significantly outperforming traditional approaches. These results confirm that vision-language models are not only feasible but also highly effective in detecting hazardous actions, contributing positively to safeguarding children.

Top-level tags: computer vision, video, benchmark
Detailed tags: action recognition, dangerous action, children safety, vision-language model, dataset

KidRisk: Benchmark Dataset for Children Dangerous Action Recognition


1️⃣ One-sentence summary

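The paper builds KidRisk, a dataset of children's dangerous actions comprising 2,500 short videos and 10,000 images, and proposes vision-language-model-based methods that reach 83.53% accuracy on children's action classification and 96.14% on dangerous action recognition, offering an efficient and feasible technical approach to child safety monitoring.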

Source: arXiv: 2606.25298