arXiv submission date: 2026-05-20

📄 Abstract - JFAA: Technical Report for the EPIC-KITCHENS-100 Action Anticipation Challenge at EgoVis 2026

We propose JFAA, a JEPA-based Future Action Anticipation method for the EPIC-KITCHENS-100 (EK-100) Action Anticipation task. Inspired by the representation learning and future prediction ability of V-JEPA 2.1, JFAA uses a frozen encoder and predictor to extract observed context features and near-future latent tokens. A lightweight attentive probe is then trained to predict verb, noun, and action logits with separate task queries. To improve robustness, we further build a field-aware ensemble over selected epoch-level predictions, allowing each output field to benefit from its most reliable candidates. Experimental results on the official challenge server show that JFAA achieves first place in the EgoVis 2026 EK-100 Action Anticipation Challenge. Our code will be released at this https URL.
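The field-aware ensemble mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how candidates are selected or fused, so everything below is an assumption: the function name `field_aware_ensemble`, the toy epoch names, the per-field selection lists, and the choice to average softmax probabilities are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def field_aware_ensemble(candidates, selection):
    """Fuse epoch-level predictions separately for each output field.

    candidates: {checkpoint_name: {field: logits}}
    selection:  {field: [checkpoint_name, ...]} chosen per field
    Assumption: fusion is a plain average of softmax probabilities.
    """
    fused = {}
    for field, names in selection.items():
        probs = [softmax(candidates[name][field]) for name in names]
        n_classes = len(probs[0])
        fused[field] = [
            sum(p[c] for p in probs) / len(probs) for c in range(n_classes)
        ]
    return fused


# Hypothetical toy predictions from three epochs (not real challenge data).
candidates = {
    "epoch_3": {"verb": [2.0, 0.5, 0.1], "noun": [0.2, 1.5]},
    "epoch_5": {"verb": [1.0, 1.2, 0.0], "noun": [0.1, 2.0]},
    "epoch_7": {"verb": [0.3, 2.5, 0.2], "noun": [1.8, 0.4]},
}
# Each field uses its own subset of candidates (hypothetical choice).
selection = {"verb": ["epoch_3", "epoch_5"], "noun": ["epoch_5", "epoch_7"]}

fused = field_aware_ensemble(candidates, selection)
```

The point of the sketch is only that the verb and noun fields draw on different checkpoints; how the report ranks candidates for reliability is not stated in the abstract.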

Top-level tags: computer vision video model training

JFAA: Technical Report for the EPIC-KITCHENS-100 Action Anticipation Challenge at EgoVis 2026

1️⃣ One-sentence summary

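This paper proposes JFAA, a lightweight action anticipation method built on V-JEPA 2.1. It extracts video features with a frozen pretrained encoder and predictor, uses an attentive probe with separate task queries to predict verb, noun, and overall action labels, and takes first place in the EgoVis 2026 EPIC-KITCHENS-100 Action Anticipation Challenge.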
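As a rough illustration of predicting verb, noun, and action with separate attention queries over frozen features, here is a minimal sketch. It is not the authors' implementation: it assumes single-head dot-product attention pooling, uses random weights in place of a trained probe, and the names `attend`, `probe`, `DIM`, and `NUM_CLASSES` as well as the toy class counts are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, tokens):
    """Pool a token sequence with one query via scaled dot-product attention."""
    d = len(query)
    scores = [
        sum(q * t for q, t in zip(query, tok)) / math.sqrt(d) for tok in tokens
    ]
    weights = softmax(scores)
    return [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(d)]


def linear(x, weight):
    """Apply a bias-free linear layer; weight has one row per output class."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def probe(tokens, queries, heads):
    """Each task has its own query and its own classification head."""
    return {
        task: linear(attend(queries[task], tokens), heads[task])
        for task in queries
    }


random.seed(0)
DIM = 4
# Toy class counts for illustration only, not the EK-100 label space.
NUM_CLASSES = {"verb": 3, "noun": 5, "action": 7}


def rand_vec(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]


# Stand-in for context features and near-future latent tokens
# that a frozen encoder and predictor would produce.
tokens = [rand_vec(DIM) for _ in range(6)]
queries = {task: rand_vec(DIM) for task in NUM_CLASSES}
heads = {task: [rand_vec(DIM) for _ in range(k)] for task, k in NUM_CLASSES.items()}

logits = probe(tokens, queries, heads)
```

In a trained probe only `queries` and `heads` would be learned, while the token features stay fixed, which is what keeps the approach lightweight.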

Source: arXiv: 2605.20904
