Abstract - JFAA: Technical Report for the EPIC-KITCHENS-100 Action Anticipation Challenge at EgoVis 2026
We propose JFAA, a JEPA-based Future Action Anticipation method for the EPIC-KITCHENS-100 (EK-100) Action Anticipation task. Inspired by the representation learning and future prediction ability of V-JEPA 2.1, JFAA uses a frozen encoder and predictor to extract observed context features and near-future latent tokens. A lightweight attentive probe is then trained to predict verb, noun, and action logits with separate task queries. To improve robustness, we further build a field-aware ensemble over selected epoch-level predictions, allowing each output field to benefit from its most reliable candidates. Experimental results on the official challenge server show that JFAA achieves first place in the EgoVis 2026 EK-100 Action Anticipation Challenge. Our code will be released.
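The field-aware ensemble mentioned in the abstract can be illustrated with a small, dependency-free sketch. The abstract does not say how candidates are selected or fused, so everything specific here is an assumption made for illustration: the per-field validation scores (`val_scores`), the `top_k` selection rule, and the plain averaging of softmax probabilities are hypothetical choices, not the authors' confirmed procedure.

```python
import math

FIELDS = ("verb", "noun", "action")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def field_aware_ensemble(epoch_logits, val_scores, top_k=2):
    """Fuse epoch-level predictions separately for each output field.

    epoch_logits: {epoch: {field: [logit per class]}}
    val_scores:   {epoch: {field: validation score}}  (hypothetical input)
    top_k:        number of epochs kept per field     (hypothetical rule)
    """
    fused = {}
    for field in FIELDS:
        # Rank epochs independently for this field, so the verb head and
        # the noun head may end up using different checkpoints.
        ranked = sorted(epoch_logits,
                        key=lambda e: val_scores[e][field],
                        reverse=True)
        chosen = ranked[:top_k]
        probs = [softmax(epoch_logits[e][field]) for e in chosen]
        n_classes = len(probs[0])
        # Assumed fusion rule: unweighted mean of class probabilities.
        fused[field] = [sum(p[c] for p in probs) / len(probs)
                        for c in range(n_classes)]
    return fused


# Toy predictions for one clip from three epochs (made-up numbers).
epoch_logits = {
    1: {"verb": [2.0, 0.0], "noun": [0.0, 1.0], "action": [0.0, 0.0, 3.0]},
    2: {"verb": [0.0, 2.0], "noun": [0.0, 3.0], "action": [1.0, 0.0, 0.0]},
    3: {"verb": [1.0, 0.0], "noun": [2.0, 0.0], "action": [0.0, 0.0, 1.0]},
}
val_scores = {
    1: {"verb": 0.9, "noun": 0.1, "action": 0.8},
    2: {"verb": 0.2, "noun": 0.9, "action": 0.1},
    3: {"verb": 0.8, "noun": 0.3, "action": 0.7},
}

fused = field_aware_ensemble(epoch_logits, val_scores, top_k=2)
best = {f: max(range(len(p)), key=p.__getitem__) for f, p in fused.items()}
```

In this toy setup the verb and action fields draw on epochs 1 and 3, while the noun field draws on epochs 2 and 3, which is the sense in which each field uses its own most reliable candidates.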
JFAA: Technical Report for the EPIC-KITCHENS-100 Action Anticipation Challenge at EgoVis 2026
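1️⃣ One-Sentence Summary
This paper proposes JFAA, a lightweight action anticipation method built on the V-JEPA model: it extracts video features with a frozen pretrained encoder, uses an attention mechanism to predict the verb, noun, and overall action labels separately, and takes first place in the EPIC-KITCHENS-100 Action Anticipation Challenge.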
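To make the idea of an attention-based probe with separate task queries more concrete, here is a minimal forward-pass sketch in NumPy. It is not the authors' implementation: the single-head cross-attention layout, the shared key/value projections, the feature dimension, the token counts, and the random untrained weights are all assumptions for illustration, and the random arrays merely stand in for the frozen encoder and predictor outputs. The class counts (97 verbs, 300 nouns, 3806 actions) are the ones commonly used for EK-100 and are not stated in the summary itself.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class AttentiveProbe:
    """Single-head cross-attention probe with one query per task (sketch)."""

    def __init__(self, dim, num_classes, seed=0):
        rng = np.random.default_rng(seed)
        self.tasks = list(num_classes)
        # One query vector per task (learnable in practice, random here).
        self.queries = {t: rng.normal(scale=0.02, size=dim)
                        for t in self.tasks}
        # Key/value projections shared across tasks (an assumption).
        self.w_k = rng.normal(scale=dim ** -0.5, size=(dim, dim))
        self.w_v = rng.normal(scale=dim ** -0.5, size=(dim, dim))
        # One linear classification head per task.
        self.heads = {t: rng.normal(scale=0.02, size=(dim, n))
                      for t, n in num_classes.items()}

    def __call__(self, context_tokens, future_tokens):
        # Concatenate observed-context and predicted-future tokens: (N, D).
        tokens = np.concatenate([context_tokens, future_tokens], axis=0)
        keys = tokens @ self.w_k
        values = tokens @ self.w_v
        logits = {}
        for t in self.tasks:
            # Each task query attends over all tokens: weights of shape (N,).
            attn = softmax(self.queries[t] @ keys.T / np.sqrt(keys.shape[-1]))
            pooled = attn @ values               # task-specific summary, (D,)
            logits[t] = pooled @ self.heads[t]   # class logits, (num_classes,)
        return logits


# Stand-ins for frozen encoder / predictor outputs (random placeholders).
rng = np.random.default_rng(1)
context_tokens = rng.normal(size=(16, 64))  # observed context features
future_tokens = rng.normal(size=(4, 64))    # near-future latent tokens

probe = AttentiveProbe(dim=64,
                       num_classes={"verb": 97, "noun": 300, "action": 3806})
logits = probe(context_tokens, future_tokens)
```

Because only the queries, projections, and heads would be trained in such a setup, the probe stays small relative to the frozen backbone, which matches the "lightweight" description in the summary.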