Abstract - EvoDriveVLA: Evolving Autonomous Driving Vision-Language-Action Model via Collaborative Perception-Planning Distillation
Vision-Language-Action models have shown great promise for autonomous driving, yet they suffer from degraded perception after unfreezing the visual encoder and struggle with accumulated instability in long-term planning. To address these challenges, we propose EvoDriveVLA, a novel collaborative perception-planning distillation framework that integrates self-anchored perceptual constraints and oracle-guided trajectory optimization. Specifically, self-anchored visual distillation leverages a self-anchored teacher to deliver visual anchoring constraints, regularizing student representations via trajectory-guided key-region awareness. In parallel, oracle-guided trajectory distillation employs a future-aware oracle teacher with coarse-to-fine trajectory refinement and Monte Carlo dropout sampling to produce high-quality trajectory candidates, thereby selecting the optimal trajectory to guide the student's prediction. EvoDriveVLA achieves state-of-the-art performance in open-loop evaluation and significantly enhances performance in closed-loop evaluation. Our code is available at: this https URL.
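To make the Monte Carlo dropout sampling idea concrete, here is a minimal toy sketch in NumPy: dropout is kept active at inference so that repeated forward passes yield a set of trajectory candidates, from which one is selected. The linear trajectory head, the feature dimensions, and the consensus-based selection rule are all illustrative assumptions, not the paper's actual architecture or oracle criterion.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_trajectory(features, w, drop_rate=0.2):
    """Toy trajectory head: a linear map with (MC) dropout on the features.
    All shapes and names here are illustrative assumptions."""
    mask = rng.random(features.shape) > drop_rate
    h = features * mask / (1.0 - drop_rate)  # inverted dropout scaling
    return h @ w  # flat vector of future waypoints

# Hypothetical setup: 8 scene features, 6 future waypoints in (x, y).
features = rng.normal(size=8)
w = rng.normal(size=(8, 12))

# Monte Carlo dropout: keep dropout on at inference, sample K candidates.
K = 16
candidates = np.stack(
    [predict_trajectory(features, w).reshape(6, 2) for _ in range(K)]
)

# One simple selection rule (an assumption standing in for the oracle):
# pick the candidate closest to the ensemble mean, i.e. the most
# consensual trajectory among the sampled set.
mean_traj = candidates.mean(axis=0)
scores = ((candidates - mean_traj) ** 2).sum(axis=(1, 2))
best = candidates[scores.argmin()]
print(best.shape)  # prints (6, 2)
```

In the actual framework, the oracle teacher would score candidates with its future-aware, coarse-to-fine refinement rather than a simple consensus distance; the sketch only shows where the dropout sampling and the candidate selection plug in.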
EvoDriveVLA: Evolving Autonomous Driving Vision-Language-Action Model via Collaborative Perception-Planning Distillation
1️⃣ One-Sentence Summary
This paper proposes a new method named EvoDriveVLA that jointly trains an autonomous driving model by combining "self-anchored" visual constraints with "oracle-guided" trajectory optimization, effectively addressing the degraded perception and unstable decision-making that arise in long-term planning, and thereby significantly improving the performance of the autonomous driving system.