arXiv submission date: 2026-05-11
📄 Abstract - Is Your Driving World Model an All-Around Player?

Today's driving world models can generate remarkably realistic dash-cam videos, yet no single model excels universally. Some generate photorealistic textures but violate basic physics; others maintain geometric consistency but fail when subjected to closed-loop planning. This disconnect exposes a critical gap: the field evaluates how real generated worlds appear, but rarely whether they behave realistically. We introduce WorldLens, a unified benchmark that measures world-model fidelity across the full spectrum, from pixel quality and 4D geometry to closed-loop driving and human perceptual alignment, through five complementary aspects and 24 standardized dimensions. Our evaluation of six representative models reveals that no existing approach dominates across all axes: texture-rich models violate geometry, geometry-aware models lack behavioral fidelity, and even the strongest performers achieve only 2-3 out of 10 on human realism ratings. To bridge algorithmic metrics with human perception, we further contribute WorldLens-26K, a 26,808-entry human-annotated preference dataset pairing numerical scores with textual rationales, and WorldLens-Agent, a vision-language evaluator distilled from these judgments that enables scalable, explainable auto-assessment. Together, the benchmark, dataset, and agent form a unified ecosystem for assessing generated worlds not merely by visual appeal, but by physical and behavioral fidelity.

Top-level tags: computer vision benchmark
Detailed tags: world model driving simulation evaluation human perception 4d geometry

Is Your Driving World Model an All-Around Player?


1️⃣ One-Sentence Summary

This paper points out that although today's driving world models can generate realistic dash-cam videos, each one is lopsided: some produce high-quality imagery yet violate physical laws, while others are geometrically accurate but unusable for closed-loop planning. The authors therefore propose WorldLens, a unified benchmark that evaluates models comprehensively across dimensions ranging from pixel quality and 4D geometry to closed-loop driving performance and human perceptual judgment. It is accompanied by a dataset of roughly 26K human preference annotations and an explainable automated evaluation AI, helping researchers assess not merely whether a video looks good, but whether the world the model simulates is genuinely believable.

Source: arXiv 2605.10858