Hallucination in World Models is Predictable and Preventable

📄 Abstract - Hallucination in World Models is Predictable and Preventable

Modern generative world models render increasingly realistic action-controllable futures, yet they frequently hallucinate: rollouts remain visually fluent while drifting from the ground-truth dynamics. We hypothesize that hallucination concentrates in low-coverage regions of the state-action space, where lightweight data-centric signals can both detect it and guide mitigation. To test this, we introduce MMBench2, a 427-hour, 210-task dataset for visual world modeling with ground-truth actions, rewards, and live simulators, and train a 350M-parameter world model on it. We identify three distinct hallucination modes (perceptual, action-marginalized, and scene-diverging), each anchored to a different stage of the pipeline, and develop three signals that accurately predict where the model will fail. To close coverage gaps at training time, we develop a coverage-aware sampling technique; to close them online, our hallucination predictors serve as curiosity rewards for targeted data collection, yielding a data-efficient finetuning recipe that adapts the pretrained world model to entirely unseen environments with as few as 50 real environment trajectories. Overall, our findings reveal that hallucination in world models is inherently a data coverage issue, and that the same signals used to detect it can also be used for mitigation. An interactive web version of our paper is available at this https URL

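1️⃣ One-Sentence Summary

This paper finds that "hallucinations" in generative world models (generated futures that look realistic but deviate from the true dynamics) arise mainly in regions of the state-action space that the training data covers poorly. It proposes three data-coverage-based signals that accurately predict where the model will fail, and uses those signals to guide both the sampling strategy and data collection, so that a small number of real trajectories is enough to efficiently repair the model's hallucinations in entirely new environments.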
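
The abstract names a coverage-aware sampling technique but does not describe how it works. Purely as an illustration of the general idea, and not the authors' method, the sketch below uses inverse-frequency weighting: each training transition is assumed to carry a discrete state-action bin label (a hypothetical stand-in for whatever coverage measure the paper actually uses), and transitions from rarely visited bins are sampled more often. The function names `coverage_weights` and `sample_indices` are invented for this example.

```python
import random
from collections import Counter


def coverage_weights(bins):
    """Return one normalized sampling weight per transition.

    `bins` is a list of hashable labels, one per transition, naming the
    (hypothetical) state-action bin that transition falls into. Each
    weight is inversely proportional to the size of its bin, so every
    bin receives the same total probability mass.
    """
    counts = Counter(bins)
    raw = [1.0 / counts[b] for b in bins]
    total = sum(raw)
    return [w / total for w in raw]


def sample_indices(bins, k, seed=0):
    """Draw k transition indices with replacement, favoring rare bins."""
    rng = random.Random(seed)
    weights = coverage_weights(bins)
    return rng.choices(range(len(bins)), weights=weights, k=k)


# Toy data: bin "a" is well covered (8 transitions), bin "b" is not (2).
bins = ["a"] * 8 + ["b"] * 2
weights = coverage_weights(bins)
batch = sample_indices(bins, k=5)
```

With this weighting, the two bins in the toy data receive equal total probability mass even though one holds four times as many transitions. A real world-model pipeline would need a coverage estimate over continuous observations and actions, which this sketch does not attempt.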
