JEDI: Joint Embedding Diffusion World Model for Online Model-Based Reinforcement Learning

📄 Abstract

Diffusion world models have recently become competitive for online model-based reinforcement learning (MBRL), but current approaches expose a tension: pixel diffusion is effective but computationally expensive, while the latest latent diffusion approach improves efficiency yet underperforms. The latter also relies on separately trained latents rather than the end-to-end world-model objectives that have driven much of modern MBRL progress. In particular, JEPA-style predictive representation learning has emerged as an especially promising direction for world modeling and MBRL. Concurrently, diffusion-style objectives have gained traction across multiple domains, with iterative refinement as a promising approach for multimodal and stochastic targets. Taken together, these trends motivate Joint Embedding DIffusion (JEDI), the first online end-to-end latent diffusion world model. JEDI learns its latent space directly from the diffusion denoising loss within a JEPA framework, using denoising to learn and predict future latents rather than relying on reconstruction and pretrained models. We provide a theoretical motivation showing that conventional JEPA objectives induce a predictive information bottleneck, and that conditional diffusion denoising admits a closely related predictive-compression decomposition. Empirically, JEDI is competitive on Atari100k and outperforms the baseline with separately trained latents where directly comparable. Relative to the pixel diffusion baseline, JEDI uses 43% less VRAM, samples from the world model over 3$\times$ faster, and trains 2.5$\times$ faster. JEDI also exhibits a markedly different task-level performance profile from the pixel baseline, suggesting that end-to-end predictive latents change more than compute alone.


1️⃣ One-sentence summary

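This paper proposes JEDI, an end-to-end trained latent diffusion world model that combines the diffusion denoising loss with the JEPA predictive representation learning framework; in online reinforcement learning it substantially reduces compute cost (43% less VRAM, over 3$\times$ faster sampling) while achieving performance on Atari100k tasks comparable to, or even better than, pixel-level diffusion models.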
