arXiv submission date: 2026-04-30
📄 Abstract - World2Minecraft: Occupancy-Driven Simulated Scenes Construction

Embodied intelligence requires high-fidelity simulation environments to support perception and decision-making, yet existing platforms often suffer from data contamination and limited flexibility. To mitigate this, we propose World2Minecraft, which converts real-world scenes into structured Minecraft environments based on 3D semantic occupancy prediction. In the reconstructed scenes, downstream tasks such as Vision-Language Navigation (VLN) can be performed with little extra effort. However, we observe that reconstruction quality depends heavily on accurate occupancy prediction, which remains limited by data scarcity and the poor generalization of existing models. We therefore introduce a low-cost, automated, and scalable data acquisition pipeline for creating customized occupancy datasets, and demonstrate its effectiveness through MinecraftOcc, a large-scale dataset featuring 100,165 images from 156 richly detailed indoor scenes. Extensive experiments show that our dataset provides a critical complement to existing datasets and poses a significant challenge to current SOTA methods. These findings contribute to improving occupancy prediction and highlight the value of World2Minecraft as a customizable and editable platform for personalized embodied AI research. Project page: this https URL.
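To make the core idea concrete, the following is a minimal sketch of how a semantic occupancy grid could be turned into Minecraft block placements. It is not the authors' pipeline: the class-to-block table, the `occupancy_to_setblock` helper, the `grid[x][y][z]` layout, and the world origin are all assumptions made for illustration. Only the `setblock` command syntax and the block identifiers come from Minecraft itself.

```python
# Hypothetical mapping from semantic class IDs to Minecraft block IDs.
# The class IDs and their meanings are assumptions, not taken from the paper.
CLASS_TO_BLOCK = {
    1: "minecraft:stone",        # assumed: wall
    2: "minecraft:oak_planks",   # assumed: floor
    3: "minecraft:white_wool",   # assumed: furniture
}
EMPTY = 0  # assumed ID for free space


def occupancy_to_setblock(grid, origin=(0, 64, 0)):
    """Convert a voxel grid of class IDs into `setblock` command strings.

    grid[x][y][z] holds one semantic class ID per voxel, with y as the
    vertical axis (matching Minecraft's convention). `origin` is the world
    coordinate of voxel (0, 0, 0); it is an arbitrary illustrative choice.
    """
    ox, oy, oz = origin
    commands = []
    for x, plane in enumerate(grid):
        for y, column in enumerate(plane):
            for z, cls in enumerate(column):
                if cls == EMPTY:
                    continue  # nothing to place in free space
                block = CLASS_TO_BLOCK.get(cls)
                if block is None:
                    continue  # skip classes with no assigned block
                commands.append(
                    f"setblock {ox + x} {oy + y} {oz + z} {block}"
                )
    return commands


# Usage: a tiny 2 x 1 x 2 grid with one empty voxel.
example_grid = [[[2, 0]], [[1, 3]]]
for command in occupancy_to_setblock(example_grid):
    print(command)
```

One voxel maps to one block here, so the predicted occupancy resolution directly sets the granularity of the reconstructed scene. This is one way to see why reconstruction quality tracks the accuracy of the occupancy prediction.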

Top-level tags: computer vision, agents, data
Detailed tags: embodied AI, 3D occupancy prediction, scene reconstruction, simulation environment, dataset generation

World2Minecraft: Occupancy-Driven Simulated Scenes Construction


1️⃣ One-sentence summary

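This paper proposes World2Minecraft, a method that automatically converts real-world scenes into structured 3D environments in Minecraft, giving embodied AI research (for example, Vision-Language Navigation) a low-cost, customizable, and easily editable high-fidelity simulation platform. To improve the accuracy of scene reconstruction, the authors also build MinecraftOcc, a large-scale dataset for 3D occupancy prediction.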

Source: arXiv: 2604.27578