arXiv submission date: 2026-06-09
📄 Abstract - ManiSplat: Manipulation Trajectory Synthesis from Monocular Video via Decoupled 3D Gaussian Splatting

Reconstructing dynamic and interactive 3D scenes from real-world observations remains a fundamental challenge in computer vision and robotics. While recent advances in 3D Gaussian Splatting have enabled high-fidelity static reconstruction, extending it to interactive environments with articulated robots and manipulable objects remains difficult due to complex contact interactions and abrupt pose changes. To address these challenges, we introduce ManiSplat, a unified framework that reconstructs controllable and decoupled Gaussian digital twins directly from monocular ego-view robotic videos. Our method introduces a Graph-Structured Disentangled Representation that separates the robot, objects, and background into independently optimizable Gaussian subfields organized within a scene graph. To ensure stability, we propose a Task-Oriented Spatio-Temporal Alignment module that leverages the inherent logic of manipulation tasks, which alternate between Motion and Skill phases, to construct accurate pseudo-ground-truth trajectories. Finally, a joint photometric-geometric optimization ensures the reconstructed scenes are temporally coherent, physically consistent, and simulation-ready. Extensive experiments demonstrate that our approach reconstructs interaction-driven dynamic scenes with high fidelity and controllability, effectively supporting downstream robotic tasks and policy learning.
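
To make the idea of decoupled subfields in a scene graph more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the names `GaussianSubfield`, `SceneGraph`, and `build_demo_scene` are hypothetical, each Gaussian is reduced to its center, and node poses are translation-only for brevity. It only illustrates that moving one node (the robot) leaves sibling nodes (an object, the background) untouched.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class GaussianSubfield:
    """Hypothetical scene-graph node: Gaussian centers plus a local pose."""
    name: str
    centers: List[Vec3]                   # Gaussian means in the node's local frame
    translation: Vec3 = (0.0, 0.0, 0.0)   # pose relative to parent (translation only, an assumption)
    parent: Optional[str] = None          # name of the parent node, None for the root


@dataclass
class SceneGraph:
    """Hypothetical container mapping node names to subfields."""
    nodes: Dict[str, GaussianSubfield] = field(default_factory=dict)

    def add(self, node: GaussianSubfield) -> None:
        self.nodes[node.name] = node

    def world_offset(self, name: str) -> Vec3:
        # Accumulate translations while walking from the node up to the root.
        offset: Vec3 = (0.0, 0.0, 0.0)
        current: Optional[str] = name
        while current is not None:
            node = self.nodes[current]
            offset = tuple(o + t for o, t in zip(offset, node.translation))
            current = node.parent
        return offset

    def world_centers(self, name: str) -> List[Vec3]:
        # Express one subfield's Gaussian centers in the world frame.
        off = self.world_offset(name)
        return [tuple(c + o for c, o in zip(center, off))
                for center in self.nodes[name].centers]


def build_demo_scene() -> SceneGraph:
    # Toy scene: the background is the root; robot and cup are independent children.
    graph = SceneGraph()
    graph.add(GaussianSubfield("background", [(0.0, 0.0, 0.0)]))
    graph.add(GaussianSubfield("robot", [(0.0, 0.0, 0.5)], parent="background"))
    graph.add(GaussianSubfield("cup", [(1.0, 0.0, 0.0)], parent="background"))
    return graph


if __name__ == "__main__":
    scene = build_demo_scene()
    scene.nodes["robot"].translation = (1.0, 0.0, 0.0)  # move only the robot
    print(scene.world_centers("robot"))
    print(scene.world_centers("cup"))
```

In a real system each node would carry full Gaussian parameters (covariance, opacity, color) and a full rigid or articulated pose; the abstract does not specify these details, so they are omitted here.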

Top-level tags: computer vision, robotics, 3D Gaussian splatting
Detailed tags: dynamic scene reconstruction, manipulation trajectory synthesis, disentangled representation, monocular video, robotic interaction

ManiSplat: Manipulation Trajectory Synthesis from Monocular Video via Decoupled 3D Gaussian Splatting


1️⃣ One-sentence summary

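This paper proposes ManiSplat, a framework that takes a monocular video captured from a single robot's viewpoint, automatically separates the robot, objects, and background in the scene, and reconstructs them as interactive 3D digital models. From these models it synthesizes accurate manipulation trajectories, supporting simulation and robot policy learning.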

Source: arXiv: 2606.10645