arXiv submission date: 2026-04-20
📄 Abstract - View-Consistent 3D Scene Editing via Dual-Path Structural Correspondence and Semantic Continuity

Text-driven 3D scene editing has recently attracted increasing attention. Most existing methods follow a render-edit-optimize pipeline, in which multi-view images are rendered from a 3D scene, edited with 2D image editors, and then used to optimize the underlying 3D representation. However, cross-view inconsistency remains a major bottleneck. Although recent methods introduce geometric cues, cross-view interactions, or video priors to mitigate this issue, they still largely rely on inference-time synchronization and thus remain limited in robustness. In this work, we recast multi-view consistent 3D editing from a distributional perspective: 3D scene editing essentially requires joint distribution modeling across views. Based on this insight, we propose a view-consistent 3D editing framework that explicitly introduces cross-view dependencies into the editing process. Motivated by the observation that structural correspondence and semantic continuity rely on different cross-view cues, we introduce a dual-path consistency mechanism consisting of projection-guided structural guidance and patch-level semantic propagation for effective cross-view editing. In addition, we construct a paired multi-view editing dataset that provides reliable supervision for learning cross-view consistency in edited scenes. Extensive experiments demonstrate that our method achieves superior editing performance with precise and consistent views for complex scenes.
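The render-edit-optimize pipeline the abstract describes, and the point where cross-view inconsistency enters it, can be sketched in a few lines. All function names and string placeholders below are hypothetical stand-ins, not the paper's actual API; the sketch only illustrates the loop structure.

```python
# Minimal sketch of a render-edit-optimize pipeline (illustrative only).
# Strings stand in for images and 3D representations.

def render_views(scene, cameras):
    # Render one 2D image per camera pose from the 3D scene.
    return [f"render({scene}, cam={c})" for c in cameras]

def edit_2d(image, prompt):
    # Apply a text-driven 2D editor to a single rendered view.
    return f"edit({image}, '{prompt}')"

def optimize_scene(scene, edited_views):
    # Fit the 3D representation to the edited multi-view images.
    return f"optimize({scene}, {len(edited_views)} views)"

def render_edit_optimize(scene, cameras, prompt, steps=3):
    for _ in range(steps):
        views = render_views(scene, cameras)
        # Each view is edited independently here, which is exactly where
        # cross-view inconsistency arises: the 2D editor sees no other view.
        edited = [edit_2d(v, prompt) for v in views]
        scene = optimize_scene(scene, edited)
    return scene
```

The paper's contribution, as summarized above, targets the middle step: instead of editing each view independently and hoping inference-time synchronization fixes the mismatch, it models the edited views jointly with explicit cross-view dependencies.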

Top-level tags: computer vision, multi-modal, model training
Detailed tags: 3d scene editing, view consistency, multi-view synthesis, text-driven editing, structural correspondence

View-Consistent 3D Scene Editing via Dual-Path Structural Correspondence and Semantic Continuity


1️⃣ One-Sentence Summary

This paper proposes a new 3D scene editing method that introduces a dual-path consistency mechanism to keep the edited scene structurally and semantically consistent when viewed from different angles, addressing the core multi-view inconsistency problem of existing methods.

Source: arXiv:2604.17801