arXiv submission date: 2026-03-03
📄 Abstract - SemGS: Feed-Forward Semantic 3D Gaussian Splatting from Sparse Views for Generalizable Scene Understanding

Semantic understanding of 3D scenes is essential for robots to operate effectively and safely in complex environments. Existing methods for semantic scene reconstruction and semantic-aware novel view synthesis often rely on dense multi-view inputs and require scene-specific optimization, limiting their practicality and scalability in real-world applications. To address these challenges, we propose SemGS, a feed-forward framework for reconstructing generalizable semantic fields from sparse image inputs. SemGS uses a dual-branch architecture to extract color and semantic features, where the two branches share shallow CNN layers, allowing semantic reasoning to leverage textural and structural cues in color appearance. We also incorporate a camera-aware attention mechanism into the feature extractor to explicitly model geometric relationships between camera viewpoints. The extracted features are decoded into dual-Gaussians that share geometric consistency while preserving branch-specific attributes, and further rasterized to synthesize semantic maps under novel viewpoints. Additionally, we introduce a regional smoothness loss to enhance semantic coherence. Experiments show that SemGS achieves state-of-the-art performance on benchmark datasets, while providing rapid inference and strong generalization capabilities across diverse synthetic and real-world scenarios.
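The abstract mentions a regional smoothness loss that encourages semantic coherence but gives no formula. A minimal sketch of one plausible form, assuming regions are supplied as an integer label map (e.g. superpixels) and the loss penalizes per-pixel semantic logits that deviate from their region's mean; the function name and this exact formulation are illustrative, not the paper's definition:

```python
import numpy as np

def regional_smoothness_loss(sem, regions):
    """Hypothetical regional smoothness loss (not the exact SemGS loss).

    sem:     (H, W, C) float array of per-pixel semantic logits.
    regions: (H, W) int array of region labels (e.g. superpixels).
    Returns the mean squared deviation of each pixel's logits from
    its region's mean logit vector.
    """
    loss = 0.0
    for r in np.unique(regions):
        feats = sem[regions == r]          # (N_r, C) pixels in region r
        mean = feats.mean(axis=0)          # region-mean logit vector
        loss += ((feats - mean) ** 2).sum()
    return loss / (sem.shape[0] * sem.shape[1])

# Logits constant within each region -> the loss is exactly zero.
sem = np.zeros((2, 4, 3))
sem[:, 2:] = 1.0
regions = np.array([[0, 0, 1, 1],
                    [0, 0, 1, 1]])
print(regional_smoothness_loss(sem, regions))  # 0.0
```

A loss of this shape only pulls predictions together inside a region, so semantic boundaries between regions remain free to stay sharp.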

Top-level tags: computer vision · systems · model training
Detailed tags: 3d reconstruction · semantic scene understanding · novel view synthesis · gaussian splatting · sparse view

SemGS: Feed-Forward Semantic 3D Gaussian Splatting from Sparse Views for Generalizable Scene Understanding


1️⃣ One-Sentence Summary

This work proposes SemGS, a method that can rapidly reconstruct a 3D scene model carrying object-category information from only a handful of photos and render clean semantic segmentation maps from arbitrary novel viewpoints, substantially improving the efficiency and practicality of scene understanding for robots operating in complex environments.

Source: arXiv 2603.02548