arXiv submission date: 2026-03-04
📄 Abstract - Glass Segmentation with Fusion of Learned and General Visual Features

Glass surface segmentation from RGB images is a challenging task, since glass as a transparent material distinctly lacks visual characteristics. However, glass segmentation is critical for scene understanding and robotics, as transparent glass surfaces must be identified as solid material. This paper presents a novel architecture for glass segmentation, deploying a dual-backbone producing general visual features as well as task-specific learned visual features. General visual features are produced by a frozen DINOv3 vision foundation model, and the task-specific features are generated with a Swin model trained in a supervised manner. Resulting multi-scale feature representations are downsampled with residual Squeeze-and-Excitation Channel Reduction, and fed into a Mask2Former Decoder, producing the final segmentation masks. The architecture was evaluated on four commonly used glass segmentation datasets, achieving state-of-the-art results on several accuracy metrics. The model also has a competitive inference speed compared to the previous state-of-the-art method, and surpasses it when using a lighter DINOv3 backbone variant. The implementation source code and model weights are available at: this https URL
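The fusion step described in the abstract can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration, since the abstract gives no layer-level details: the two backbones' feature maps are taken to share a spatial size at a given scale and are fused by channel concatenation, the squeeze-and-excitation gate is applied with a residual connection, and a 1x1 projection reduces the channel count. The function names `se_channel_reduction` and `fuse_scale`, the weight shapes, and the random stand-in features are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def se_channel_reduction(feat, w_down, w_up, w_proj):
    """Hypothetical residual SE block followed by a 1x1 channel projection.

    feat:   (C_in, H, W) fused feature map
    w_down: (C_mid, C_in) squeeze weights
    w_up:   (C_in, C_mid) excitation weights
    w_proj: (C_out, C_in) 1x1 convolution weights
    """
    # Squeeze: global average pool over the spatial dimensions.
    squeezed = feat.mean(axis=(1, 2))                 # (C_in,)
    # Excitation: bottleneck MLP with ReLU, then a sigmoid gate per channel.
    hidden = np.maximum(w_down @ squeezed, 0.0)       # (C_mid,)
    gate = sigmoid(w_up @ hidden)                     # (C_in,)
    # Residual connection around the channel re-weighting (assumed form).
    recalibrated = feat + feat * gate[:, None, None]  # (C_in, H, W)
    # A 1x1 convolution is a per-pixel linear map over channels.
    return np.einsum("oc,chw->ohw", w_proj, recalibrated)


def fuse_scale(general_feat, learned_feat, params):
    """Concatenate both backbones' features at one scale, then reduce channels."""
    fused = np.concatenate([general_feat, learned_feat], axis=0)
    return se_channel_reduction(fused, *params)


rng = np.random.default_rng(0)
general = rng.standard_normal((8, 4, 4))  # stand-in for frozen foundation-model features
learned = rng.standard_normal((6, 4, 4))  # stand-in for supervised backbone features
c_in, c_mid, c_out = 14, 4, 5
params = (
    rng.standard_normal((c_mid, c_in)) * 0.1,
    rng.standard_normal((c_in, c_mid)) * 0.1,
    rng.standard_normal((c_out, c_in)) * 0.1,
)
out = fuse_scale(general, learned, params)
print(out.shape)  # (5, 4, 4)
```

In a full model this reduction would run once per scale, and the reduced multi-scale maps would then be passed to the decoder. The sketch covers only the fusion and channel-reduction idea, not the backbones or the Mask2Former decoder.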

Top-level tags: computer vision model training systems
Detailed tags: glass segmentation transparent objects multi-scale features mask2former dino foundation model

Glass Segmentation with Fusion of Learned and General Visual Features


1️⃣ One-sentence summary

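This paper proposes a new glass segmentation method that combines features from a general-purpose vision foundation model with features from a model trained specifically for the task, so that transparent glass surfaces in images are identified more accurately. It achieves state-of-the-art results on several accuracy metrics across four commonly used datasets, with inference speed competitive with the previous state-of-the-art method.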

Source: arXiv: 2603.03718