Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training
1️⃣ One-Sentence Summary
This paper proposes an unsupervised learning framework called AutoQ-VIS. Through a closed-loop system that automatically assesses pseudo-label quality and uses it to guide the model's self-training, it successfully narrows the gap between synthetic data and real videos, significantly improving the accuracy of object recognition and segmentation in video without any human annotations.
Video Instance Segmentation (VIS) faces significant annotation challenges due to its dual requirements of pixel-level masks and temporal consistency labels. While recent unsupervised methods like VideoCutLER eliminate optical flow dependencies through synthetic data, they remain constrained by the synthetic-to-real domain gap. We present AutoQ-VIS, a novel unsupervised framework that bridges this gap through quality-guided self-training. Our approach establishes a closed-loop system between pseudo-label generation and automatic quality assessment, enabling progressive adaptation from synthetic to real videos. Experiments demonstrate state-of-the-art performance with 52.6 $\text{AP}_{50}$ on YouTubeVIS-2019 $\texttt{val}$ set, surpassing the previous state-of-the-art VideoCutLER by 4.4%, while requiring no human annotations. This demonstrates the viability of quality-aware self-training for unsupervised VIS. We will release the code at this https URL.
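The abstract describes the method at a high level: a closed loop in which the model generates pseudo-labels on unlabeled real videos, an automatic quality-assessment module scores them, and only high-scoring labels feed the next training round, progressively adapting the model from synthetic to real data. The Python sketch below illustrates that loop under stated assumptions; the interfaces (`VISModel`, `QualityHead`), the threshold `tau`, and the number of rounds are hypothetical placeholders for illustration, not the paper's actual implementation.

```python
from typing import List, Protocol, Sequence, Tuple


class VISModel(Protocol):
    """Hypothetical interface for a video instance segmentation model."""
    def fit(self, data: Sequence) -> None: ...
    def predict(self, video: object) -> object: ...


class QualityHead(Protocol):
    """Hypothetical interface for automatic pseudo-label quality scoring."""
    def score(self, video: object, pseudo_masks: object) -> float: ...


def quality_guided_self_training(
    model: VISModel,
    quality_head: QualityHead,
    synthetic_data: Sequence,
    real_videos: Sequence,
    rounds: int = 3,   # number of self-training rounds (assumed)
    tau: float = 0.7,  # quality-acceptance threshold (assumed)
) -> VISModel:
    """Closed loop: pseudo-label generation -> automatic quality
    assessment -> retraining on the accepted pseudo-labels."""
    # Bootstrap on synthetic videos (as in VideoCutLER-style setups).
    model.fit(synthetic_data)

    for _ in range(rounds):
        accepted: List[Tuple[object, object]] = []
        for video in real_videos:
            # 1. Generate pseudo instance masks on an unlabeled real video.
            pseudo_masks = model.predict(video)
            # 2. Automatically score pseudo-label quality (no human labels).
            if quality_head.score(video, pseudo_masks) >= tau:
                # 3. Keep only high-quality pseudo-labels for retraining.
                accepted.append((video, pseudo_masks))
        # 4. Retrain on accepted pseudo-labels, progressively adapting
        #    from the synthetic to the real video distribution.
        model.fit(accepted)
    return model
```

The quality gate is what closes the loop: without it, self-training would amplify the model's own errors on real videos, whereas filtering by an automatic quality score lets only reliable pseudo-labels drive adaptation.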
Source: arXiv:2512.06864