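OmniShotCut: Holistic Relational Shot Boundary Detection with Shot-Query Transformer
1️⃣ One-Sentence Summary
This paper proposes OmniShotCut, a new method that uses a shot-query Transformer to recast video shot segmentation as holistic prediction of intra-shot and inter-shot relations. It identifies diverse shot transitions more accurately and addresses the shortcomings of existing methods: ambiguous boundaries, subtle errors, and insufficient training data.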
Shot Boundary Detection (SBD) aims to automatically identify shot changes and divide a video into coherent shots. Although SBD has been widely studied, existing state-of-the-art methods often produce non-interpretable boundaries on transitions, miss subtle yet harmful discontinuities, and rely on noisy, low-diversity annotations and outdated benchmarks. To address these limitations, we propose OmniShotCut, which formulates SBD as structured relational prediction: a shot-query-based dense video Transformer jointly estimates shot ranges together with intra-shot and inter-shot relations. To avoid imprecise manual labeling, we adopt a fully synthetic transition pipeline that automatically reproduces major transition families with precise boundaries and parameterized variants. We also introduce OmniShotCutBench, a modern wide-domain benchmark enabling holistic and diagnostic evaluation.
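To make the idea of synthetic transitions with precise boundaries concrete, here is a minimal sketch, not the paper's pipeline. It assumes a linear cross-dissolve as one example transition family, with the dissolve length as its only parameter. The function name `synthesize_dissolve`, the `Frame` type, and the label strings are hypothetical and chosen for illustration only. Because the transition is generated rather than annotated, the frame ranges of both shots and of the transition are known exactly.

```python
from typing import List, Tuple

# Assumption: a frame is flattened to a list of pixel intensities in [0, 1].
Frame = List[float]
Range = Tuple[int, int, str]  # (start, end_exclusive, label)


def synthesize_dissolve(
    clip_a: List[Frame], clip_b: List[Frame], length: int
) -> Tuple[List[Frame], List[Range]]:
    """Join two clips with a linear cross-dissolve of `length` frames.

    Hypothetical helper: overlaps the last `length` frames of clip_a with
    the first `length` frames of clip_b and returns the exact label ranges.
    """
    if length < 1 or length >= min(len(clip_a), len(clip_b)):
        raise ValueError("length must be in [1, min(len(clip_a), len(clip_b)))")

    head = clip_a[:-length]   # untouched part of the first shot
    tail = clip_b[length:]    # untouched part of the second shot

    blended: List[Frame] = []
    for i in range(length):
        # Blend weight rises strictly between 0 and 1 across the transition.
        alpha = (i + 1) / (length + 1)
        frame_a = clip_a[len(clip_a) - length + i]
        frame_b = clip_b[i]
        blended.append(
            [(1 - alpha) * pa + alpha * pb for pa, pb in zip(frame_a, frame_b)]
        )

    frames = head + blended + tail
    start = len(head)
    ranges: List[Range] = [
        (0, start, "shot"),
        (start, start + length, "dissolve"),
        (start + length, len(frames), "shot"),
    ]
    return frames, ranges


# Usage: a black clip dissolving into a white clip over 3 frames.
clip_a = [[0.0] * 4 for _ in range(5)]
clip_b = [[1.0] * 4 for _ in range(5)]
frames, ranges = synthesize_dissolve(clip_a, clip_b, length=3)
```

Varying `length`, or swapping the blend for a wipe or fade, would give the kind of parameterized variants the abstract mentions, with labels that stay exact by construction.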
Source: arXiv: 2604.24762