arXiv submission date: 2025-12-31
📄 Abstract - FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation

In this work, we show that the impact of model capacity varies across timesteps: it is crucial for the early and late stages but largely negligible during the intermediate stage. Accordingly, we propose FlowBlending, a stage-aware multi-model sampling strategy that employs a large model and a small model at capacity-sensitive stages and intermediate stages, respectively. We further introduce simple criteria to choose stage boundaries and provide a velocity-divergence analysis as an effective proxy for identifying capacity-sensitive regions. Across LTX-Video (2B/13B) and WAN 2.1 (1.3B/14B), FlowBlending achieves up to 1.65x faster inference with 57.35% fewer FLOPs, while maintaining the visual fidelity, temporal coherence, and semantic alignment of the large models. FlowBlending is also compatible with existing sampling-acceleration techniques, enabling up to 2x additional speedup. Project page is available at: this https URL.

Top tags: video generation, model training, AIGC
Detailed tags: diffusion models, sampling strategy, computational efficiency, model capacity, temporal coherence

FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation


1️⃣ One-sentence summary

This paper proposes FlowBlending, a stage-aware sampling method built on the observation that different stages of the video-generation process demand different amounts of model capacity: it routes the capacity-sensitive early and late stages to a large model and the intermediate stage to a small model, preserving generation quality while substantially accelerating inference and cutting compute.
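The stage-aware idea in the summary above can be sketched as a flow-matching Euler sampler that switches velocity models by timestep. This is a minimal illustration, not the paper's implementation: the function name, the stage boundaries `t_lo`/`t_hi`, and the toy velocity fields are all assumptions.

```python
import numpy as np

def flowblending_sample(x, v_large, v_small, num_steps=10, t_lo=0.25, t_hi=0.75):
    """Sketch of stage-aware two-model sampling (hypothetical boundaries).

    Uses the large model's velocity field at the capacity-sensitive early
    and late stages, and the small model during the intermediate stage.
    """
    ts = np.linspace(0.0, 1.0, num_steps + 1)
    calls = []  # record which model handled each step, for inspection
    for t0, t1 in zip(ts[:-1], ts[1:]):
        # Intermediate stage -> small model; early/late stages -> large model.
        model = v_small if t_lo <= t0 < t_hi else v_large
        calls.append("small" if model is v_small else "large")
        x = x + (t1 - t0) * model(x, t0)  # one Euler step along the flow
    return x, calls

# Toy velocity fields standing in for the real networks (assumptions).
v_large = lambda x, t: -x
v_small = lambda x, t: -x

x_out, calls = flowblending_sample(np.ones(3), v_large, v_small)
```

With 10 steps and boundaries at 0.25/0.75, the first and last steps go to the large model and the middle five to the small one; in a real pipeline the boundaries would be chosen by the paper's velocity-divergence criterion rather than fixed by hand.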

Source: arXiv 2512.24724