Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling
1️⃣ One-Sentence Summary
This paper proposes a new method called LOOM-CFM, which optimizes the pairing between data and noise across minibatches. It substantially accelerates inference of flow-based generative models while preserving generation quality, making them more practical for tasks such as image and video generation.
Conditional Flow Matching (CFM), a simulation-free method for training continuous normalizing flows, provides an efficient alternative to diffusion models for key tasks like image and video generation. The performance of CFM in solving these tasks depends on the way data is coupled with noise. A recent approach uses minibatch optimal transport (OT) to reassign noise-data pairs in each training step to streamline sampling trajectories and thus accelerate inference. However, its optimization is restricted to individual minibatches, limiting its effectiveness on large datasets. To address this shortcoming, we introduce LOOM-CFM (Looking Out Of Minibatch-CFM), a novel method to extend the scope of minibatch OT by preserving and optimizing these assignments across minibatches over training time. Our approach demonstrates consistent improvements in the sampling speed-quality trade-off across multiple datasets. LOOM-CFM also enhances distillation initialization and supports high-resolution synthesis in latent space training.
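The minibatch OT coupling step described in the abstract can be sketched in a few lines. The snippet below is an illustrative assumption, not the paper's code: it uses SciPy's linear assignment solver as an exact OT solver for equal-size batches of data and noise, reordering the noise so each data sample is paired with its transport-optimal noise sample. LOOM-CFM's contribution, persisting and refining these assignments across minibatches over training, is only indicated in a comment.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def ot_couple(data, noise):
    """Pair data with noise via a minibatch optimal-transport
    assignment under squared Euclidean cost (illustrative sketch).

    Returns the noise batch reordered so that noise[i] is the
    transport-optimal partner of data[i]."""
    # cost[i, j] = ||data_i - noise_j||^2
    cost = ((data[:, None, :] - noise[None, :, :]) ** 2).sum(axis=-1)
    row_ind, col_ind = linear_sum_assignment(cost)
    return noise[col_ind]


# Small demo: OT coupling never increases the total pairing cost
# relative to the original (random) pairing.
rng = np.random.default_rng(0)
data = rng.normal(size=(16, 2))
noise = rng.normal(size=(16, 2))

paired = ot_couple(data, noise)
cost_random = ((data - noise) ** 2).sum()
cost_ot = ((data - paired) ** 2).sum()

# In LOOM-CFM (per the abstract), such assignments would be preserved
# and re-optimized across minibatches over training, rather than
# recomputed independently for each minibatch as in plain OT-CFM.
```

Pairing by OT straightens the interpolation paths the flow model is trained on, which is what enables faster sampling; the per-minibatch version above is the baseline whose limited scope LOOM-CFM addresses.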
Source: arXiv: 2603.15279