arXiv submission date: 2026-02-05
📄 Abstract - Better Source, Better Flow: Learning Condition-Dependent Source Distribution for Flow Matching

Flow matching has recently emerged as a promising alternative to diffusion-based generative models, particularly for text-to-image generation. Despite its flexibility in allowing arbitrary source distributions, most existing approaches rely on a standard Gaussian distribution, a choice inherited from diffusion models, and rarely consider the source distribution itself as an optimization target in such settings. In this work, we show that principled design of the source distribution is not only feasible but also beneficial at the scale of modern text-to-image systems. Specifically, we propose learning a condition-dependent source distribution under the flow matching objective that better exploits rich conditioning signals. We identify key failure modes that arise when directly incorporating conditioning into the source, including distributional collapse and instability, and show that appropriate variance regularization and directional alignment between source and target are critical for stable and effective learning. We further analyze how the choice of target representation space impacts flow matching with structured sources, revealing regimes in which such designs are most effective. Extensive experiments across multiple text-to-image benchmarks demonstrate consistent and robust improvements, including up to 3x faster convergence in FID, highlighting the practical benefits of principled source distribution design for conditional flow matching.

Top-level tags: model training, AIGC, computer vision
Detailed tags: flow matching, source distribution, text-to-image, generative models, conditional generation

Better Source, Better Flow: Learning Condition-Dependent Source Distribution for Flow Matching


1️⃣ One-sentence summary

This paper proposes that, in text-to-image flow matching models, learning a source distribution that varies with the text condition, rather than using a fixed Gaussian distribution, significantly improves model performance, yielding faster convergence and better generation quality.
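The core idea can be sketched in a few lines: a small network maps the condition embedding to the mean and log-variance of a Gaussian source, and the flow matching loss is augmented with a variance regularizer to prevent the distributional collapse the abstract warns about. This is a minimal illustrative sketch, not the paper's actual implementation; all dimensions, weights, and the regularizer weight `lam` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: text-condition embedding and data (latent) space.
COND_DIM, DATA_DIM = 8, 4

# Toy "source network": linear maps from a condition embedding c to the mean
# and log-variance of a condition-dependent Gaussian source p(x0 | c).
W_mu = rng.normal(0, 0.1, (COND_DIM, DATA_DIM))
W_logvar = rng.normal(0, 0.1, (COND_DIM, DATA_DIM))

def sample_source(c):
    """Reparameterized sample x0 ~ N(mu(c), diag(exp(logvar(c))))."""
    mu = c @ W_mu
    logvar = c @ W_logvar
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps, logvar

def flow_matching_loss(c, x1, v_pred, lam=0.1):
    """Conditional flow matching loss with a variance regularizer.

    The straight-line target velocity is x1 - x0; the regularizer keeps the
    source variance away from zero (logvar near 0), discouraging the
    collapse failure mode. `lam` is an assumed hyperparameter.
    """
    x0, logvar = sample_source(c)
    v_target = x1 - x0                      # target velocity field
    fm = np.mean((v_pred - v_target) ** 2)  # flow matching MSE term
    var_reg = np.mean(logvar ** 2)          # penalize variance drift/collapse
    return fm + lam * var_reg

c = rng.standard_normal((2, COND_DIM))   # batch of condition embeddings
x1 = rng.standard_normal((2, DATA_DIM))  # batch of target samples
v_pred = np.zeros((2, DATA_DIM))         # stand-in for the model's prediction
loss = flow_matching_loss(c, x1, v_pred)
```

In a real system the two linear maps would be a learned network conditioned on text embeddings, and `v_pred` would come from the flow model; the directional-alignment term mentioned in the abstract is omitted here for brevity.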

From arXiv: 2602.05951