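CFG-Ctrl: Control-Based Classifier-Free Diffusion Guidance
1️⃣ One-Sentence Summary
This paper proposes CFG-Ctrl, a framework that reinterprets classifier-free guidance in diffusion models as a control problem, and introduces a more stable, more precise nonlinear control method that significantly improves how well AI-generated images match their text descriptions.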
Classifier-Free Guidance (CFG) has emerged as a central approach for enhancing semantic alignment in flow-based diffusion models. In this paper, we explore a unified framework called CFG-Ctrl, which reinterprets CFG as a control applied to the first-order continuous-time generative flow, using the conditional-unconditional discrepancy as an error signal to adjust the velocity field. From this perspective, we summarize vanilla CFG as a proportional controller (P-control) with fixed gain, and typical follow-up variants as extended control-law designs derived from it. However, existing methods mainly rely on linear control, which inherently leads to instability, overshooting, and degraded semantic fidelity, especially at large guidance scales. To address this, we introduce Sliding Mode Control CFG (SMC-CFG), which drives the generative flow toward a rapidly convergent sliding manifold. Specifically, we define an exponential sliding mode surface over the semantic prediction error and introduce a switching control term to establish nonlinear feedback-guided correction. Moreover, we provide a Lyapunov stability analysis to theoretically support finite-time convergence. Experiments across text-to-image generation models including Stable Diffusion 3.5, Flux, and Qwen-Image demonstrate that SMC-CFG outperforms standard CFG in semantic alignment and enhances robustness across a wide range of guidance scales.
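The control view described in the abstract can be made concrete with a small sketch. The first function below is the standard CFG update written as a proportional controller: the error signal is the conditional-unconditional velocity difference, and the guidance scale is the fixed gain. The second function is only an illustration of the general sliding-mode idea (a surface over the error plus a sign-based switching term). It is not the paper's SMC-CFG formulation: the linear surface, the parameters `lam` and `k`, and all function names are assumptions made for this example, and the paper's exponential surface is not reproduced here.

```python
from typing import List, Optional, Sequence, Tuple

Vector = List[float]


def guidance_error(v_cond: Sequence[float], v_uncond: Sequence[float]) -> Vector:
    """Error signal: conditional minus unconditional velocity prediction."""
    return [c - u for c, u in zip(v_cond, v_uncond)]


def cfg_p_control(v_cond: Sequence[float], v_uncond: Sequence[float],
                  scale: float) -> Vector:
    """Vanilla CFG as P-control: v = v_uncond + scale * (v_cond - v_uncond)."""
    e = guidance_error(v_cond, v_uncond)
    return [u + scale * ei for u, ei in zip(v_uncond, e)]


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def smc_style_guidance(v_cond: Sequence[float], v_uncond: Sequence[float],
                       scale: float, prev_error: Optional[Sequence[float]] = None,
                       lam: float = 1.0, k: float = 0.0) -> Tuple[Vector, Vector]:
    """Hypothetical sliding-mode-style correction (NOT the paper's equations).

    Assumed surface:   s = (e - prev_e) + lam * e
    Assumed switching: e_corrected = e - k * sign(s)
    With k = 0 this reduces exactly to vanilla CFG.
    """
    e = guidance_error(v_cond, v_uncond)
    if prev_error is None:
        prev_error = e  # no history on the first step, so the difference term is zero
    s = [(ei - pi) + lam * ei for ei, pi in zip(e, prev_error)]
    corrected = [ei - k * sign(si) for ei, si in zip(e, s)]
    guided = [u + scale * ci for u, ci in zip(v_uncond, corrected)]
    return guided, e  # return e so the caller can pass it as prev_error next step
```

In a sampling loop, the guided velocity would replace the model's raw prediction at each integration step, with the returned error carried forward as `prev_error`. The switching gain `k` bounds how far the correction can move the error per step, which is the intuition behind the robustness at large guidance scales that the abstract reports; the sketch itself makes no claim about reproducing those results.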
Source: arXiv: 2603.03281