Continuous Control of Editing Models via Adaptive-Origin Guidance
1️⃣ One-Sentence Summary
This paper proposes a new method called AdaOr that lets users smoothly control the intensity of AI edits to images and videos, much like turning a volume knob, solving the "all-or-nothing" jump in existing editing models that either leave the input unchanged or apply the full edit.
Diffusion-based editing models have emerged as a powerful tool for semantic image and video manipulation. However, existing models lack a mechanism for smoothly controlling the intensity of text-guided edits. In standard text-conditioned generation, the Classifier-Free Guidance (CFG) scale modulates prompt adherence, suggesting it as a potential control for edit intensity in editing models. However, we show that scaling CFG in these models does not produce a smooth transition between the input and the edited result. We attribute this behavior to the unconditional prediction, which serves as the guidance origin and dominates the generation at low guidance scales, while representing an arbitrary manipulation of the input content. To enable continuous control, we introduce Adaptive-Origin Guidance (AdaOr), a method that blends the standard guidance origin with an identity-conditioned adaptive origin, obtained from an identity instruction corresponding to the identity (no-change) manipulation. By interpolating this identity prediction with the standard unconditional prediction according to the edit strength, we ensure a continuous transition from the input to the edited result. We evaluate our method on image and video editing tasks, demonstrating that it provides smoother and more consistent control than current slider-based editing approaches. Our method incorporates an identity instruction into the standard training framework, enabling fine-grained control at inference time without per-edit procedures or reliance on specialized datasets.
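To make the guidance mechanism concrete, below is a minimal sketch of the per-step computation, assuming a plausible formulation of the idea described in the abstract. The function name, tensor interface, and the linear interpolation schedule between the identity and unconditional predictions are illustrative assumptions, not the paper's exact equations.

```python
import torch

def adaor_guided_prediction(
    eps_uncond: torch.Tensor,    # unconditional prediction (standard CFG origin)
    eps_identity: torch.Tensor,  # prediction conditioned on the identity instruction
    eps_edit: torch.Tensor,      # prediction conditioned on the edit instruction
    edit_strength: float,        # slider value in [0, 1]; 0 = keep input, 1 = full edit
    cfg_scale: float,            # standard classifier-free guidance scale
) -> torch.Tensor:
    """Classifier-free guidance with an adaptive origin (AdaOr-style sketch).

    Instead of always guiding away from the unconditional prediction, the
    guidance origin is interpolated between the identity-conditioned
    prediction (which reproduces the input) and the unconditional one,
    according to the requested edit strength.
    """
    # Adaptive origin: at strength 0 the origin is the identity prediction,
    # so the guided output stays on the input; at strength 1 this reduces
    # to standard CFG with the usual unconditional origin.
    origin = (1.0 - edit_strength) * eps_identity + edit_strength * eps_uncond

    # Standard CFG update, measured from the adaptive origin instead of
    # the unconditional prediction alone.
    return origin + cfg_scale * (eps_edit - origin)
```

Note that this sketch costs three conditional forward passes per denoising step (unconditional, identity, edit) rather than the usual two for CFG; the identity instruction is part of the standard training setup, so no per-edit fine-tuning is needed at inference.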
Source: arXiv:2602.03826