DreamStyle: A Unified Framework for Video Stylization
1️⃣ One-sentence summary
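This paper proposes DreamStyle, a unified framework that supports text-guided, reference-image-guided, and first-frame-guided video stylization; with a new data construction pipeline and model training method, it addresses the style inconsistency and flicker seen in existing methods and improves video quality and style consistency.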
Video stylization, an important downstream task of video generation models, has not yet been thoroughly explored. Its style condition is typically given as text, a style image, or a stylized first frame. Each condition has a characteristic advantage: text is more flexible, a style image provides a more accurate visual anchor, and a stylized first frame makes long-video stylization feasible. However, existing methods are largely confined to a single type of style condition, which limits their scope of application. In addition, the lack of high-quality datasets leads to style inconsistency and temporal flicker. To address these limitations, we introduce DreamStyle, a unified framework for video stylization that supports (1) text-guided, (2) style-image-guided, and (3) first-frame-guided video stylization, accompanied by a well-designed data curation pipeline for acquiring high-quality paired video data. DreamStyle is built on a vanilla Image-to-Video (I2V) model and trained using Low-Rank Adaptation (LoRA) with token-specific up matrices, which reduces confusion among the different condition tokens. Both qualitative and quantitative evaluations demonstrate that DreamStyle handles all three video stylization tasks and outperforms competing methods in style consistency and video quality.
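The abstract does not spell out how the token-specific up matrices are wired, so the following is only a minimal sketch of one plausible reading: a frozen projection `W`, a single shared low-rank down matrix `A`, and a separate up matrix per token type, selected by each token's type id. The function name `token_specific_lora`, the type ids (0, 1, 2), and the toy values are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def matvec(m: Matrix, v: Sequence[float]) -> List[float]:
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def token_specific_lora(
    tokens: Sequence[Sequence[float]],
    token_types: Sequence[int],
    W: Matrix,
    A: Matrix,
    B_per_type: Dict[int, Matrix],
    scale: float = 1.0,
) -> List[List[float]]:
    """Apply y = W x + scale * B[type] (A x) to every token.

    W is the frozen pretrained projection, A is a down matrix shared by
    all tokens, and B_per_type holds one up matrix per token type
    (hypothetical layout, assumed for this sketch).
    """
    outputs = []
    for x, t in zip(tokens, token_types):
        base = matvec(W, x)                 # frozen projection
        low = matvec(A, x)                  # shared rank-r bottleneck
        delta = matvec(B_per_type[t], low)  # up matrix chosen by token type
        outputs.append([b + scale * d for b, d in zip(base, delta)])
    return outputs


# Toy setup: 2-d features, rank-1 adapter, three hypothetical token types.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]
B_per_type = {
    0: [[0.0], [0.0]],  # e.g. video tokens: adapter contributes nothing here
    1: [[1.0], [0.0]],  # e.g. style-image tokens
    2: [[0.0], [1.0]],  # e.g. first-frame tokens
}
x = [1.0, 2.0]
out = token_specific_lora([x, x, x], [0, 1, 2], W, A, B_per_type)
print(out)
```

In this toy setup the same input vector yields a different update for each token type, which is the property the abstract credits with reducing confusion among condition tokens. Whether the actual model shares the down matrix in exactly this way is not stated in the abstract.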
Source: arXiv: 2601.02785