
arXiv submission date: 2026-04-15
📄 Abstract - Seedance 2.0: Advancing Video Generation for World Complexity

Seedance 2.0 is a new native multi-modal audio-video generation model, officially released in China in early February 2026. Compared with its predecessors, Seedance 1.0 and 1.5 Pro, it adopts a unified, efficient, large-scale architecture for joint multi-modal audio-video generation. The model accepts four input modalities (text, image, audio, and video) and integrates one of the industry's most comprehensive suites of multi-modal content reference and editing capabilities to date. It delivers substantial, well-rounded improvements across all key sub-dimensions of video and audio generation, and in both expert evaluations and public user tests it has performed on par with the leading models in the field. Seedance 2.0 directly generates audio-video content with durations ranging from 4 to 15 seconds, at native output resolutions of 480p and 720p. For multi-modal reference inputs, the current open platform supports up to 3 video clips, 9 images, and 3 audio clips. In addition, we provide Seedance 2.0 Fast, an accelerated variant designed to increase generation speed in low-latency scenarios. With these improvements to its foundational generation capabilities and multi-modal performance, Seedance 2.0 brings an enhanced creative experience to end users.
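The abstract states concrete request limits: up to 3 video clips, 9 images, and 3 audio clips as references, durations of 4 to 15 seconds, and native output at 480p or 720p. The sketch below is purely illustrative: the `GenerationRequest` structure and all field names are invented (the paper does not describe an API); only the numeric limits come from the text.

```python
from dataclasses import dataclass, field

# Limits taken from the abstract; everything else here is hypothetical.
MAX_VIDEO_REFS = 3
MAX_IMAGE_REFS = 9
MAX_AUDIO_REFS = 3
MIN_DURATION_S, MAX_DURATION_S = 4, 15
SUPPORTED_RESOLUTIONS = {"480p", "720p"}


@dataclass
class GenerationRequest:
    """Hypothetical request shape for a multi-modal generation call."""
    prompt: str
    duration_s: int = 5
    resolution: str = "720p"
    video_refs: list = field(default_factory=list)
    image_refs: list = field(default_factory=list)
    audio_refs: list = field(default_factory=list)


def validate(req: GenerationRequest) -> list:
    """Return a list of constraint violations (empty if the request is valid)."""
    errors = []
    if not MIN_DURATION_S <= req.duration_s <= MAX_DURATION_S:
        errors.append(f"duration must be {MIN_DURATION_S}-{MAX_DURATION_S} s")
    if req.resolution not in SUPPORTED_RESOLUTIONS:
        errors.append(f"resolution must be one of {sorted(SUPPORTED_RESOLUTIONS)}")
    if len(req.video_refs) > MAX_VIDEO_REFS:
        errors.append(f"at most {MAX_VIDEO_REFS} reference video clips")
    if len(req.image_refs) > MAX_IMAGE_REFS:
        errors.append(f"at most {MAX_IMAGE_REFS} reference images")
    if len(req.audio_refs) > MAX_AUDIO_REFS:
        errors.append(f"at most {MAX_AUDIO_REFS} reference audio clips")
    return errors


ok = GenerationRequest(prompt="a cat surfing at sunset", duration_s=10)
bad = GenerationRequest(prompt="x", duration_s=30, image_refs=["img"] * 12)
print(validate(ok))   # []
print(validate(bad))  # duration and image-count violations
```

A real client would presumably enforce these limits server-side as well; the point here is only to make the stated input budget concrete.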

Top tags: video generation · multi-modal · aigc
Detailed tags: audio-video generation · multimodal input · content editing · generative model · creative tools

Seedance 2.0: Advancing Video Generation for World Complexity


1️⃣ One-sentence summary

Seedance 2.0 is a brand-new multi-modal audio-video generation model. Through a unified, advanced architecture, it can directly generate high-quality short videos from text, image, audio, and video inputs, achieving comprehensive improvements over its predecessors in both generation quality and multi-modal reference capability.

Source: arXiv:2604.14148