CoShadow: Multi-Object Shadow Generation for Image Compositing via Diffusion Model
1️⃣ One-Sentence Summary
This paper proposes CoShadow, a new method that leverages a pre-trained diffusion model to generate physically plausible and mutually consistent shadows for multiple foreground objects inserted simultaneously during image compositing, addressing the limitation that existing methods mainly target single objects and generalize poorly to multi-object scenes.
Realistic shadow generation is crucial for achieving seamless image compositing, yet existing methods primarily focus on single-object insertion and often fail to generalize when multiple foreground objects are composited into a background scene. In practice, however, modern compositing pipelines and real-world applications often insert multiple objects simultaneously, necessitating shadows that are jointly consistent in terms of geometry, attachment, and location. In this paper, we address the under-explored problem of multi-object shadow generation, aiming to synthesize physically plausible shadows for multiple inserted objects. Our approach exploits the multimodal capabilities of a pre-trained text-to-image diffusion model. An image pathway injects dense, multi-scale features to provide fine-grained spatial guidance, while a text-based pathway encodes per-object shadow bounding boxes as learned positional tokens and fuses them via cross-attention. An attention-alignment loss further grounds these tokens to their corresponding shadow regions. To support this task, we augment the DESOBAv2 dataset by constructing composite scenes with multiple inserted objects and automatically derive prompts combining object category and shadow positioning information. Experimental results demonstrate that our method achieves state-of-the-art performance in both single and multi-object shadow generation settings.
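The attention-alignment loss described above can be illustrated with a minimal sketch. The paper does not give the exact formulation, so the code below is a plausible hypothetical variant: for each object's shadow token, it penalizes the fraction of cross-attention mass that falls outside that object's shadow bounding box. The function name, tensor shapes, and box format are all assumptions for illustration.

```python
import numpy as np

def attention_alignment_loss(attn_maps, boxes):
    """Hypothetical attention-alignment loss (sketch, not the paper's exact form).

    attn_maps: (N, H, W) array; one spatial cross-attention map per shadow
               token, each normalized to sum to ~1 over the spatial dims.
    boxes:     list of N shadow bounding boxes as (x0, y0, x1, y1) indices.

    Returns the mean, over objects, of the attention mass falling
    OUTSIDE each token's shadow bounding box (0 = perfectly grounded).
    """
    n, _, _ = attn_maps.shape
    loss = 0.0
    for i in range(n):
        x0, y0, x1, y1 = boxes[i]
        inside = attn_maps[i, y0:y1, x0:x1].sum()   # mass inside the box
        total = attn_maps[i].sum() + 1e-8           # avoid divide-by-zero
        loss += 1.0 - inside / total                # mass outside the box
    return loss / n
```

In training, a term like this would be added to the diffusion denoising objective so that each learned positional token attends to its own shadow region rather than drifting toward other objects' shadows.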
Source: arXiv:2603.02743