arXiv submission date: 2026-03-02
📄 Abstract - SkeleGuide: Explicit Skeleton Reasoning for Context-Aware Human-in-Place Image Synthesis

Synthesizing realistic and structurally plausible humans into existing scenes remains a significant challenge for current generative models, which often produce artifacts such as distorted limbs and unnatural poses. We attribute this systemic failure to an inability to perform explicit reasoning over human skeletal structure. To address this, we introduce SkeleGuide, a novel framework built upon explicit skeletal reasoning. Through joint training of its reasoning and rendering stages, SkeleGuide learns to produce an internal pose that acts as a strong structural prior, guiding synthesis toward high structural integrity. For fine-grained user control, we introduce PoseInverter, a module that decodes this internal latent pose into an explicit, editable format. Extensive experiments demonstrate that SkeleGuide significantly outperforms both specialized and general-purpose models in generating high-fidelity, contextually aware human images. Our work provides compelling evidence that explicitly modeling skeletal structure is a fundamental step toward robust and plausible human image synthesis.
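The abstract describes a two-stage pipeline: a reasoning stage produces an internal latent pose, a PoseInverter module decodes it into explicit, editable keypoints, and a rendering stage synthesizes the image conditioned on that pose. A minimal sketch of this data flow, with all class and function names (and the 17-joint keypoint layout) being illustrative assumptions rather than the paper's actual API:

```python
# Hypothetical sketch of the SkeleGuide pipeline described in the abstract.
# All names here are assumptions for illustration, not the paper's code.
from dataclasses import dataclass
from typing import List, Tuple

NUM_JOINTS = 17  # a common COCO-style keypoint count (assumption)


@dataclass
class LatentPose:
    """Internal pose from the reasoning stage; acts as a structural prior."""
    features: List[float]  # 2 values per joint in this toy layout


def reasoning_stage(scene_context: List[float]) -> LatentPose:
    """Stage 1: reason over scene context to propose a plausible latent pose.
    Toy stand-in: derive a fixed-length feature vector from the context."""
    feats = [(sum(scene_context) + i) % 1.0 for i in range(2 * NUM_JOINTS)]
    return LatentPose(features=feats)


def pose_inverter(pose: LatentPose) -> List[Tuple[float, float]]:
    """PoseInverter: decode the latent pose into explicit, editable 2D keypoints."""
    f = pose.features
    return [(f[2 * j], f[2 * j + 1]) for j in range(NUM_JOINTS)]


def rendering_stage(scene_context: List[float], pose: LatentPose) -> dict:
    """Stage 2: synthesize the human image conditioned on scene + latent pose.
    A real renderer would be a generative model; here we return a summary."""
    return {"scene_dim": len(scene_context), "pose_dim": len(pose.features)}


# Usage: reason -> (optionally edit the explicit skeleton) -> render
scene = [0.2, 0.5, 0.1]
latent = reasoning_stage(scene)
keypoints = pose_inverter(latent)      # explicit skeleton, user-editable
image = rendering_stage(scene, latent)
```

The point of the sketch is the ordering: the skeleton is reasoned about first and exposed for editing, and rendering is conditioned on it, rather than pose plausibility being left implicit in a single generation pass.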

Top-level tags: computer vision, aigc, model training
Detailed tags: human image synthesis, skeletal reasoning, pose guidance, context-aware generation, structural prior

SkeleGuide: Explicit Skeleton Reasoning for Context-Aware Human-in-Place Image Synthesis


1️⃣ One-Sentence Summary

This paper proposes a new framework called SkeleGuide, which explicitly reasons over human skeletal structure and employs an editable pose module, effectively addressing the distorted limbs and unnatural poses that existing models often produce when compositing a person into an existing scene, and thereby generating more structurally plausible and realistic human images.

Source: arXiv 2603.01579