arXiv submission date: 2026-03-16
📄 Abstract - Real-Time Human Frontal View Synthesis from a Single Image

Photorealistic human novel view synthesis from a single image is crucial for democratizing immersive 3D telepresence, eliminating the need for complex multi-camera setups. However, current rendering-centric methods prioritize visual fidelity over explicit geometric understanding and struggle with intricate regions like faces and hands, leading to temporal instability. Meanwhile, human-centric frameworks suffer from memory bottlenecks since they typically rely on an auxiliary model to provide informative structural priors for geometric modeling, which limits real-time performance. To address these challenges, we propose PrismMirror, a geometry-guided framework for instant frontal view synthesis from a single image. By avoiding external geometric modeling and focusing on frontal view synthesis, our model optimizes visual integrity for telepresence. Specifically, PrismMirror introduces a novel cascade learning strategy that enables coarse-to-fine geometric feature learning. It first directly learns coarse geometric features, such as SMPL-X meshes and point clouds, and then refines textures through rendering supervision. To achieve real-time efficiency, we distill this unified framework into a lightweight linear attention model. Notably, PrismMirror is the first monocular human frontal view synthesis model that achieves real-time inference at 24 FPS, significantly outperforming previous methods in both visual authenticity and structural accuracy.
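The abstract mentions distilling the framework into a "lightweight linear attention model" for real-time inference. The paper's actual architecture is not specified here, but the general idea of kernelized linear attention can be sketched as follows: replacing softmax attention with a positive feature map lets keys and values be aggregated once, reducing the cost from quadratic to linear in sequence length. This is a minimal illustrative sketch (using an elu-based feature map, a common choice), not PrismMirror's actual implementation; all function and variable names are hypothetical.

```python
import numpy as np

def linear_attention(q, k, v, eps=1e-6):
    """Kernelized linear attention sketch (illustrative, not the paper's model).

    The softmax kernel is approximated by a positive feature map
    phi(x) = elu(x) + 1, so attention factorizes as
    (phi(q) @ (phi(k)^T v)) / (phi(q) @ sum(phi(k))),
    costing O(n * d * d_v) instead of O(n^2 * d).
    Shapes: q, k: (n, d); v: (n, d_v).
    """
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1 > 0
    q, k = phi(q), phi(k)
    kv = k.T @ v                   # (d, d_v): aggregate key/value once
    z = q @ k.sum(axis=0)          # (n,): per-query normalizer
    return (q @ kv) / (z[:, None] + eps)

# Tiny demo: 4 tokens with 8-dimensional features
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((4, 8)) for _ in range(3))
out = linear_attention(q, k, v)
print(out.shape)  # (4, 8)
```

Because the `(d, d_v)` key-value summary is computed once and reused for every query, throughput no longer degrades quadratically with token count, which is the property that makes real-time distillation targets like this attractive.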

Top-level tags: computer vision, systems, multi-modal
Detailed tags: novel view synthesis, human frontal view, real-time rendering, geometry-guided, single image

Real-Time Human Frontal View Synthesis from a Single Image


1️⃣ One-Sentence Summary

This paper proposes a new framework called PrismMirror that generates a high-quality frontal view of a person in real time from a single photo, resolving the trade-off earlier methods faced between realism, geometric accuracy, and speed.

Source: arXiv:2603.15433