📄 Abstract - LATTICE: Democratize High-Fidelity 3D Generation at Scale

We present LATTICE, a new framework for high-fidelity 3D asset generation that bridges the quality and scalability gap between 3D and 2D generative models. While 2D image synthesis benefits from fixed spatial grids and well-established transformer architectures, 3D generation remains fundamentally more challenging due to the need to predict both spatial structure and detailed geometric surfaces from scratch. These challenges are exacerbated by the computational complexity of existing 3D representations and the lack of structured and scalable 3D asset encoding schemes. To address this, we propose VoxSet, a semi-structured representation that compresses 3D assets into a compact set of latent vectors anchored to a coarse voxel grid, enabling efficient and position-aware generation. VoxSet retains the simplicity and compression advantages of prior VecSet methods while introducing explicit structure into the latent space, allowing positional embeddings to guide generation and enabling strong token-level test-time scaling. Built upon this representation, LATTICE adopts a two-stage pipeline: first generating a sparse voxelized geometry anchor, then producing detailed geometry using a rectified flow transformer. Our method is simple at its core, but supports arbitrary resolution decoding, low-cost training, and flexible inference schemes, achieving state-of-the-art performance on various aspects, and offering a significant step toward scalable, high-quality 3D asset creation.
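To make the two-stage idea in the abstract concrete, here is a minimal toy sketch in pure Python. It is not the authors' implementation. Every name in it (`voxel_anchors`, `positional_embedding`, `toy_velocity`, `euler_rectified_flow`) is hypothetical. The sinusoidal embedding, the convention that t=0 is noise and t=1 is data, and the closed-form velocity that stands in for a trained transformer are all assumptions made for illustration.

```python
import math


def voxel_anchors(points, resolution):
    """Stage 1 stand-in: quantize points in [0, 1)^3 to a coarse grid
    and return the sorted set of occupied voxel indices (the anchors)."""
    cells = set()
    for point in points:
        cell = tuple(min(int(c * resolution), resolution - 1) for c in point)
        cells.add(cell)
    return sorted(cells)


def positional_embedding(cell, resolution, num_freqs=2):
    """Sinusoidal embedding of a voxel index (an assumed choice; the
    abstract only says positional embeddings guide generation)."""
    emb = []
    for c in cell:
        u = (c + 0.5) / resolution  # voxel center, normalized to [0, 1)
        for k in range(num_freqs):
            emb.append(math.sin((2 ** k) * math.pi * u))
            emb.append(math.cos((2 ** k) * math.pi * u))
    return emb


def toy_velocity(target):
    """Hypothetical oracle velocity for the straight path from noise to
    `target`. A real model would predict this with a transformer that is
    conditioned on the anchors' positional embeddings."""
    def velocity(x, t):
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]
    return velocity


def euler_rectified_flow(x0, velocity, num_steps):
    """Stage 2 stand-in: integrate dx/dt = velocity(x, t) from t=0 to t=1
    with fixed-size Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt  # t stays below 1, so the toy velocity never divides by 0
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    points = [(0.1, 0.1, 0.1), (0.12, 0.11, 0.1), (0.9, 0.5, 0.2)]
    anchors = voxel_anchors(points, resolution=4)
    for cell in anchors:
        noise = [0.3] * 12  # fixed "noise" so the demo is deterministic
        target = positional_embedding(cell, resolution=4)  # arbitrary toy target
        latent = euler_rectified_flow(noise, toy_velocity(target), num_steps=8)
        print(cell, [round(v, 3) for v in latent[:3]])
```

In this sketch the number of latent vectors equals the number of occupied coarse voxels, so the token count follows the shape's extent. This is one way to read the abstract's "token-level test-time scaling", though the paper's actual mechanism may differ.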

Top-level tags: computer vision, model training, aigc
Detailed tags: 3D generation, voxel representation, rectified flow, latent vectors, geometry synthesis

LATTICE: Democratize High-Fidelity 3D Generation at Scale


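1️⃣ One-Sentence Summary

This paper proposes LATTICE, a new framework that combines VoxSet, a novel semi-structured representation, with a two-stage generation pipeline. It addresses the long-standing gap in quality and scalability between 3D and 2D generative models, making it possible to create high-quality 3D digital assets efficiently and at scale.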

