Variance Reduction on the Camera Axis: Multi-View Score Distillation for 3D

📄 Abstract

Score distillation turns a pretrained 2D diffusion model into a 3D generator, but the per-step gradient is estimated from a single randomly chosen view: it is high-variance and blind to global shape consistency. Prior work addresses this by retraining the diffusion prior on multi-view data; this improves consistency but makes the sampling contribution inseparable from prior quality. We instead isolate the sampling axis. The per-step gradient is one noisy sample of an expectation over views; aggregating K samples per step at a fixed total UNet budget reduces variance without touching the prior. We introduce Multi-View Aggregated Score Distillation (MV-SDI), which aggregates gradients from K views per step via gradient accumulation, keeping peak memory unchanged and the 2D prior frozen, and draws views as antithetic antipodal pairs, a prior-independent geometric property, for balanced angular coverage. At a fixed 10,000-UNet-call budget, K=2 raises CLIP R-Precision from 74.8% to 83.8% and CLIP score from 0.297 to 0.312, with consistent gains on HPSv2 and ImageReward and a 0.0% divergence rate on the 43-prompt benchmark; optimization steps halve as a consequence. K=4 gives a fourfold step reduction at R-Precision 86.9% and CLIP 0.307, still well above the single-view baseline on every alignment metric. MV-SDI is compatible with gradient-based score-distillation pipelines, including Score Distillation via Inversion, and requires no retraining and no multi-view data.
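The aggregation scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`sample_antipodal_views`, `steps_for_budget`, `aggregated_gradient`), the elevation range, and the reading of "antipodal" as "opposite in azimuth at the same elevation" are all assumptions made here, and `grad_fn` is a hypothetical stand-in for one score-distillation gradient evaluation (one UNet call) at a given view.

```python
import math
import random


def sample_antipodal_views(k, rng, elev_range=(-10.0, 45.0)):
    """Draw k camera views as (azimuth, elevation) in degrees, in antithetic pairs.

    Assumption: each pair shares an elevation and differs by 180 degrees in
    azimuth. The paper's exact pairing rule may differ.
    """
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be a positive even number to form pairs")
    views = []
    for _ in range(k // 2):
        az = rng.uniform(0.0, 360.0)
        el = rng.uniform(*elev_range)
        views.append((az, el))
        views.append(((az + 180.0) % 360.0, el))  # partner on the opposite side
    return views


def steps_for_budget(total_unet_calls, k):
    """Optimization steps available when each step spends k UNet calls."""
    return total_unet_calls // k


def aggregated_gradient(theta, views, grad_fn):
    """Average per-view gradients one view at a time (gradient accumulation).

    Only one view's gradient is held at once, so peak memory does not grow
    with the number of views. grad_fn(theta, view) is a hypothetical callable
    returning a list of floats with the same length as theta.
    """
    acc = [0.0] * len(theta)
    for view in views:
        g = grad_fn(theta, view)
        for i, gi in enumerate(g):
            acc[i] += gi / len(views)
    return acc


if __name__ == "__main__":
    rng = random.Random(0)
    views = sample_antipodal_views(2, rng)
    # Toy gradient whose sign flips between opposite azimuths.
    toy_grad = lambda theta, view: [math.cos(math.radians(view[0]))]
    print(views)
    print(steps_for_budget(10_000, 2))
    print(aggregated_gradient([0.0], views, toy_grad))
```

Under the assumption that the single-view baseline spends one UNet call per step, `steps_for_budget(10_000, 2)` gives 5,000 steps and `steps_for_budget(10_000, 4)` gives 2,500, consistent with the halving and fourfold reduction the abstract reports. The toy gradient only illustrates how a component that flips sign between opposite views cancels within a pair; it says nothing about the behavior of a real diffusion prior.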


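1️⃣ One-Sentence Summary

This paper proposes a method that requires no retraining and no multi-view data: within a fixed UNet compute budget, it aggregates gradients from multiple random views per step (for example, views drawn in pairs), which reduces the gradient variance incurred when distilling a 2D diffusion model into a 3D generator, significantly improving generation quality and alignment scores while cutting the number of optimization steps.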
