PCA-VAE: Differentiable Subspace Quantization without Codebook Collapse
1️⃣ One-Sentence Summary
This paper proposes a new model called PCA-VAE, which replaces traditional vector quantization with a simple, differentiable online principal component analysis. It achieves higher image reconstruction quality with fewer bits, avoids problems such as codebook collapse, and automatically learns interpretable semantic features.
Vector-quantized autoencoders deliver high-fidelity latents but suffer from inherent flaws: the quantizer is non-differentiable, requires straight-through hacks, and is prone to collapse. We address these issues at the root by replacing VQ with a simple, principled, and fully differentiable alternative: an online PCA bottleneck trained via Oja's rule. The resulting model, PCA-VAE, learns an orthogonal, variance-ordered latent basis without codebooks, commitment losses, or lookup noise. Despite its simplicity, PCA-VAE exceeds VQ-GAN and SimVQ in reconstruction quality on CelebA-HQ while using 10-100x fewer latent bits. It also produces naturally interpretable dimensions (e.g., pose, lighting, gender cues) without adversarial regularization or disentanglement objectives. These results suggest that PCA is a viable replacement for VQ: mathematically grounded, stable, bit-efficient, and semantically structured, offering a new direction for generative models beyond vector quantization.
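The abstract's key ingredient is an online PCA bottleneck trained via Oja's rule. As a minimal sketch of the underlying idea (not the paper's actual implementation, whose architecture and hyperparameters are unknown here), the classic Oja subspace rule updates a basis matrix `W` from streaming samples so that its columns converge to an orthonormal basis of the top-k principal subspace:

```python
import numpy as np

def oja_subspace_step(W, x, lr):
    """One update of Oja's subspace rule.
    W: (d, k) current basis estimate; x: (d,) input sample.
    At equilibrium the columns of W form an (approximately)
    orthonormal basis of the top-k principal subspace."""
    y = W.T @ x  # project the sample onto the current basis
    # Hebbian growth term minus a decay term that keeps W bounded
    return W + lr * (np.outer(x, y) - W @ np.outer(y, y))

# Toy demo: stream with two dominant directions (std 3 and 2 vs 0.1)
rng = np.random.default_rng(0)
d, k = 8, 2
scales = np.array([3.0, 2.0] + [0.1] * (d - 2))
W = 0.1 * rng.normal(size=(d, k))
for _ in range(20000):
    x = scales * rng.normal(size=d)
    W = oja_subspace_step(W, x, lr=0.005)

# Gram matrix should be close to the identity (orthonormal basis),
# with most of the energy concentrated in the first two coordinates.
print(np.round(W.T @ W, 2))
```

Because each step is a plain matrix expression, the update is fully differentiable, which is the property the paper exploits to avoid straight-through estimators; variance ordering (as claimed in the abstract) would additionally require a Sanger/GHA-style deflation term.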
Source: arXiv:2602.18904