arXiv submission date: 2026-02-21
📄 Abstract - PCA-VAE: Differentiable Subspace Quantization without Codebook Collapse

Vector-quantized autoencoders deliver high-fidelity latents but suffer inherent flaws: the quantizer is non-differentiable, requires straight-through hacks, and is prone to collapse. We address these issues at the root by replacing VQ with a simple, principled, and fully differentiable alternative: an online PCA bottleneck trained via Oja's rule. The resulting model, PCA-VAE, learns an orthogonal, variance-ordered latent basis without codebooks, commitment losses, or lookup noise. Despite its simplicity, PCA-VAE exceeds VQ-GAN and SimVQ in reconstruction quality on CelebAHQ while using 10-100x fewer latent bits. It also produces naturally interpretable dimensions (e.g., pose, lighting, gender cues) without adversarial regularization or disentanglement objectives. These results suggest that PCA is a viable replacement for VQ: mathematically grounded, stable, bit-efficient, and semantically structured, offering a new direction for generative models beyond vector quantization.
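The abstract's core mechanism is an online PCA bottleneck trained via Oja's rule. As a minimal sketch (not the authors' implementation), the variance-ordered variant of Oja's rule, the Generalized Hebbian Algorithm (Sanger's rule), can be written in a few lines of numpy; the data distribution and learning rate below are illustrative assumptions:

```python
import numpy as np

def sanger_update(W, x, lr=1e-3):
    """One online step of Sanger's rule (Generalized Hebbian Algorithm).

    W : (k, d) matrix whose rows converge to the top-k principal
        directions, ordered by explained variance.
    x : (d,) zero-mean data sample.
    """
    y = W @ x  # (k,) latent code, analogous to the PCA-VAE bottleneck
    # dW = lr * (y x^T - LT(y y^T) W); the lower-triangular term
    # deflates each row against the rows above it, enforcing ordering.
    dW = lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W + dW

# Toy usage: recover the two highest-variance axes of axis-aligned
# Gaussian data (per-axis std devs are an illustrative assumption).
rng = np.random.default_rng(0)
d, k = 5, 2
scales = np.array([3.0, 2.0, 0.5, 0.3, 0.1])
W = rng.normal(size=(k, d)) * 0.1
for _ in range(20000):
    x = rng.normal(size=d) * scales
    W = sanger_update(W, x)
# Rows of W approach (up to sign) e1 and e2, the top-variance axes.
```

Unlike a VQ lookup, every operation here is differentiable, so no straight-through estimator or commitment loss is needed; the same update can be applied to encoder activations during training.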

Top-level tags: model training, machine learning theory
Detailed tags: vector quantization, autoencoders, differentiable pca, generative models, latent representation

PCA-VAE: Differentiable Subspace Quantization without Codebook Collapse


1️⃣ One-sentence summary

This paper proposes a new model, PCA-VAE, which replaces traditional vector quantization with a simple, fully differentiable online principal component analysis. It achieves higher image-reconstruction quality with far fewer latent bits, avoids problems such as codebook collapse, and automatically learns interpretable semantic features.

Source: arXiv:2602.18904