Vector Quantization using Gaussian Variational Autoencoder
1️⃣ One-sentence summary
This paper proposes a new method called Gaussian Quant that converts a trained Gaussian variational autoencoder into a high-quality discrete encoder without any additional training, outperforming several existing techniques on image compression tasks.
The vector-quantized variational autoencoder (VQ-VAE) is a discrete autoencoder that compresses images into discrete tokens. It is difficult to train due to the discretization step. In this paper, we propose a simple yet effective technique, dubbed Gaussian Quant (GQ), that converts a Gaussian VAE satisfying a certain constraint into a VQ-VAE without training. GQ generates random Gaussian noise as a codebook and finds the noise vector closest to the posterior mean. Theoretically, we prove that when the logarithm of the codebook size exceeds the bits-back coding rate of the Gaussian VAE, a small quantization error is guaranteed. Practically, we propose a heuristic for training Gaussian VAEs for effective GQ, named the target divergence constraint (TDC). Empirically, we show that GQ outperforms previous VQ-VAEs, such as VQGAN, FSQ, LFQ, and BSQ, on both UNet and ViT architectures. Furthermore, TDC also improves upon previous Gaussian VAE discretization methods, such as TokenBridge. The source code is provided in this https URL.
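The core quantization step described in the abstract (a shared random Gaussian codebook plus nearest-neighbor search on posterior means) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the codebook size, latent dimensionality, shared random seed, and plain Euclidean distance are all assumptions for the sketch.

```python
import numpy as np

def gaussian_quantize(mu, codebook_size=1024, dim=16, seed=0):
    """Map posterior means `mu` (shape (N, dim)) to discrete tokens using
    a codebook of pure N(0, I) noise, as Gaussian Quant (GQ) describes:
    no training, just nearest-neighbor search against random noise."""
    # Encoder and decoder can regenerate the same codebook from a shared seed,
    # so only the token indices need to be transmitted.
    rng = np.random.default_rng(seed)
    codebook = rng.standard_normal((codebook_size, dim))
    # Euclidean nearest-neighbor search (an assumption for this sketch).
    d2 = ((mu[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d2.argmin(axis=1)           # discrete tokens, one per input vector
    return idx, codebook[idx]         # tokens and their quantized vectors

# Toy usage: quantize 4 random "posterior means".
mu = np.random.default_rng(1).standard_normal((4, 16))
idx, q = gaussian_quantize(mu)
```

The theoretical guarantee quoted above then says that once `log(codebook_size)` exceeds the bits-back coding rate of the underlying Gaussian VAE, the distance between `mu` and its chosen codeword is small with high probability.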
Source: arXiv:2512.06609