Mutual Information Collapse Explains Disentanglement Failure in $\beta$-VAEs
1️⃣ One-Sentence Summary
This paper finds that in the classic $\beta$-VAE, overly strong regularization drives the mutual information between the latents and the data factors to zero, destroying disentanglement; it proposes a new two-parameter model that separates and controls these two objectives stably.
The $\beta$-VAE is a foundational framework for unsupervised disentanglement, using $\beta$ to regulate the trade-off between latent factorization and reconstruction fidelity. Empirically, however, disentanglement performance exhibits a pervasive non-monotonic trend: benchmarks such as MIG and SAP typically peak at intermediate $\beta$ and collapse as regularization increases. We demonstrate that this collapse is a fundamental information-theoretic failure, where strong Kullback-Leibler pressure promotes marginal independence at the expense of the latent channel's semantic informativeness. By formalizing this mechanism in a linear-Gaussian setting, we prove that for $\beta > 1$, stationarity-induced dynamics trigger a spectral contraction of the encoder gain, driving latent-factor mutual information to zero. To resolve this, we introduce the $\lambda\beta$-VAE, which decouples regularization pressure from informational collapse via an auxiliary $L_2$ reconstruction penalty $\lambda$. Extensive experiments on dSprites, Shapes3D, and MPI3D-real confirm that $\lambda > 0$ stabilizes disentanglement and restores latent informativeness over a significantly broader range of $\beta$, providing a principled theoretical justification for dual-parameter regularization in variational inference backbones.
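The dual-parameter objective described above can be sketched as follows. This is a minimal illustrative implementation, not the paper's exact formulation: the function name, the form of the auxiliary penalty (an $L_2$ reconstruction term weighted by $\lambda$, added alongside the standard $\beta$-weighted KL), and the diagonal-Gaussian encoder assumption are all stand-ins inferred from the abstract.

```python
import numpy as np

def lambda_beta_vae_loss(x, x_hat, mu, logvar, beta=4.0, lam=0.5):
    """Sketch of a two-parameter (lambda-beta) VAE objective.

    x, x_hat : (batch, data_dim) inputs and reconstructions
    mu, logvar : (batch, latent_dim) diagonal-Gaussian encoder outputs
    beta : weight on the KL regularizer, as in the standard beta-VAE
    lam  : weight on an auxiliary L2 reconstruction penalty, intended to
           keep the latent channel informative even when beta is large
    """
    # Gaussian reconstruction term (up to additive constants): squared error
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    # KL( q(z|x) || N(0, I) ) for a diagonal-Gaussian posterior
    kl = np.mean(0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1))
    # Auxiliary L2 penalty: decouples reconstruction pressure from beta,
    # counteracting the KL-driven collapse of encoder gain for beta > 1
    return recon + beta * kl + lam * recon
```

With `lam=0.0` this reduces to the ordinary $\beta$-VAE objective; a positive `lam` re-weights reconstruction independently of $\beta$, which is the decoupling the paper attributes the stabilized disentanglement to.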
Source: arXiv:2602.09277