Efficient Learning of Deep State Space Models via Importance Smoothing
1️⃣ One-sentence summary
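This paper proposes a new training method, parallel variational Monte Carlo, that combines the strengths of the two mainstream strategies for training deep state space models (variational auto-encoding and sequential Monte Carlo); it preserves model performance while training 10 times faster than the fastest sequential Monte Carlo method, and it applies to both discriminative and generative tasks.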
Latent state space systems are ubiquitous in statistical modelling, arising naturally when a time series is observed through a noisy measurement function. However, training deep state space models (DSSMs) at scale remains difficult. Two largely distinct strategies and literatures have developed around the training of DSSMs. Firstly, auto-encoding DSSMs train generative DSSMs by optimising a variational lower bound. Secondly, DSSMs can be trained by back-propagating through the outputs of a classical sequential Monte Carlo (SMC) algorithm. Such approaches can train DSSMs for discriminative as well as generative tasks; however, due to the sequentiality of their forward pass, they scale poorly on modern hardware. We propose a new training method, \emph{parallel variational Monte Carlo} (PVMC), that bridges the gap between the two paradigms and can be used robustly to train DSSMs for both discriminative and generative tasks. Our method matches or exceeds state-of-the-art results on a set of baseline experiments and trains $10\times$ faster than the fastest competing SMC approach.
Source: arXiv: 2605.21108