通过自我对弈发现格基约简策略 / Discovering Lattice Reduction Strategies via Self-Play
1️⃣ 一句话总结
本工作将格基约简问题建模为单人马尔可夫决策过程,并利用类似AlphaZero的深度强化学习方法,训练出一个名为DeltaStar的神经网络策略;该策略仅在8维小规模格上训练,就能在无需重新训练的情况下,零样本推广到更高维度的格,并比经典的LLL算法使用更少的操作步骤。
The Lenstra-Lenstra-Lovász (LLL) algorithm is a seminal contribution to computer science used for lattice basis reduction, yet its polynomial-time outputs produce bases that are far from optimal as the dimension grows. We show that deep reinforcement learning can discover strictly superior, generalizable reduction strategies by interacting with the primitive action space of LLL. We formulate lattice reduction as a single-player Markov Decision Process (MDP) and train a deep residual network using an AlphaZero-style self-play pipeline augmented with adaptive-horizon MCTS (Monte Carlo Tree Search), which couples multi-step network predictions with an entropy-gated expansion mechanism. The resulting policy, DeltaStar, is trained exclusively on small $8$-dimensional $q$-ary lattices and requires fewer primitive row operations than LLL. Crucially, it generalizes zero-shot to unseen moduli and higher dimensions up to $n=32$ without retraining.
通过自我对弈发现格基约简策略 / Discovering Lattice Reduction Strategies via Self-Play
本工作将格基约简问题建模为单人马尔可夫决策过程,并利用类似AlphaZero的深度强化学习方法,训练出一个名为DeltaStar的神经网络策略;该策略仅在8维小规模格上训练,就能在无需重新训练的情况下,零样本推广到更高维度的格,并比经典的LLL算法使用更少的操作步骤。
源自 arXiv: 2606.15301