Neural Learning of Fast Matrix Multiplication Algorithms: A StrassenNet Approach
1️⃣ One-sentence summary
This paper designs a neural network called StrassenNet that automatically learns and rediscovers the classic Strassen fast matrix multiplication algorithm, and uses experimental evidence to conjecture the minimal computational complexity required for $3\times 3$ matrix multiplication.
Fast matrix multiplication can be described as the search for low-rank decompositions of the matrix--multiplication tensor. We design a neural architecture, \textsc{StrassenNet}, which reproduces the Strassen algorithm for $2\times 2$ multiplication. Across many independent runs the network always converges to a rank-$7$ tensor, thus numerically recovering Strassen's optimal algorithm. We then train the same architecture on $3\times 3$ multiplication with rank $r\in\{19,\dots,23\}$. Our experiments reveal a clear numerical threshold: models with $r=23$ attain significantly lower validation error than those with $r\le 22$, suggesting that $r=23$ may be the smallest effective rank of the $3\times 3$ matrix--multiplication tensor. We also sketch an extension of the method to border-rank decompositions via an $\varepsilon$--parametrisation and report preliminary results consistent with the known bounds for the border rank of the $3\times 3$ matrix--multiplication tensor.
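The rank-7 decomposition that the network converges to can be checked directly against the naive product. A minimal sketch, assuming NumPy, using the standard statement of Strassen's seven products (variable names `A`, `B`, `M1`..`M7` are ours, not from the paper):

```python
# Verify Strassen's rank-7 bilinear algorithm for 2x2 matrix
# multiplication -- the decomposition StrassenNet rediscovers.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))

# Strassen's 7 products: each is ONE multiplication of linear
# combinations of entries, instead of the naive 8 multiplications.
M1 = (A[0, 0] + A[1, 1]) * (B[0, 0] + B[1, 1])
M2 = (A[1, 0] + A[1, 1]) * B[0, 0]
M3 = A[0, 0] * (B[0, 1] - B[1, 1])
M4 = A[1, 1] * (B[1, 0] - B[0, 0])
M5 = (A[0, 0] + A[0, 1]) * B[1, 1]
M6 = (A[1, 0] - A[0, 0]) * (B[0, 0] + B[0, 1])
M7 = (A[0, 1] - A[1, 1]) * (B[1, 0] + B[1, 1])

# Recombine the 7 products (additions only) into C = A @ B.
C = np.array([[M1 + M4 - M5 + M7, M3 + M5],
              [M2 + M4,           M1 - M2 + M3 + M6]])

assert np.allclose(C, A @ B)  # agrees with the naive product
```

Applied recursively to block matrices, this recombination is what yields the $O(n^{\log_2 7})$ complexity; a learned rank-$r$ tensor for $3\times 3$ multiplication would play the same role with $r$ products per recursion level.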
Source: arXiv:2602.21797