arXiv submission date: 2026-02-25
📄 Abstract - Neural Learning of Fast Matrix Multiplication Algorithms: A StrassenNet Approach

Fast matrix multiplication can be described as a search for low-rank decompositions of the matrix-multiplication tensor. We design a neural architecture, \textsc{StrassenNet}, which reproduces Strassen's algorithm for $2\times 2$ multiplication. Across many independent runs the network always converges to a rank-$7$ tensor, thus numerically recovering Strassen's optimal algorithm. We then train the same architecture on $3\times 3$ multiplication with rank $r\in\{19,\dots,23\}$. Our experiments reveal a clear numerical threshold: models with $r=23$ attain significantly lower validation error than those with $r\le 22$, suggesting that $23$ may be the smallest effective rank of the $3\times 3$ matrix-multiplication tensor. We also sketch an extension of the method to border-rank decompositions via an $\varepsilon$-parametrisation and report preliminary results consistent with the known bounds for the border rank of the $3\times 3$ matrix-multiplication tensor.
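The rank-$7$ decomposition that the paper reports StrassenNet converging to is the classical Strassen algorithm: seven scalar products of linear combinations of the entries, recombined linearly into the $2\times 2$ result. A minimal sketch of that algorithm (not the paper's neural implementation, just the target decomposition) can be checked against ordinary matrix multiplication:

```python
import numpy as np

def strassen_2x2(A, B):
    """Strassen's 2x2 multiplication: 7 products instead of the naive 8.

    This is the rank-7 decomposition of the 2x2 matrix-multiplication
    tensor that the paper reports StrassenNet recovers numerically.
    """
    a, b, c, d = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
    e, f, g, h = B[0, 0], B[0, 1], B[1, 0], B[1, 1]

    # The seven Strassen products (rank-one terms of the decomposition).
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)

    # Linear recombination into the 2x2 product C = A @ B.
    return np.array([
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4,           m1 - m2 + m3 + m6],
    ])

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])
print(np.allclose(strassen_2x2(A, B), A @ B))  # → True
```

For $3\times 3$ multiplication the same naive count is $27$ products, so the paper's observed threshold at rank $23$ corresponds to saving four multiplications per block.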

Top-level tags: machine learning, model training, theory
Detailed tags: matrix multiplication, neural architecture, tensor decomposition, Strassen algorithm, border rank

Neural Learning of Fast Matrix Multiplication Algorithms: A StrassenNet Approach


1️⃣ One-sentence summary

This paper designs a neural network called StrassenNet that automatically learns and rediscovers the classical Strassen fast matrix multiplication algorithm, and uses experimental evidence to conjecture the minimal computational complexity required for 3x3 matrix multiplication.

Source: arXiv:2602.21797