Neural Learning of Fast Matrix Multiplication Algorithms: A StrassenNet Approach
1️⃣ One-sentence summary
This paper designs a neural network called StrassenNet that automatically learns and rediscovers the classic Strassen fast matrix multiplication algorithm, and uses experimental evidence to conjecture the minimal computational complexity required for $3\times 3$ matrix multiplication.
Fast matrix multiplication can be described as the search for low-rank decompositions of the matrix--multiplication tensor. We design a neural architecture, \textsc{StrassenNet}, which reproduces the Strassen algorithm for $2\times 2$ multiplication. Across many independent runs the network always converges to a rank-$7$ tensor, thus numerically recovering Strassen's optimal algorithm. We then train the same architecture on $3\times 3$ multiplication with rank $r\in\{19,\dots,23\}$. Our experiments reveal a clear numerical threshold: models with $r=23$ attain significantly lower validation error than those with $r\le 22$, suggesting that $r=23$ may be the smallest effective rank of the $3\times 3$ matrix--multiplication tensor. We also sketch an extension of the method to border-rank decompositions via an $\varepsilon$--parametrisation and report preliminary results consistent with the known bounds for the border rank of the $3\times 3$ matrix--multiplication tensor.
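The rank-7 decomposition that the network converges to can be checked directly against the naive product. A minimal sketch, assuming NumPy, using the standard statement of Strassen's seven products (variable names `A`, `B`, `M1`..`M7` are ours, not from the paper):

```python
# Verify Strassen's rank-7 bilinear algorithm for 2x2 matrix
# multiplication -- the decomposition StrassenNet rediscovers.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))

# Strassen's 7 products: each is ONE multiplication of linear
# combinations of entries, instead of the naive 8 multiplications.
M1 = (A[0, 0] + A[1, 1]) * (B[0, 0] + B[1, 1])
M2 = (A[1, 0] + A[1, 1]) * B[0, 0]
M3 = A[0, 0] * (B[0, 1] - B[1, 1])
M4 = A[1, 1] * (B[1, 0] - B[0, 0])
M5 = (A[0, 0] + A[0, 1]) * B[1, 1]
M6 = (A[1, 0] - A[0, 0]) * (B[0, 0] + B[0, 1])
M7 = (A[0, 1] - A[1, 1]) * (B[1, 0] + B[1, 1])

# Recombine the 7 products (additions only) into C = A @ B.
C = np.array([[M1 + M4 - M5 + M7, M3 + M5],
              [M2 + M4,           M1 - M2 + M3 + M6]])

assert np.allclose(C, A @ B)  # agrees with the naive product
```

Applied recursively to block matrices, this recombination is what yields the $O(n^{\log_2 7})$ complexity; a learned rank-$r$ tensor for $3\times 3$ multiplication would play the same role with $r$ products per recursion level.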
Source: arXiv:2602.21797