arXiv submission date: 2026-02-28
📄 Abstract - Wave-Attractor-Tree: A Hierarchical Binary Tree Reduction Architecture for Efficient Sequence Modeling

This work introduces a hierarchical binary-tree reduction that replaces standard self-attention. The core idea is a recursive Gated Linear Unit (GLU) merge operation, which achieves O(n) total merge operations, O(log n) parallel depth, O(n d^2) total work, and O(n) space complexity. In the reported experiments, the model significantly outperforms standard Transformers in both convergence speed and accuracy on long-range structural dependencies, specifically in settings where a hierarchical inductive bias is critical.
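A minimal sketch of the binary-tree reduction the abstract describes, assuming a sigmoid-gated GLU as the merge function; the names (`glu_merge`, `tree_reduce`) and the exact parameterization are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch: bottom-up binary-tree reduction with a GLU merge.
# Parameter shapes and gating choice are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
d = 8                                           # model dimension
W_v = rng.normal(scale=0.1, size=(2 * d, d))    # value projection (2d -> d)
W_g = rng.normal(scale=0.1, size=(2 * d, d))    # gate projection (2d -> d)

def glu_merge(left, right):
    """Merge two child states into one parent state via a gated linear unit."""
    x = np.concatenate([left, right], axis=-1)  # (2d,)
    gate = 1.0 / (1.0 + np.exp(-(x @ W_g)))     # sigmoid gate, (d,)
    return (x @ W_v) * gate                     # (d,)

def tree_reduce(tokens):
    """Reduce a sequence of token vectors bottom-up over a binary tree.

    Uses n-1 merges in total (O(n)); each level halves the sequence,
    so there are O(log n) sequential levels (the parallel depth).
    """
    level = list(tokens)
    while len(level) > 1:
        if len(level) % 2 == 1:                 # carry an odd element up unchanged
            carry, level = [level[-1]], level[:-1]
        else:
            carry = []
        level = [glu_merge(level[i], level[i + 1])
                 for i in range(0, len(level), 2)] + carry
    return level[0]

seq = rng.normal(size=(16, d))                  # toy sequence of 16 tokens
root = tree_reduce(seq)                         # single d-dimensional root state
```

Each merge costs O(d^2) (two matrix-vector products against 2d-by-d weights), and there are n-1 merges, matching the stated O(n d^2) total work.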

Top-level tags: natural language processing, model training, theory
Detailed tags: sequence modeling, efficient attention, hierarchical architecture, binary tree, linear complexity

Wave-Attractor-Tree: A Hierarchical Binary Tree Reduction Architecture for Efficient Sequence Modeling


1️⃣ One-sentence summary

This paper proposes a new model architecture called the "Wave-Attractor-Tree", which replaces the standard Transformer's self-attention mechanism with a hierarchical binary-tree structure. Through recursive merge operations it significantly reduces computational complexity and memory consumption while maintaining high performance, and it is particularly well suited to complex sequence data that requires understanding hierarchical structure.

Source: arXiv:2603.00812