Beyond Student: An Asymmetric Network for Neural Network Inheritance
1️⃣ One-Sentence Summary
This paper proposes a new method, InherNet, which directly inherits the structure and core knowledge of a large teacher network through asymmetric low-rank decomposition, yielding a lightweight network with stronger performance and fewer parameters than traditional knowledge distillation methods.
Knowledge Distillation (KD) has emerged as a powerful technique for model compression, enabling lightweight student networks to benefit from the performance of redundant teacher networks. However, the inherent capacity gap often limits the performance of student networks. The expressiveness of pretrained teacher networks raises a compelling research question: is there a type of network that can not only inherit the teacher's structure but also maximize the inheritance of its knowledge? Furthermore, how does the performance of such an inheriting network compare to that of student networks, when both benefit from the same teacher network? To explore these questions, we propose InherNet, a neural network inheritance method that performs asymmetric low-rank decomposition on the teacher's weights and reconstructs a lightweight yet expressive network without significant architectural disruption. By leveraging Singular Value Decomposition (SVD) for initialization to ensure the inheritance of principal knowledge, InherNet effectively balances depth, width, and compression efficiency. Experimental results across unimodal and multimodal tasks demonstrate that InherNet outperforms student networks of similar parameter size. Our findings reveal a promising direction for efficient model compression beyond traditional distillation.
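The abstract specifies SVD-based initialization of the low-rank factors but not the exact asymmetric decomposition scheme. Below is a minimal PyTorch sketch of the general idea: one teacher linear layer is replaced by two smaller layers whose weights are initialized from the truncated SVD of the teacher's weight matrix. The function name `svd_inherit_linear`, the placement of the singular values in the down-projection, and the rank choice are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

def svd_inherit_linear(teacher_linear: nn.Linear, rank: int) -> nn.Sequential:
    """Replace one teacher nn.Linear with two low-rank layers initialized
    from the truncated SVD of the teacher's weight (illustrative sketch,
    not the paper's exact asymmetric scheme)."""
    W = teacher_linear.weight.data                    # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_r, S_r, Vh_r = U[:, :rank], S[:rank], Vh[:rank, :]

    # Down-projection carries diag(S_r) @ Vh_r; no bias.
    down = nn.Linear(teacher_linear.in_features, rank, bias=False)
    down.weight.data = S_r.unsqueeze(1) * Vh_r        # (rank, in_features)

    # Up-projection carries U_r and inherits the teacher's bias, if any.
    up = nn.Linear(rank, teacher_linear.out_features,
                   bias=teacher_linear.bias is not None)
    up.weight.data = U_r.clone()                      # (out_features, rank)
    if teacher_linear.bias is not None:
        up.bias.data = teacher_linear.bias.data.clone()

    return nn.Sequential(down, up)

# Usage: inherit a 1024 -> 1024 teacher layer at rank 64.
teacher = nn.Linear(1024, 1024)
student_block = svd_inherit_linear(teacher, rank=64)

n_teacher = sum(p.numel() for p in teacher.parameters())
n_student = sum(p.numel() for p in student_block.parameters())
print(f"parameters: teacher={n_teacher}, inherited block={n_student}")

x = torch.randn(2, 1024)
err = (teacher(x) - student_block(x)).abs().max()
print(f"max reconstruction error at rank 64: {err.item():.4f}")
```

At rank 64 this block keeps roughly an eighth of the original layer's parameters while preserving the teacher's principal singular directions; how InherNet allocates ranks asymmetrically across layers and balances depth versus width is the paper's contribution and is not captured by this sketch.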
Source: arXiv:2602.09509