arXiv submission date: 2026-05-19
📄 Abstract - Cross-Paradigm Knowledge Distillation: A Comprehensive Study of Bidirectional Transfer Between Random Forests and Deep Neural Networks for Big Data Applications

The exponential growth of big data has intensified the need for efficient and interpretable machine learning models that can handle diverse data characteristics while maintaining computational efficiency. Knowledge distillation has primarily focused on neural network-to-neural network transfer, leaving cross-paradigm knowledge transfer largely unexplored. This paper presents the first comprehensive study of bidirectional knowledge distillation between Random Forests (RF) and Deep Neural Networks (DNN), addressing critical gaps in ensemble learning and model compression for big data applications. We propose novel methodologies including progressive multi-stage distillation, multi-teacher ensemble distillation from diverse tree models, and uncertainty-aware cross-paradigm transfer mechanisms. Through 144 comprehensive experiments across 6 diverse datasets encompassing classification and regression tasks, we demonstrate that bidirectional RF-DL distillation achieves competitive performance while providing complementary benefits: interpretability from tree models and expressiveness from neural networks. Our results show that multi-teacher ensemble distillation consistently outperforms traditional approaches, with NN-COMPACT achieving 98.13% classification accuracy and NN-WIDE reaching 92.6% R^2 score in regression tasks. The proposed framework enables deployment flexibility in big data environments, allowing optimal model selection based on computational constraints and interpretability requirements. This work establishes a new research direction in cross-paradigm knowledge transfer with significant implications for interpretable AI and scalable model deployment in resource-constrained big data systems.
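
The tree-to-network direction described in the abstract can be illustrated with a toy sketch. This is not the paper's method: it assumes a bagged ensemble of depth-1 stumps as a stand-in for a random forest teacher, a single-layer logistic model as a stand-in for the neural student, and synthetic two-blob data. All function names, hyperparameters, and the dataset are hypothetical. The core idea it shows is that the student is fit to the teacher's averaged leaf probabilities (soft labels) rather than to the hard class labels.

```python
import math
import random

random.seed(0)

# Toy binary dataset: two Gaussian blobs (hypothetical data, not from the paper).
X = [[random.gauss(c, 1.0), random.gauss(c, 1.0)] for c in (-2.0, 2.0) for _ in range(100)]
y = [0] * 100 + [1] * 100


def fit_stump(X, y, feature):
    """Depth-1 tree on one feature; each leaf stores the mean label (a probability)."""
    mean = sum(y) / len(y)
    best = (float("inf"), 0.0, mean, mean)  # fallback if no valid split exists
    for t in sorted({row[feature] for row in X}):
        left = [yi for row, yi in zip(X, y) if row[feature] <= t]
        right = [yi for row, yi in zip(X, y) if row[feature] > t]
        if not right:
            continue  # a threshold at the maximum value splits nothing
        p_left, p_right = sum(left) / len(left), sum(right) / len(right)
        sse = sum((yi - p_left) ** 2 for yi in left) + sum((yi - p_right) ** 2 for yi in right)
        if sse < best[0]:
            best = (sse, t, p_left, p_right)
    return feature, best[1], best[2], best[3]


def fit_forest(X, y, n_trees=25):
    """Bagged stumps with one random feature per tree (stand-in for a random forest)."""
    forest, n = [], len(X)
    for _ in range(n_trees):
        idx = [random.randrange(n) for _ in range(n)]  # bootstrap sample
        feature = random.randrange(len(X[0]))
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest


def forest_proba(forest, row):
    """Teacher soft label: leaf probability averaged over all trees."""
    votes = [p_l if row[f] <= t else p_r for f, t, p_l, p_r in forest]
    return sum(votes) / len(votes)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def distill(X, soft_targets, epochs=300, lr=0.1):
    """Fit a single-layer student to soft targets by gradient descent on cross-entropy."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, q in zip(X, soft_targets):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b)
            for j, xj in enumerate(row):
                grad_w[j] += (p - q) * xj / n  # d(cross-entropy)/dw_j with soft target q
            grad_b += (p - q) / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


forest = fit_forest(X, y)
soft_targets = [forest_proba(forest, row) for row in X]
weights, bias = distill(X, soft_targets)

# Evaluate the student against the original hard labels.
student_pred = [int(sigmoid(sum(wj * xj for wj, xj in zip(weights, row)) + bias) > 0.5) for row in X]
student_acc = sum(int(p == t) for p, t in zip(student_pred, y)) / len(y)
print(f"student accuracy on hard labels: {student_acc:.3f}")
```

The student never sees `y` during training; it only sees `soft_targets`. A multi-teacher variant in this toy setting would average the soft labels of several such ensembles before calling `distill`.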

Top-level tags: machine learning systems model training
Detailed tags: knowledge distillation random forests deep neural networks big data cross-paradigm

Cross-Paradigm Knowledge Distillation: A Comprehensive Study of Bidirectional Transfer Between Random Forests and Deep Neural Networks for Big Data Applications


1️⃣ One-sentence summary

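This paper is the first systematic study of bidirectional knowledge distillation between random forests and deep neural networks; using new methods such as multi-stage distillation, multi-teacher ensembles, and uncertainty-aware transfer, it achieves strong performance on classification and regression tasks, combines the interpretability of random forests with the expressive power of neural networks, and points to a new direction for model compression and flexible deployment in big data environments.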

Source: arXiv:2605.19299