广义最优分类树:一种混合整数规划方法 / Generalized Optimal Classification Trees: A Mixed-Integer Programming Approach
1️⃣ 一句话总结
这篇论文提出了一种基于混合整数规划的通用框架,能够高效地构建最优分类决策树,特别针对F1分数等非线性评价指标和类别不平衡问题,并通过多种加速技术提升了模型在大规模数据集上的求解速度和预测性能。
Global optimization of decision trees is a long-standing challenge in combinatorial optimization, yet such models play an important role in interpretable machine learning. Although the problem has been investigated for several decades, only recent advances in discrete optimization have enabled practical algorithms for solving optimal classification tree problems on real-world datasets. Mixed-integer programming (MIP) offers a high degree of modeling flexibility, and we therefore propose a MIP-based framework for learning optimal classification trees under nonlinear performance metrics, such as the F1-score, that explicitly addresses class imbalance. To improve scalability, we develop problem-specific acceleration techniques, including a tailored branch-and-cut algorithm, an instance-reduction scheme, and warm-start strategies. We evaluate the proposed approach on 50 benchmark datasets. The computational results show that the framework can efficiently optimize nonlinear metrics while achieving strong predictive performance and reduced solution times compared with existing methods.
广义最优分类树:一种混合整数规划方法 / Generalized Optimal Classification Trees: A Mixed-Integer Programming Approach
这篇论文提出了一种基于混合整数规划的通用框架,能够高效地构建最优分类决策树,特别针对F1分数等非线性评价指标和类别不平衡问题,并通过多种加速技术提升了模型在大规模数据集上的求解速度和预测性能。
源自 arXiv: 2602.02173