Differentiable Zero-One Loss via Hypersimplex Projections
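1️⃣ One-sentence summary
This paper proposes a new method called Soft-Binary-Argmax, which uses hypersimplex projections to make the otherwise non-differentiable zero-one loss differentiable, so that machine learning models can directly optimize classification accuracy during training; it also reports significantly better generalization under large-batch training.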
Recent advances in machine learning have emphasized the integration of structured optimization components into end-to-end differentiable models, enabling richer inductive biases and tighter alignment with task-specific objectives. In this work, we introduce a novel differentiable approximation to the zero-one loss, which has long been considered the gold standard for classification performance yet is incompatible with gradient-based optimization because it is non-differentiable. Our method constructs a smooth, order-preserving projection onto the (n, k)-dimensional hypersimplex through a constrained optimization framework, leading to a new operator we term Soft-Binary-Argmax. After deriving its mathematical properties, we show how its Jacobian can be efficiently computed and integrated into binary and multiclass learning systems. Empirically, our approach achieves significant improvements in generalization by imposing geometric consistency constraints on the output logits, thereby narrowing the performance gap traditionally observed in large-batch training.
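To make the idea of a smooth, order-preserving map onto the hypersimplex concrete, here is a minimal sketch. It is not the paper's Soft-Binary-Argmax operator, whose exact form the abstract does not give. It uses a simple stand-in construction: shift the logits by a threshold, pass them through a sigmoid, and choose the threshold by bisection so that the outputs sum to k. The result lies in the set {x in [0, 1]^n : x_1 + ... + x_n = k}, and larger logits receive larger outputs. The names `soft_topk_sketch`, `temperature`, and `iters` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def _sigmoid(u):
    # Numerically stable logistic function via tanh.
    return 0.5 * (1.0 + np.tanh(0.5 * u))

def soft_topk_sketch(z, k, temperature=1.0, iters=60):
    """Illustrative stand-in (an assumption, not the paper's operator).

    Maps logits z to x with 0 <= x_i <= 1 and sum(x) ~= k,
    preserving the ordering of the entries of z.
    """
    z = np.asarray(z, dtype=float)
    n = z.size
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    # sum(sigmoid((z - tau) / T)) decreases monotonically in tau,
    # so the threshold tau can be bracketed and found by bisection.
    lo = z.min() - 50.0 * temperature   # here the sum is close to n
    hi = z.max() + 50.0 * temperature   # here the sum is close to 0
    for _ in range(iters):
        tau = 0.5 * (lo + hi)
        total = _sigmoid((z - tau) / temperature).sum()
        if total > k:
            lo = tau   # outputs sum to too much: raise the threshold
        else:
            hi = tau   # outputs sum to too little: lower the threshold
    tau = 0.5 * (lo + hi)
    return _sigmoid((z - tau) / temperature)

z = np.array([2.0, -1.0, 0.5, 3.0])
x = soft_topk_sketch(z, k=2)
```

Lowering `temperature` pushes the output toward a hard 0/1 indicator of the k largest logits, while raising it gives a smoother map with better-behaved gradients.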
Source: arXiv: 2602.23336