Differentiable Zero-One Loss via Hypersimplex Projections

📄 Abstract - Differentiable Zero-One Loss via Hypersimplex Projections

Recent advances in machine learning have emphasized the integration of structured optimization components into end-to-end differentiable models, enabling richer inductive biases and tighter alignment with task-specific objectives. In this work, we introduce a novel differentiable approximation to the zero-one loss, long considered the gold standard for classification performance yet incompatible with gradient-based optimization due to its non-differentiability. Our method constructs a smooth, order-preserving projection onto the n,k-dimensional hypersimplex through a constrained optimization framework, leading to a new operator we term Soft-Binary-Argmax. After deriving its mathematical properties, we show how its Jacobian can be efficiently computed and integrated into binary and multiclass learning systems. Empirically, our approach achieves significant improvements in generalization under large-batch training by imposing geometric consistency constraints on the output logits, thereby narrowing the performance gap traditionally observed in that regime.
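To make the geometric object concrete, the sketch below computes the plain Euclidean projection onto the hypersimplex, taken here to be the set of vectors y in [0,1]^n whose entries sum to k (a standard definition, assumed rather than quoted from the paper). This is not the Soft-Binary-Argmax operator itself, whose construction the abstract does not spell out: the Euclidean projection is order-preserving but only piecewise linear, whereas the paper describes a smooth operator. The function name `project_hypersimplex` and the bisection approach are illustrative choices.

```python
def project_hypersimplex(x, k, iters=100):
    """Euclidean projection of x onto {y in [0,1]^n : sum(y) = k}.

    The projection has the form y_i = clip(x_i - tau, 0, 1) for a scalar
    threshold tau; we locate tau by bisection on the clipped sum.
    Illustrative sketch only, not the paper's Soft-Binary-Argmax.
    """
    n = len(x)
    if not 0 <= k <= n:
        raise ValueError("k must lie in [0, n]")

    def clipped_sum(tau):
        # Non-increasing, continuous function of tau.
        return sum(min(1.0, max(0.0, xi - tau)) for xi in x)

    lo = min(x) - 1.0  # every entry clips to 1, so the sum is n >= k
    hi = max(x)        # every entry clips to 0, so the sum is 0 <= k
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if clipped_sum(mid) > k:
            lo = mid   # sum too large: raise the threshold
        else:
            hi = mid   # sum small enough: lower the threshold
    tau = 0.5 * (lo + hi)
    return [min(1.0, max(0.0, xi - tau)) for xi in x]


if __name__ == "__main__":
    logits = [2.0, 0.5, -1.0, 0.3]
    # Approximately [1.0, 0.6, 0.0, 0.4]: entries sum to 2 and keep the
    # ordering of the input logits.
    print(project_hypersimplex(logits, k=2))
```

Because the clipping is monotone in each input, larger logits never receive smaller outputs, which is the order-preserving property the abstract refers to. A differentiable variant would presumably add some form of smoothing to this projection; the abstract gives no details on that.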

1️⃣ One-Sentence Summary

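This paper proposes a new method called Soft-Binary-Argmax, which uses a hypersimplex projection to make the otherwise non-differentiable zero-one loss differentiable. This lets machine learning models optimize classification accuracy directly during training, and it significantly improves generalization in large-batch training.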
