arXiv submission date: 2026-06-17
📄 Abstract - Target-confidence Recourse Using tSeTlin machines: TRUST

Counterfactual explanations are widely used to provide algorithmic recourse in high-stakes decision-making systems. Most existing methods seek the smallest change to an input that flips a model's decision. However, decision-makers often rely not only on predicted labels but also on confidence thresholds and risk margins. Counterfactuals that barely cross a decision boundary can be fragile and unstable under noise or model variation. In this paper, we propose Target-confidence Recourse Using tSeTlin machines (TRUST), a framework in which users explicitly specify the desired prediction confidence for recourse. Rather than generating counterfactuals and evaluating confidence afterward, TRUST directly searches for minimal changes that satisfy a user-defined confidence target, enabling comparison of recourse options in terms of cost, confidence, and robustness. We instantiate TRUST using a Probabilistic Tsetlin Machine (PTM) combined with Bayesian optimization. The probabilistic clause-based structure of PTM links prediction confidence to the stability of decision rules. We show that counterfactuals satisfying the same rules can still differ substantially in reliability depending on how securely they satisfy those rules, revealing whether decisions are supported by robust or fragile clause activations. Experiments on synthetic and real-world datasets demonstrate that target-confidence counterfactuals produce more robust and interpretable recourse than conventional boundary-based approaches. Across multiple benchmarks, TRUST achieves perfect robustness while maintaining low recourse cost, including an L2 distance of 0.10 on the Haberman dataset at 0.92 confidence. By explicitly controlling confidence and exposing rule-level stability, TRUST provides actionable recourse for high-stakes decision support.
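The idea of searching directly for a minimal change that meets a confidence target, rather than one that merely flips the label, can be illustrated with a small sketch. This is not the paper's implementation: a fixed logistic model stands in for the Probabilistic Tsetlin Machine, plain random search stands in for Bayesian optimization, and every name and number here (`confidence`, `target_confidence_recourse`, `robustness`, the weights, the noise level) is a hypothetical choice made for illustration. Robustness is measured, as an assumption, as the fraction of noisy copies of the counterfactual that keep the favourable label.

```python
import math
import random


def confidence(x, w=(1.5, -2.0), b=-0.5):
    """Stand-in classifier (assumption): logistic probability of the favourable class."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def l2(a, b):
    """Euclidean distance, used here as the recourse cost."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def target_confidence_recourse(x0, target, n_samples=20000, radius=5.0, seed=0):
    """Random search (a stand-in for Bayesian optimization) for the closest
    candidate whose predicted confidence is at least `target`."""
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(n_samples):
        cand = [xi + rng.uniform(-radius, radius) for xi in x0]
        if confidence(cand) >= target:
            cost = l2(cand, x0)
            if cost < best_cost:
                best, best_cost = cand, cost
    return best, best_cost


def robustness(x, sigma=0.3, trials=1000, seed=1):
    """Fraction of Gaussian-perturbed copies of x that stay favourable (confidence >= 0.5)."""
    rng = random.Random(seed)
    kept = sum(
        confidence([xi + rng.gauss(0.0, sigma) for xi in x]) >= 0.5
        for _ in range(trials)
    )
    return kept / trials


x0 = [0.0, 1.0]  # hypothetical rejected input

# Conventional boundary-based recourse: just cross the 0.5 threshold.
boundary_cf, boundary_cost = target_confidence_recourse(x0, target=0.5)

# Target-confidence recourse: require 0.92 confidence up front.
confident_cf, confident_cost = target_confidence_recourse(x0, target=0.92)

print("boundary :", boundary_cost, robustness(boundary_cf))
print("confident:", confident_cost, robustness(confident_cf))
```

In this toy setting the higher target costs more distance but yields a counterfactual that survives perturbation far more often, which is the cost, confidence, and robustness trade-off the abstract describes.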

Top-level tags: machine learning model evaluation
Detailed tags: counterfactual explanations recourse confidence tsetlin machines robustness

Target-confidence Recourse Using tSeTlin machines: TRUST


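1️⃣ One-sentence summary

This paper proposes TRUST, a method that lets users specify the desired decision confidence in advance. By combining a Probabilistic Tsetlin Machine with Bayesian optimization, it directly generates recourse that meets that confidence with minimal change. This avoids the fragility of conventional counterfactual explanations, which focus only on the decision boundary, and achieves high robustness and low recourse cost on multiple datasets.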

Source: arXiv: 2606.18832