arXiv submission date: 2026-03-16
📄 Abstract - Robust and Computationally Efficient Linear Contextual Bandits under Adversarial Corruption and Heavy-Tailed Noise

We study linear contextual bandits under adversarial corruption and heavy-tailed noise with finite $(1+\epsilon)$-th moments for some $\epsilon \in (0,1]$. Existing work that addresses both adversarial corruption and heavy-tailed noise relies on a finite variance (i.e., finite second-moment) assumption and suffers from computational inefficiency. We propose a computationally efficient algorithm based on online mirror descent that achieves robustness to both adversarial corruption and heavy-tailed noise. While the existing algorithm incurs $\mathcal{O}(t\log T)$ computational cost, our algorithm reduces this to $\mathcal{O}(1)$ per round. We establish an additive regret bound consisting of a term depending on the $(1+\epsilon)$-moment bound of the noise and a term depending on the total amount of corruption. In particular, when $\epsilon = 1$, our result recovers existing guarantees under finite-variance assumptions. When no corruption is present, it matches the best-known rates for linear contextual bandits with heavy-tailed noise. Moreover, the algorithm requires no prior knowledge of the noise moment bound or the total amount of corruption and still guarantees sublinear regret.
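To make the abstract's high-level description concrete, here is a minimal, hypothetical sketch of one per-round update in the spirit the abstract describes: an online-mirror-descent step (with the Euclidean mirror map, i.e., online gradient descent) on a truncated reward, a standard device for heavy-tailed noise with finite $(1+\epsilon)$-th moments. This is an illustrative assumption, not the paper's actual algorithm or estimator; the function names, the squared loss, and the truncation threshold `tau` are all invented for illustration. Note the update touches only the current round's data, so its cost is constant in $t$, matching the $\mathcal{O}(1)$-per-round claim.

```python
import numpy as np

def truncate(r, tau):
    # Clip the observed reward to [-tau, tau]. Truncation is a common
    # robustness device under heavy-tailed noise (illustrative only;
    # not the paper's exact estimator).
    return np.clip(r, -tau, tau)

def omd_step(theta, x, r_obs, eta, tau):
    # One online-mirror-descent update with the Euclidean mirror map
    # on the squared loss 0.5 * (x^T theta - r)^2 against the
    # truncated reward. Cost is O(d) per round -- constant in t.
    r = truncate(r_obs, tau)
    grad = (x @ theta - r) * x  # gradient of the squared loss in theta
    return theta - eta * grad
```

For example, starting from `theta = [0, 0]` with context `x = [1, 0]`, observed reward `1.0`, and step size `eta = 0.5`, one step moves the estimate to `[0.5, 0]`, halving the prediction error on that context.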

Top-level tags: theory, machine learning, reinforcement learning
Detailed tags: contextual bandits, adversarial robustness, heavy-tailed noise, online mirror descent, regret analysis

Robust and Computationally Efficient Linear Contextual Bandits under Adversarial Corruption and Heavy-Tailed Noise


1️⃣ One-Sentence Summary

This paper proposes a new algorithm based on online mirror descent that can learn and make decisions efficiently in challenging environments where the data is adversarially corrupted and the observation noise follows an extreme (heavy-tailed) distribution. Its computational cost is far lower than that of existing methods, and it requires no prior knowledge of the noise level or the amount of corruption.

Source: arXiv 2603.15596