arXiv submission date: 2026-03-16
📄 Abstract - Point-Identification of a Robust Predictor Under Latent Shift with Imperfect Proxies

Addressing the domain adaptation problem becomes more challenging when distribution shifts across domains stem from latent confounders that affect both covariates and outcomes. Existing proxy-based approaches that address latent shift rely on a strong completeness assumption to uniquely determine (point-identify) a robust predictor. Completeness requires that proxies carry sufficient information about variations in the latent confounders. For imperfect proxies, the mapping from confounders to the space of proxy distributions is non-injective, so multiple latent confounder values can generate the same proxy distribution. This breaks the completeness assumption, and the observed data are then consistent with multiple potential predictors (the predictor is only set-identified). To address this, we introduce latent equivalent classes (LECs). LECs are defined as groups of latent confounders that induce the same conditional proxy distribution. We show that point-identification of the robust predictor remains achievable as long as multiple domains differ sufficiently in how they mix proxy-induced LECs to form the robust predictor. This domain diversity condition is formalized as a cross-domain rank condition on the mixture weights, which is a substantially weaker assumption than completeness. We introduce the Proximal Quasi-Bayesian Active learning (PQAL) framework, which actively queries a minimal set of diverse domains that satisfy this rank condition. PQAL can efficiently recover the point-identified predictor, demonstrates robustness to varying degrees of shift, and outperforms previous methods on synthetic data and the semi-synthetic dSprites dataset.
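
The cross-domain rank condition can be pictured with a small linear-algebra toy. The sketch below is not the paper's PQAL procedure; it assumes, purely for illustration, that each domain's observable quantity is a known weighted mixture of per-LEC values, and that the mixture weights are given. The function name `recover_lec_predictors` and all numbers are hypothetical. Under those assumptions, the per-LEC values are uniquely recoverable only when the domain-by-LEC weight matrix has full column rank.

```python
import numpy as np


def recover_lec_predictors(weights, domain_values):
    """Toy illustration (hypothetical helper, not from the paper).

    weights:        (D, K) array, row d holds domain d's mixture weights over K LECs.
    domain_values:  (D,) array, the mixture observed in each domain.
    Returns the (K,) per-LEC values if the rank condition holds.
    """
    weights = np.asarray(weights, dtype=float)
    domain_values = np.asarray(domain_values, dtype=float)
    num_lecs = weights.shape[1]

    # Rank condition: the domains must mix the LECs in K linearly independent ways.
    if np.linalg.matrix_rank(weights) < num_lecs:
        raise ValueError("rank condition fails: per-LEC values are only set-identified")

    # Full column rank makes the least-squares solution unique.
    solution, *_ = np.linalg.lstsq(weights, domain_values, rcond=None)
    return solution


# Assumed ground truth for the toy: one value per LEC.
true_values = np.array([1.0, -2.0, 0.5])

# Three domains that mix the LECs differently (rank 3).
diverse_weights = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
])
recovered = recover_lec_predictors(diverse_weights, diverse_weights @ true_values)

# Three domains with identical mixing (rank 1): many solutions fit the data.
degenerate_weights = np.array([
    [0.5, 0.3, 0.2],
    [0.5, 0.3, 0.2],
    [0.5, 0.3, 0.2],
])
try:
    recover_lec_predictors(degenerate_weights, degenerate_weights @ true_values)
    set_identified = False
except ValueError:
    set_identified = True
```

In this toy, adding more domains with the same mixing pattern never helps, whereas a few domains with sufficiently different weights suffice, which mirrors the abstract's motivation for querying a minimal set of diverse domains.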

Top-level tags: theory, machine learning, model evaluation
Detailed tags: domain adaptation, latent shift, causal inference, point identification, proxy variables

Point-Identification of a Robust Predictor Under Latent Shift with Imperfect Proxies


1️⃣ One-sentence summary

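This paper proposes a new method that, in the presence of latent confounders and imperfect proxy variables, uniquely determines (point-identifies) a robust predictor by exploiting differences across multiple domains in how proxy-induced latent equivalent classes are mixed, and it develops an active learning framework to do so efficiently.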

Source: arXiv: 2603.15158