arXiv submission date: 2026-05-13
📄 Abstract - Bridging Domain Gaps with Target-Aligned Generation for Offline Reinforcement Learning

Cross-domain offline reinforcement learning aims to adapt a policy from a source domain to a target domain using only pre-collected datasets, where environment dynamics may differ. A key challenge is to leverage source data while reducing distributional mismatch, particularly when the target dataset is extremely limited. To address this, we propose Target-aligned Coverage Expansion (TCE), a framework that decides how source data should be used, either by directly incorporating target-near transitions or by expanding state coverage through target-aligned generation, guided by theoretical analysis. TCE builds on a dual score-based generative model to synthesize target-consistent transitions over an expanded state region. Extensive experiments across diverse cross-domain environments show that TCE consistently outperforms state-of-the-art cross-domain offline RL baselines.
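The abstract describes a routing decision: source transitions that are close to the target dynamics are used directly, while others are handed to target-aligned generation. A minimal sketch of such a routing rule, assuming a hypothetical target dynamics predictor and a Euclidean gap threshold (both illustration-only assumptions, not the paper's actual criterion):

```python
import numpy as np

def route_source_transitions(source_next, target_model_next, tau=0.1):
    """Hypothetical routing in the spirit of TCE: keep a source transition
    directly when its next state is close to the target-dynamics prediction;
    otherwise mark it for target-aligned re-synthesis by the generative model.
    The distance metric, threshold tau, and target dynamics model are all
    assumptions for illustration."""
    # Per-transition dynamics gap between source outcome and target prediction
    gap = np.linalg.norm(np.asarray(source_next) - np.asarray(target_model_next), axis=-1)
    # True -> target-near, use directly; False -> route to generation
    return gap <= tau

# Usage: two source transitions; the first is target-near, the second is not.
src = np.array([[0.00, 0.0], [1.0, 0.0]])
tgt = np.array([[0.05, 0.0], [0.0, 0.0]])
print(route_source_transitions(src, tgt, tau=0.1))  # → [ True False]
```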

Top-level tags: reinforcement learning, machine learning
Detailed tags: offline reinforcement learning, cross-domain adaptation, generative model, distributional mismatch, coverage expansion

Bridging Domain Gaps with Target-Aligned Generation for Offline Reinforcement Learning


1️⃣ One-sentence summary

This paper proposes a framework named TCE that, guided by theoretical analysis, uses target-aligned generation to expand state coverage and exploit source-domain data intelligently when target-domain data is extremely limited, thereby addressing the policy-adaptation difficulty in cross-domain offline reinforcement learning caused by differing environment dynamics.

Source: arXiv:2605.13054