Grounding LTL Tasks in Sub-Symbolic RL Environments for Zero-Shot Generalization
1️⃣ One-Sentence Summary
This paper proposes a new method that lets a reinforcement-learning agent learn to understand and execute complex temporally-extended tasks described in a logical language directly from raw visual observations, without any predefined mapping between observations and symbols, and to generalize zero-shot to new tasks.
In this work we address the problem of training a Reinforcement Learning agent to follow multiple temporally-extended instructions expressed in Linear Temporal Logic in sub-symbolic environments. Previous multi-task work has mostly relied on knowledge of the mapping between raw observations and symbols appearing in the formulae. We drop this unrealistic assumption by jointly training a multi-task policy and a symbol grounder with the same experience. The symbol grounder is trained only from raw observations and sparse rewards via Neural Reward Machines in a semi-supervised fashion. Experiments on vision-based environments show that our method achieves performance comparable to using the true symbol grounding and significantly outperforms state-of-the-art methods for sub-symbolic environments.
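To make the joint-training idea concrete, here is a minimal PyTorch sketch of how a symbol grounder and a multi-task policy might be trained from the same episode using only a sparse terminal reward. This is not the authors' implementation: all names, dimensions, and the soft reward-machine transition tensor are illustrative assumptions, and the random tensor `T` merely stands in for the transition structure that a Neural Reward Machine derives from the LTL formula.

```python
import torch
import torch.nn as nn

N_SYMBOLS = 4    # propositions in the LTL formulae (assumed)
N_RM_STATES = 5  # reward-machine states for the current task (assumed)
OBS_DIM = 64     # flattened raw-observation size (assumed)
N_ACTIONS = 6

class SymbolGrounder(nn.Module):
    """Maps a raw observation to a distribution over symbols (hypothetical architecture)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, N_SYMBOLS))
    def forward(self, obs):
        return torch.softmax(self.net(obs), dim=-1)

class Policy(nn.Module):
    """Multi-task policy conditioned on the observation and the belief over RM states."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM + N_RM_STATES, 128), nn.ReLU(),
                                 nn.Linear(128, N_ACTIONS))
    def forward(self, obs, rm_belief):
        return torch.log_softmax(self.net(torch.cat([obs, rm_belief], dim=-1)), dim=-1)

# Stand-in for the Neural Reward Machine: a differentiable transition tensor
# T[u, p] -> distribution over next RM states (in the paper this comes from the formula).
T = torch.rand(N_RM_STATES, N_SYMBOLS, N_RM_STATES)
T = T / T.sum(-1, keepdim=True)

def rm_step(belief, symbol_probs):
    # Propagate the RM-state belief through the soft transition tensor.
    return torch.einsum('bu,bp,ups->bs', belief, symbol_probs, T)

grounder, policy = SymbolGrounder(), Policy()
opt = torch.optim.Adam(list(grounder.parameters()) + list(policy.parameters()), lr=3e-4)

# One illustrative update on a fake 8-step episode: the grounder is supervised only by
# the sparse terminal reward (semi-supervised), the policy by a REINFORCE-style loss.
obs_seq = torch.randn(8, OBS_DIM)
actions = torch.randint(0, N_ACTIONS, (8,))
sparse_reward = torch.tensor(1.0)  # 1 iff the LTL task was satisfied

belief = torch.zeros(1, N_RM_STATES)
belief[0, 0] = 1.0  # start in the initial RM state
log_probs = []
for t in range(8):
    obs = obs_seq[t:t + 1]
    belief = rm_step(belief, grounder(obs))
    log_probs.append(policy(obs, belief)[0, actions[t]])

# Grounding loss: the belief mass on the accepting RM state (index -1, assumed)
# should match the sparse reward; both networks learn from the same experience.
grounding_loss = (belief[0, -1] - sparse_reward) ** 2
policy_loss = -sparse_reward * torch.stack(log_probs).sum()
opt.zero_grad()
(grounding_loss + policy_loss).backward()
opt.step()
```

Because the RM belief fed to the policy is produced by the grounder, gradients from both losses flow through the grounder, which is one plausible reading of "jointly training a multi-task policy and a symbol grounder with the same experience".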
Source: arXiv: 2602.09761