arXiv submission date: 2026-06-25
📄 Abstract - Training Observable Control Policies to Expose Agent State Through Actions

Physical or operational constraints often impose communications limitations on autonomous agents. Such limitations complicate monitoring or multiagent coordination. Even when strong communications are absent, some information may still be available. The remainder of the relevant agent state may be reconstructed via estimation. The actions taken by an agent are a potential source of information -- as the agent interacts with the environment, these actions may be observed even in the absence of explicit communication. We investigate using actions to estimate the state of an agent, using reinforcement learning to develop policies which make the estimation problem more tractable. Policy observability is encouraged through the training reward and is analyzed using simulation of the trained agent. In an aircraft tracking problem a policy with enhanced observability is found that has minimal impact on nominal task performance.
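
To make the idea of encouraging observability through the training reward concrete, here is a minimal sketch. The abstract does not state the reward's actual form, so everything here is an assumption for illustration: the function name `shaped_reward`, the use of an observer's state-estimation error as the penalty, and the `weight` parameter are all hypothetical.

```python
import math

def shaped_reward(task_reward, true_state, estimated_state, weight=0.1):
    """Hypothetical shaped reward: task reward minus a penalty on estimation error.

    true_state      -- the agent's actual state (sequence of floats)
    estimated_state -- what an observer reconstructs from the agent's actions
    weight          -- assumed trade-off between task performance and observability
    """
    # Euclidean distance between the true state and the observer's estimate.
    error = math.dist(true_state, estimated_state)
    # A policy whose actions make the state easy to estimate is penalized less.
    return task_reward - weight * error

# Same task reward, but the more accurate estimate yields the higher shaped reward.
good = shaped_reward(1.0, [0.0, 0.0], [0.0, 0.0])
poor = shaped_reward(1.0, [0.0, 0.0], [3.0, 4.0])
```

With a small `weight`, the penalty nudges the policy toward informative actions while leaving the task reward dominant, which is one plausible reading of a policy with enhanced observability and minimal impact on nominal task performance.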

Top-level tags: reinforcement learning agents
Detailed tags: observability state estimation policy learning multiagent systems

Training Observable Control Policies to Expose Agent State Through Actions


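1️⃣ One-sentence summary

This paper proposes a reinforcement learning approach that shapes the training reward so that the actions an agent takes while performing its task reveal its internal state more clearly. This helps an external observer or cooperating system infer the agent's state accurately when direct communication is unavailable, with little impact on performance of the original task.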

Source: arXiv: 2606.27609