Learning What Matters Now: Dynamic Preference Inference under Contextual Shifts
1️⃣ One-Sentence Summary
This paper proposes a new method called "Dynamic Preference Inference" that lets AI systems, much like humans, dynamically adjust their goal priorities as circumstances change, so they perform better when task objectives shift abruptly.
Humans often juggle multiple, sometimes conflicting objectives and shift their priorities as circumstances change, rather than following a fixed objective function. In contrast, most computational decision-making and multi-objective RL methods assume static preference weights or a known scalar reward. In this work, we study the sequential decision-making problem in which these preference weights are unobserved latent variables that drift with context. Specifically, we propose Dynamic Preference Inference (DPI), a cognitively inspired framework in which an agent maintains a probabilistic belief over preference weights, updates this belief from recent interaction, and conditions its policy on the inferred preferences. We instantiate DPI as a variational preference inference module trained jointly with a preference-conditioned actor-critic, using vector-valued returns as evidence about latent trade-offs. In queueing, maze, and multi-objective continuous-control environments with event-driven changes in objectives, DPI adapts its inferred preferences to new regimes and achieves higher post-shift performance than fixed-weight and heuristic envelope baselines.
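To make the core idea concrete, here is a minimal sketch of the DPI loop described above: maintain a belief over latent preference weights, update it from vector-valued returns, and scalarize rewards with the inferred weights. This is a hypothetical toy illustration (a softmax point estimate with a heuristic update rule), not the paper's variational inference module or actor-critic; the class and method names are invented for this example.

```python
import math

class DynamicPreferenceBelief:
    """Toy point-estimate belief over latent preference weights.
    A hypothetical sketch of the DPI idea, not the paper's
    variational preference inference module."""

    def __init__(self, n_objectives, lr=0.5):
        self.logits = [0.0] * n_objectives  # uniform prior over objectives
        self.lr = lr

    def weights(self):
        # Softmax of the logits gives the current inferred preference weights.
        m = max(self.logits)
        exps = [math.exp(l - m) for l in self.logits]
        s = sum(exps)
        return [e / s for e in exps]

    def update(self, vector_return):
        # Heuristic evidence rule: treat above-average per-objective returns
        # as a signal that the current regime emphasizes those objectives.
        mean = sum(vector_return) / len(vector_return)
        self.logits = [l + self.lr * (r - mean)
                       for l, r in zip(self.logits, vector_return)]

    def scalarize(self, vector_return):
        # A preference-conditioned policy/critic would consume these weights.
        return sum(w * r for w, r in zip(self.weights(), vector_return))

# Event-driven objective shift, as in the paper's experimental setting:
belief = DynamicPreferenceBelief(2)
for _ in range(10):
    belief.update([1.0, 0.0])   # regime A: objective 0 pays off
# belief.weights() now strongly favors objective 0
for _ in range(20):
    belief.update([0.0, 1.0])   # shift: objective 1 pays off
# the inferred preferences adapt and now favor objective 1
```

The point of the sketch is the adaptation behavior: after the simulated shift, the inferred weights drift toward the newly rewarded objective, which is what lets a preference-conditioned policy recover post-shift performance without a hand-set weight schedule.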
Source: arXiv: 2603.22813