arXiv submission date: 2026-02-23
📄 Abstract - Learning Physical Principles from Interaction: Self-Evolving Planning via Test-Time Memory

Reliable object manipulation requires understanding physical properties that vary across objects and environments. Vision-language model (VLM) planners can reason about friction and stability in general terms; however, they often cannot predict how a specific ball will roll on a particular surface or which stone will provide a stable foundation without direct experience. We present PhysMem, a memory framework that enables VLM robot planners to learn physical principles from interaction at test time, without updating model parameters. The system records experiences, generates candidate hypotheses, and verifies them through targeted interaction before promoting validated knowledge to guide future decisions. A central design choice is verification before application: the system tests hypotheses against new observations rather than applying retrieved experience directly, reducing rigid reliance on prior experience when physical conditions change. We evaluate PhysMem on three real-world manipulation tasks and simulation benchmarks across four VLM backbones. On a controlled brick insertion task, principled abstraction achieves 76% success compared to 23% for direct experience retrieval, and real-world experiments show consistent improvement over 30-minute deployment sessions.
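The abstract describes a test-time loop: record an experience, generate candidate hypotheses, verify them through targeted interaction, and only then promote validated knowledge. A minimal sketch of that verify-before-apply loop is below; all class names, method names, and the promotion threshold are invented for illustration and are not from the paper.

```python
# Hypothetical sketch of the verify-before-apply memory loop described in
# the abstract. Names and thresholds are assumptions, not the paper's API.
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    statement: str      # candidate physical principle, e.g. "ball A rolls farther on tile"
    support: int = 0    # interactions consistent with the hypothesis
    trials: int = 0     # total verification attempts

    def confidence(self) -> float:
        return self.support / self.trials if self.trials else 0.0


@dataclass
class PhysMemStore:
    candidates: list = field(default_factory=list)  # unverified hypotheses
    validated: list = field(default_factory=list)   # promoted knowledge
    threshold: float = 0.8                          # promotion threshold (assumed)
    min_trials: int = 3                             # minimum verifications (assumed)

    def propose(self, statement: str) -> Hypothesis:
        h = Hypothesis(statement)
        self.candidates.append(h)
        return h

    def verify(self, h: Hypothesis, observation_consistent: bool) -> None:
        # Test against a new observation rather than applying retrieved
        # experience directly: verification before application.
        h.trials += 1
        if observation_consistent:
            h.support += 1
        if h.trials >= self.min_trials and h.confidence() >= self.threshold:
            self.candidates.remove(h)
            self.validated.append(h)
```

The key design point the abstract emphasizes is that a hypothesis never guides planning until it has survived targeted interaction, which is what keeps the system from rigidly reusing stale experience when physical conditions change.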

Top-level tags: robotics agents model evaluation
Detailed tags: physical reasoning test-time learning memory framework vision-language models robot manipulation

Learning Physical Principles from Interaction: Self-Evolving Planning via Test-Time Memory


1️⃣ One-sentence summary

This paper presents PhysMem, a memory framework that lets a robot learn concrete physical regularities through hands-on trial and error during deployment, much as a person would (for example, how different balls roll or which stones are stable). Validated experience is promoted into a knowledge base, so the planner makes more flexible and reliable decisions when conditions change, rather than rigidly replaying past experience.

Source: arXiv: 2602.20323