Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing
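1️⃣ One-sentence summary
By open-sourcing a large amount of gameplay data and models, this paper shows that, in behavior cloning, scaling up model size and training data together not only improves the model's ability to play 3D games but also leads it to learn policies with stronger causal reasoning.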
Behavior cloning is enjoying a resurgence in popularity as scaling both model and data sizes proves to provide a strong starting point for many tasks of interest. In this work, we introduce an open recipe for training a video game playing foundation model designed for real-time inference on a consumer GPU. We release all data (8300+ hours of high-quality human gameplay), training and inference code, and pretrained checkpoints under an open license. We show that our best model is capable of playing a variety of 3D video games at a level competitive with human play. We use this recipe to systematically examine the scaling laws of behavior cloning to understand how the model's performance and causal reasoning vary with model and data scale. We first show in a simple toy problem that, for some types of causal reasoning, increasing both the amount of training data and the depth of the network results in the model learning a more causal policy. We then systematically study how causality varies with the number of parameters (and depth) and training steps in scaled models of up to 1.2 billion parameters, and we find similar scaling results to what we observe in the toy problem.
Source: arXiv: 2601.04575