arXiv submission date: 2026-03-30
📄 Abstract - Beyond Scanpaths: Graph-Based Gaze Simulation in Dynamic Scenes

Accurately modelling human attention is essential for numerous computer vision applications, particularly in the domain of automotive safety. Existing methods typically collapse gaze into saliency maps or scanpaths, treating gaze dynamics only implicitly. We instead formulate gaze modelling as an autoregressive dynamical system and explicitly unroll raw gaze trajectories over time, conditioned on both gaze history and the evolving environment. Driving scenes are represented as gaze-centric graphs processed by the Affinity Relation Transformer (ART), a heterogeneous graph transformer that models interactions between driver gaze, traffic objects, and road structure. We further introduce the Object Density Network (ODN) to predict next-step gaze distributions, capturing the stochastic and object-centric nature of attentional shifts in complex environments. We also release Focus100, a new dataset of raw gaze data from 30 participants viewing egocentric driving footage. Trained directly on raw gaze, without fixation filtering, our unified approach produces more natural gaze trajectories, scanpath dynamics, and saliency maps than existing attention models, offering valuable insights for the temporal modelling of human attention in dynamic environments.
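The abstract's autoregressive formulation — unrolling raw gaze one step at a time, conditioned on gaze history, with a density network supplying a stochastic next-step distribution — can be sketched in miniature. The code below is a toy stand-in, not the paper's ART/ODN implementation: `predict_mixture` replaces the learned Object Density Network with a hand-set Gaussian mixture centred near the last gaze point, and scene conditioning is omitted entirely. All function names and parameters are illustrative assumptions.

```python
import numpy as np

def predict_mixture(history, n_components=3):
    """Toy stand-in for the Object Density Network.

    In the paper this would be a learned network conditioned on the
    gaze-centric scene graph; here we simply place mixture components
    around the most recent gaze point (an assumption for illustration).
    Gaze coordinates are normalised to [0, 1] x [0, 1].
    """
    last = history[-1]                                   # shape (2,)
    offsets = np.linspace(-0.05, 0.05, n_components)     # shape (k,)
    means = last + offsets[:, None]                      # shape (k, 2)
    weights = np.full(n_components, 1.0 / n_components)  # uniform mixture
    sigma = 0.02                                         # per-axis std dev
    return weights, means, sigma

def rollout(history, steps, rng):
    """Autoregressively unroll a gaze trajectory for `steps` steps."""
    history = [np.asarray(p, dtype=float) for p in history]
    for _ in range(steps):
        weights, means, sigma = predict_mixture(history)
        k = rng.choice(len(weights), p=weights)   # pick a mixture component
        nxt = rng.normal(means[k], sigma)         # sample next gaze point
        history.append(np.clip(nxt, 0.0, 1.0))    # keep inside the frame
    return np.stack(history)

rng = np.random.default_rng(0)
traj = rollout([[0.5, 0.5], [0.52, 0.5]], steps=10, rng=rng)
print(traj.shape)  # 2 seed points + 10 generated steps -> (12, 2)
```

Because each sampled point is appended to the history before the next prediction, the loop is a dynamical system in the abstract's sense: stochasticity enters through the mixture sampling, and replacing `predict_mixture` with a learned, scene-conditioned network is where the paper's graph transformer would slot in.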

Top-level tags: computer vision, model training, model evaluation
Detailed tags: gaze simulation, graph transformer, attention modeling, driving dataset, dynamic scenes

Beyond Scanpaths: Graph-Based Gaze Simulation in Dynamic Scenes


1️⃣ One-sentence summary

This paper proposes a new gaze-modelling approach that treats a driver's gaze trajectory in dynamic driving scenes as a dynamical system shaped by both gaze history and the environment, uses a graph transformer together with an object density network to predict more natural gaze movements, and releases a new raw-gaze dataset.

Source: arXiv:2603.28319