arXiv submission date: 2026-06-17
📄 Abstract - Multi-Modal Hyper-Graph Fusion for Low-Light Crowd Counting

Crowd counting is a fundamental task in computer vision. However, crowd counting in low-light environments remains largely underexplored, despite its practical importance in the real world. Existing methods mainly focus on well-lit scenes or rely on single-modality Red-Green-Blue (RGB) representations, which often become unreliable under extreme darkness and complex non-uniform illumination. To handle this problem, we construct three new low-light crowd counting benchmarks, which consist of two synthetic datasets, SHA_Dark and SHB_Dark, and a real-world benchmark LC-Crowd (Low-light Crowd Dataset). Inspired by Retinex-based physical modeling, we introduce depth and Canny edge cues as complementary geometric and structural priors to enhance the intrinsic reflectance representation under low-light conditions. We propose a Multi-Modal Hyper-Graph Fusion module, which formulates RGB appearance, depth geometry, and edge structure cues as nodes in a unified hyper-graph and explicitly captures their high-order complementary relationships via dynamic hyperedge construction and message passing. Furthermore, to adaptively allocate computation in dense prediction, we propose a Deformable Rectangular Sparse Attention (DRSA) module, which concentrates computation on informative regions through anchor-aware estimation and adaptive rectangular window modeling. Based on these designs, we develop a unified Low-Light Counting Network (LCNet) for robust low-light crowd counting. Extensive experiments on three benchmarks demonstrate that the proposed method achieves the best overall performance against existing state-of-the-art (SOTA) methods. The code is in the supplementary material. The datasets will be made public upon acceptance.
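
The abstract does not specify how hyperedges are built or how messages are passed. As a rough illustration of the general idea only, the sketch below builds one hyperedge per node from its k nearest neighbours in feature space (one common way to make hyperedges "dynamic") and then runs a single node-to-hyperedge-to-node round of mean aggregation. The function names `knn_hyperedges` and `hypergraph_step`, the k-NN rule, and the unweighted mean are all assumptions made for this example and are not taken from LCNet.

```python
import math

def knn_hyperedges(feats, k):
    """Assumed rule: one hyperedge per node, holding the node and its k nearest neighbours."""
    edges = []
    for i, fi in enumerate(feats):
        others = [j for j in range(len(feats)) if j != i]
        others.sort(key=lambda j: math.dist(fi, feats[j]))
        edges.append(sorted([i] + others[:k]))
    return edges

def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def hypergraph_step(feats, edges):
    """One round of message passing: nodes -> hyperedges -> nodes (plain means)."""
    # Stage 1: each hyperedge summarises its member nodes.
    edge_feats = [mean([feats[j] for j in members]) for members in edges]
    # Stage 2: each node averages the hyperedges it belongs to.
    out = []
    for i in range(len(feats)):
        incident = [edge_feats[e] for e, members in enumerate(edges) if i in members]
        out.append(mean(incident))
    return out

# Toy features standing in for RGB, depth, and edge cues (hypothetical values).
feats = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]]
edges = knn_hyperedges(feats, k=1)
fused = hypergraph_step(feats, edges)
```

Because a hyperedge can join more than two nodes, a single update can mix all three cues at once, which is what separates this from pairwise graph fusion. A trained module would replace the plain means with learned weights.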

Top-level tags: computer vision data benchmark
Detailed tags: crowd counting low-light multi-modal fusion hyper-graph attention

Multi-Modal Hyper-Graph Fusion for Low-Light Crowd Counting


1️⃣ One-Sentence Summary

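This paper addresses the difficulty of crowd counting in low-light environments. It constructs three new low-light datasets and proposes a multi-modal hyper-graph network that fuses RGB images, depth information, and edge structure. Using dynamic hyperedge construction and an adaptive sparse attention mechanism, it significantly improves counting accuracy under extreme darkness and complex illumination.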

Source: arXiv: 2606.18566