arXiv submission date: 2026-03-04
📄 Abstract - Efficient Point Cloud Processing with High-Dimensional Positional Encoding and Non-Local MLPs

Multi-Layer Perceptron (MLP) models are the foundation of contemporary point cloud processing. However, their complex network architectures obscure the source of their strength and limit the application of these models. In this article, we develop a two-stage abstraction and refinement (ABS-REF) view for modular feature extraction in point cloud processing. This view elucidates that whereas the early models focused on ABS stages, the more recent techniques devise sophisticated REF stages to attain performance advantages. Then, we propose a High-dimensional Positional Encoding (HPE) module to explicitly utilize intrinsic positional information, extending the "positional encoding" concept from Transformer literature. HPE can be readily deployed in MLP-based architectures and is compatible with transformer-based methods. Within our ABS-REF view, we rethink local aggregation in MLP-based methods and propose replacing the time-consuming local MLP operations that are used to capture local relationships among neighbors. In their place, we use non-local MLPs for efficient non-local information updates, combined with the proposed HPE for effective local information representation. We leverage our modules to develop HPENets, a suite of MLP networks that follow the ABS-REF paradigm, incorporating a scalable HPE-based REF stage. Extensive experiments on seven public datasets across four different tasks show that HPENets deliver a strong balance between efficiency and effectiveness. Notably, HPENet surpasses PointNeXt, a strong MLP-based counterpart, by 1.1% mAcc, 4.0% mIoU, 1.8% mIoU, and 0.2% Cls. mIoU, with only 50.0%, 21.5%, 23.1%, and 44.4% of FLOPs on ScanObjectNN, S3DIS, ScanNet, and ShapeNetPart, respectively. Source code is available at this https URL.

Top-level tags: computer vision, model training, machine learning
Detailed tags: point cloud processing, positional encoding, MLP networks, non-local operations, 3D vision

Efficient Point Cloud Processing with High-Dimensional Positional Encoding and Non-Local MLPs


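1️⃣ One-sentence summary

This paper proposes a new perspective and two core modules: it uses high-dimensional positional encoding to explicitly exploit the positional information in point clouds, and it replaces complex local operations with efficient non-local MLPs, yielding point cloud processing networks that achieve stronger performance with less computation across multiple tasks.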
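
To make the idea more concrete, here is a minimal Python sketch of how a positional encoding of neighbor offsets might be combined with a per-point MLP update. It is not the paper's HPE module or HPENet code. The sinusoidal form, the max-pooling over neighbors, the additive fusion, and all function names (`sinusoidal_encoding`, `encode_neighborhood`, `pointwise_linear`, `refine`) are assumptions chosen for illustration only.

```python
import math

def sinusoidal_encoding(offset, num_freqs=4):
    """Map a 3D offset (dx, dy, dz) to 3 * 2 * num_freqs sin/cos values.

    Assumption: a Transformer-style sinusoidal encoding; the paper's HPE
    may be defined differently.
    """
    feats = []
    for value in offset:
        for k in range(num_freqs):
            freq = 2.0 ** k
            feats.append(math.sin(freq * value))
            feats.append(math.cos(freq * value))
    return feats

def encode_neighborhood(center, neighbors, num_freqs=4):
    """Max-pool the encodings of relative neighbor positions into one vector."""
    encodings = [
        sinusoidal_encoding([n - c for n, c in zip(nb, center)], num_freqs)
        for nb in neighbors
    ]
    return [max(column) for column in zip(*encodings)]

def pointwise_linear(feature, weight, bias):
    """Shared linear layer + ReLU on one point's features.

    No neighbor features are gathered here, which is the sense in which
    this sketch treats the MLP as "non-local" (an assumption).
    """
    return [
        max(0.0, sum(w * x for w, x in zip(row, feature)) + b)
        for row, b in zip(weight, bias)
    ]

def refine(features, local_descriptors, weight, bias):
    """Hypothetical refinement step: per-point update plus positional descriptor."""
    refined = []
    for feat, desc in zip(features, local_descriptors):
        updated = pointwise_linear(feat, weight, bias)
        refined.append([u + d for u, d in zip(updated, desc)])
    return refined
```

In this sketch the only per-neighbor work is the parameter-free encoding and pooling, while the learned layer runs once per point. That mirrors the efficiency argument in the abstract, though the actual cost savings reported there come from the authors' own design.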

Source: arXiv: 2603.04099