arXiv submission date: 2026-02-09
📄 Abstract - PRISM-XR: Empowering Privacy-Aware XR Collaboration with Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) enhance collaboration in Extended Reality (XR) environments by enabling flexible object and animation creation through the combination of natural language and visual inputs. However, visual data captured by XR headsets includes real-world backgrounds that may contain irrelevant or sensitive user information, such as credit cards left on a table or the facial identities of other users. Uploading these frames to cloud-based MLLMs poses serious privacy risks, particularly when such data is processed without explicit user consent. Additionally, existing colocation and synchronization mechanisms in commercial XR APIs rely on time-consuming, privacy-invasive environment scanning and struggle to adapt to the highly dynamic nature of MLLM-integrated XR environments. In this paper, we propose PRISM-XR, a novel framework that facilitates multi-user collaboration in XR by providing privacy-aware MLLM integration. PRISM-XR employs intelligent frame preprocessing on an edge server to filter sensitive data and remove irrelevant context before communicating with cloud generative AI models. Additionally, we introduce a lightweight registration process and a fully customizable content-sharing mechanism to enable efficient, accurate, and privacy-preserving content synchronization among users. Our numerical evaluation results indicate that the proposed platform achieves nearly 90% accuracy in fulfilling user requests and a registration time under 0.27 seconds while keeping spatial inconsistencies below 3.5 cm. Furthermore, we conducted an IRB-approved user study with 28 participants, demonstrating that our system could automatically filter highly sensitive objects in over 90% of scenarios while maintaining strong overall usability.

Top-level tags: multi-modal systems, agents
Detailed tags: extended reality, privacy preservation, edge computing, collaboration, multimodal LLMs

PRISM-XR: Empowering Privacy-Aware XR Collaboration with Multimodal Large Language Models


1️⃣ One-Sentence Summary

This paper proposes a new framework called PRISM-XR, which addresses the serious privacy-leakage and dynamic-adaptation challenges of MLLM-assisted collaboration in extended reality environments by intelligently filtering sensitive information on an edge server and designing a lightweight synchronization mechanism.

Source: arXiv 2602.10154