← 返回列表

arXiv 提交日期: 2026-02-03

📄 Abstract - WebSentinel: Detecting and Localizing Prompt Injection Attacks for Web Agents

Prompt injection attacks manipulate webpage content to cause web agents to execute attacker-specified tasks instead of the user's intended ones. Existing methods for detecting and localizing such attacks achieve limited effectiveness, as their underlying assumptions often do not hold in the web-agent setting. In this work, we propose WebSentinel, a two-step approach for detecting and localizing prompt injection attacks in webpages. Given a webpage, Step I extracts \emph{segments of interest} that may be contaminated, and Step II evaluates each segment by checking its consistency with the webpage content as context. We show that WebSentinel is highly effective, substantially outperforming baseline methods across multiple datasets of both contaminated and clean webpages that we collected. Our code is available at: this https URL.

顶级标签: llm agents systems

WebSentinel：针对网络代理的提示注入攻击检测与定位 / WebSentinel: Detecting and Localizing Prompt Injection Attacks for Web Agents

1️⃣ 一句话总结

这篇论文提出了一种名为WebSentinel的两阶段方法，能有效检测并定位网页中旨在操控网络代理执行恶意任务的提示注入攻击，其性能显著优于现有基线方法。

👋 没兴趣 ☆ 感兴趣 📌 待读

打开原文 PDF

源自 arXiv: 2602.03792

← 返回列表

菜单

AI 帮我研读全文

1️⃣ 一句话总结

密码管理

设置密码

修改密码

移除密码

菜单

AI 帮我研读全文

1️⃣ 一句话总结

获取最新论文摘要