FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments
1️⃣ One-Sentence Summary
This paper proposes FinVault, the first execution-grounded safety benchmark for financial agents. By simulating realistic financial operational scenarios and vulnerability tests, it finds that the safety guardrails of today's mainstream AI models are of limited effectiveness against attacks in financial settings, underscoring the need to develop stronger finance-specific defenses.
Financial agents powered by large language models (LLMs) are increasingly deployed for investment analysis, risk assessment, and automated decision-making, where their ability to plan, invoke tools, and manipulate mutable state introduces new security risks in high-stakes and highly regulated financial environments. However, existing safety evaluations largely focus on language-model-level content compliance or abstract agent settings, failing to capture execution-grounded risks arising from real operational workflows and state-changing actions. To bridge this gap, we propose FinVault, the first execution-grounded security benchmark for financial agents, comprising 31 regulatory case-driven sandbox scenarios with state-writable databases and explicit compliance constraints, together with 107 real-world vulnerabilities and 963 test cases that systematically cover prompt injection, jailbreaking, financially adapted attacks, as well as benign inputs for false-positive evaluation. Experimental results reveal that existing defense mechanisms remain ineffective in realistic financial agent settings, with average attack success rates (ASR) still reaching up to 50.0% on state-of-the-art models and remaining non-negligible even for the most robust systems (ASR of 6.7%), highlighting the limited transferability of current safety designs and the need for stronger financial-specific defenses. Our code can be found at this https URL.
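Since the abstract names attack success rate (ASR) as the headline metric, here is a minimal sketch of how per-category ASR might be tallied over execution-grounded test cases, with benign inputs tracked separately for false-positive evaluation. The `TestResult` record, category labels, and field names are illustrative assumptions, not FinVault's actual schema or harness.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class TestResult:
    """One judged test-case outcome (hypothetical schema, not FinVault's)."""
    category: str   # e.g. "prompt_injection", "jailbreak", "financial_attack", "benign"
    violated: bool  # True if the agent executed a forbidden state-changing action;
                    # for "benign" cases, True instead marks a false refusal

def per_category_rates(results: list[TestResult]) -> dict[str, float]:
    """Attack success rate per attack category; for 'benign' the same
    ratio reads as a false-positive rate rather than an ASR."""
    hits: Counter[str] = Counter()
    totals: Counter[str] = Counter()
    for r in results:
        totals[r.category] += 1
        hits[r.category] += r.violated  # bool counts as 0/1
    return {cat: hits[cat] / totals[cat] for cat in totals}

if __name__ == "__main__":
    demo = [
        TestResult("prompt_injection", True),
        TestResult("prompt_injection", False),
        TestResult("benign", False),
    ]
    print(per_category_rates(demo))  # {'prompt_injection': 0.5, 'benign': 0.0}
```

A benchmark run in this style would collect one such record per test case (963 in FinVault's case) and report the per-category rates alongside an overall average per model.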
Source: arXiv: 2601.07853