Data Flow Control: Data Safety Policies for AI Agents

📄 Abstract - Data Flow Control: Data Safety Policies for AI Agents

Agents increasingly generate SQL, orchestrate pipelines, and automate data analysis on behalf of users. While recent work improves query correctness, correctness is not safety. A query may be semantically valid yet violate regulatory, privacy, or business constraints that govern how data may be combined and released. We argue that enforcing such constraints is fundamentally a data infrastructure problem. This paper introduces Data Flow Control (DFC), a framework to declaratively specify and guarantee policy enforcement over tuple-level data flows within a DBMS query. A key challenge is defining a policy language that is optimizer-invariant yet efficient to enforce at scale. We formalize data safety as aggregate predicates over provenance monomials and present Passant, a portable query rewriting layer that enforces DFC policies without materializing provenance. Across five DBMS engines -- DuckDB, Umbra, PostgreSQL, DataFusion, and SQL Server -- Passant achieves ~0% overhead and outperforms alternatives by orders of magnitude. As a result, Data Flow Control is the first step towards moving data safety from prompts and post-hoc checks into the data infrastructure. Data Flow Control is available open source at this https URL.

1️⃣ One-Sentence Summary

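This paper proposes Data Flow Control (DFC), a framework that automatically enforces complex compliance rules (such as privacy and business constraints) directly inside the database query system, without manual checks or heavy computational overhead, giving AI agents built-in safety guarantees when they process data.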
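
The general idea of enforcing a policy by rewriting the query, rather than checking results afterwards, can be illustrated with a minimal sketch. This is not Passant's rewriting algorithm or the DFC policy language, neither of which is described in the abstract. The table `visits`, the helper `rewrite_with_policy`, and the threshold `MIN_CONTRIBUTORS` are hypothetical, and the policy is an assumed one: every released group must aggregate over at least k distinct patients.

```python
import sqlite3

# Hypothetical policy threshold: each output row must be derived
# from at least this many distinct patients.
MIN_CONTRIBUTORS = 3


def rewrite_with_policy(group_query: str, key_column: str, k: int) -> str:
    """Append a HAVING clause that filters out groups with too few contributors.

    Naive string rewriting for illustration only: it assumes the query
    ends with its GROUP BY clause and has no HAVING, ORDER BY, or LIMIT.
    """
    return f"{group_query} HAVING COUNT(DISTINCT {key_column}) >= {k}"


def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (patient_id INTEGER, zip TEXT, cost REAL)")
    conn.executemany(
        "INSERT INTO visits VALUES (?, ?, ?)",
        [
            (1, "10001", 100.0),
            (2, "10001", 200.0),
            (3, "10001", 300.0),
            (4, "10002", 500.0),  # a group built from a single patient
        ],
    )

    # A query an agent might generate: valid SQL, but the 10002 group
    # exposes one patient's cost directly.
    agent_query = "SELECT zip, AVG(cost) FROM visits GROUP BY zip"

    unsafe = conn.execute(agent_query + " ORDER BY zip").fetchall()
    safe = conn.execute(
        rewrite_with_policy(agent_query, "patient_id", MIN_CONTRIBUTORS)
    ).fetchall()
    conn.close()
    return unsafe, safe


if __name__ == "__main__":
    unsafe_rows, safe_rows = run_demo()
    print("without policy:", unsafe_rows)
    print("with policy:   ", safe_rows)
```

In this sketch the check is an aggregate over the source rows feeding each output row and is evaluated by the engine as part of the query, so no separate record of which rows contributed needs to be stored.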
