AI-for-Science Low-code Platform with Bayesian Adversarial Multi-Agent Framework
1️⃣ One-sentence summary
This paper proposes a multi-agent low-code platform that combines Bayesian inference with adversarial testing. By having multiple AI agents collaborate with and challenge one another, it automatically generates scientific computing code that is more reliable and better aligned with user requirements, lowering the barrier for non-programming experts while improving code quality.
Large Language Models (LLMs) demonstrate potential for automating scientific code generation but face challenges in reliability, error propagation in multi-agent workflows, and evaluation in domains with ill-defined success metrics. We present a Bayesian adversarial multi-agent framework specifically designed for AI for Science (AI4S) tasks in the form of a Low-code Platform (LCP). Three LLM-based agents are coordinated under the Bayesian framework: a Task Manager that structures user inputs into actionable plans and adaptive test cases, a Code Generator that produces candidate solutions, and an Evaluator that provides comprehensive feedback. The framework employs an adversarial loop in which the Task Manager iteratively refines test cases to challenge the Code Generator, while prompt distributions are dynamically updated using Bayesian principles that integrate code quality metrics: functional correctness, structural alignment, and static analysis. This co-optimization of tests and code reduces dependence on LLM reliability and addresses the evaluation uncertainty inherent to scientific tasks. LCP also streamlines human-AI collaboration by translating non-expert prompts into domain-specific requirements, bypassing the need for manual prompt engineering by practitioners without coding backgrounds. Benchmark evaluations demonstrate LCP's effectiveness in generating robust code while minimizing error propagation. The proposed platform is also tested on an Earth Science cross-disciplinary task and demonstrates strong reliability, outperforming competing models.
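To make the adversarial loop and the Bayesian prompt update more concrete, here is a minimal Python sketch. It is an illustrative assumption, not the paper's implementation: the three agents are stubbed as placeholder functions, the prompt distribution is modeled as Dirichlet-style pseudo-counts, and the metric weights in `combined_score` are invented for demonstration.

```python
# Minimal sketch (assumed design, not the paper's code): a Bayesian adversarial
# loop over three stubbed agents. Prompt-template weights are kept as
# Dirichlet-style pseudo-counts updated from a combined quality score
# (functional correctness, structural alignment, static analysis).
import random

PROMPT_TEMPLATES = [
    "Write concise, vectorized code for: {task}",
    "Write defensively checked code with explicit units for: {task}",
    "Write step-by-step commented code for: {task}",
]

def task_manager(task, round_idx):
    """Stub: would call an LLM to refine the plan and emit adversarial tests."""
    return [f"test case {round_idx}.{i} for '{task}'" for i in range(3)]

def code_generator(prompt):
    """Stub: would call an LLM; here it returns a placeholder candidate."""
    return f"# candidate code generated from prompt: {prompt!r}"

def evaluator(candidate, tests):
    """Stub metrics in [0, 1]: functional correctness, structural alignment,
    static-analysis score. A real evaluator would run the tests and linters."""
    return random.random(), random.random(), random.random()

def combined_score(fc, sa, st, w=(0.5, 0.3, 0.2)):
    # Illustrative weighting of the three quality metrics.
    return w[0] * fc + w[1] * sa + w[2] * st

def adversarial_loop(task, rounds=5, seed=0):
    random.seed(seed)
    alpha = [1.0] * len(PROMPT_TEMPLATES)  # uniform prior over templates
    best = (None, -1.0)
    for r in range(rounds):
        tests = task_manager(task, r)  # Task Manager sharpens the test suite
        probs = [a / sum(alpha) for a in alpha]
        idx = random.choices(range(len(alpha)), weights=probs)[0]
        candidate = code_generator(PROMPT_TEMPLATES[idx].format(task=task))
        score = combined_score(*evaluator(candidate, tests))
        alpha[idx] += score  # Bayesian-style posterior update of the prompt weight
        if score > best[1]:
            best = (candidate, score)
    return best

if __name__ == "__main__":
    code, score = adversarial_loop("estimate evapotranspiration from MODIS data")
    print(f"best score: {score:.2f}\n{code}")
```

The point of the sketch is the coupling: each round the test suite is tightened before the next candidate is generated, and the scores feed back into the sampling distribution over prompts, so tests and code are co-optimized rather than fixed up front.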
Source: arXiv: 2603.03233