arXiv submission date: 2026-06-01
📄 Abstract - Evidence-Gated LLM Priors for Multi-Objective Bayesian Optimization

Large language models (LLMs) are increasingly used as heuristic advisors for black-box optimization, yet their suggestions and self-reported confidence are not necessarily calibrated to downstream objective values. This issue becomes more pronounced in multi-objective Bayesian optimization, where different objectives may require different expert knowledge and where an LLM expert can be useful for one objective but misleading for another. We study how to use LLM-generated expert priors in discrete multi-objective Bayesian optimization without blindly trusting them. We propose an objective-wise reputation-market mechanism that treats each expert-objective pair as a falsifiable prior source. Expert weights are updated online from observed objective feedback, discounted over time, and gated by market-level trust. We then introduce a decoupled counterfactual gate that can use the LLM prior without confidence, use it with confidence, or abstain from the LLM prior entirely. Across controlled synthetic stress tests and three molecule optimization benchmarks with Qwen-Flash-generated expert priors, we find that dynamic objective-wise calibration improves robustness over fixed LLM priors. However, raw LLM confidence is not reliably beneficial: on ESOL, confidence is positively correlated with prediction error; on FreeSolv, confidence can help; and on Lipophilicity, ignoring confidence remains strongest. Our fixed three-arm counterfactual gate improves over the first counterfactual variant on ESOL and FreeSolv, while an attempted margin portfolio exposes a useful negative result: margin selection should be acquisition-aware rather than based only on one-step prior error.
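
The abstract gives no formulas, so the following is only a minimal sketch of how discounted, objective-wise expert weights and a three-arm gate could look. The exponential-weights update, the function names `update_weights` and `choose_arm`, the arm labels, and the `discount` and `temperature` parameters are all assumptions made for illustration, not the paper's actual method. The market-level trust gate mentioned in the abstract is not modeled here.

```python
import math

def update_weights(cum_loss, errors, discount=0.9, temperature=1.0):
    """One online step for a single objective (hypothetical update rule).

    cum_loss: discounted cumulative loss per expert from earlier rounds.
    errors:   each expert's prior error on the newly observed objective value.
    Returns (new_cum_loss, weights), where weights sum to 1.
    """
    # Older evidence decays geometrically; the newest error counts in full.
    new_loss = [discount * c + e for c, e in zip(cum_loss, errors)]
    # Subtract the minimum before exponentiating for numerical stability.
    best = min(new_loss)
    scores = [math.exp(-(loss - best) / temperature) for loss in new_loss]
    total = sum(scores)
    return new_loss, [s / total for s in scores]

def choose_arm(arm_losses):
    """Pick the arm with the lowest discounted counterfactual loss."""
    return min(arm_losses, key=arm_losses.get)

# Two hypothetical experts on one objective: the first is consistently
# accurate (error 0.1), the second consistently misleading (error 1.0).
losses = [0.0, 0.0]
for _ in range(3):
    losses, weights = update_weights(losses, [0.1, 1.0])

# Illustrative (made-up) counterfactual losses for the three gate options.
arm = choose_arm({
    "abstain": 0.5,
    "prior_without_confidence": 0.3,
    "prior_with_confidence": 0.7,
})
```

Under these assumptions, the weight of the accurate expert grows each round while the misleading expert is discounted, and the gate simply follows whichever arm would have incurred the least recent error. Keeping one `losses` list per objective is what would make the calibration objective-wise.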

Top-level tags: llm machine learning
Detailed tags: bayesian optimization multi-objective expert priors confidence calibration robustness

Evidence-Gated LLM Priors for Multi-Objective Bayesian Optimization


1️⃣ One-Sentence Summary

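This paper proposes a dynamic calibration method for multi-objective Bayesian optimization that uses observed objective feedback to assess how reliable each of a large language model's suggestions is, avoiding blind trust in its expert priors and thereby improving optimization robustness.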

Source: arXiv: 2606.01730