arXiv submission date: 2026-02-10
📄 Abstract - CoFEH: LLM-driven Feature Engineering Empowered by Collaborative Bayesian Hyperparameter Optimization

Feature Engineering (FE) is pivotal in automated machine learning (AutoML) but remains a bottleneck for traditional methods, which treat it as a black-box search, operating within rigid, predefined search spaces and lacking domain awareness. While Large Language Models (LLMs) offer a promising alternative by leveraging semantic reasoning to generate unbounded operators, existing methods fail to construct free-form FE pipelines, remaining confined to isolated subtasks such as feature generation. Most importantly, FE is rarely optimized jointly with hyperparameter optimization (HPO) of the ML model, leading to greedy "FE-then-HPO" workflows that cannot capture strong FE-HPO interactions. In this paper, we present CoFEH, a collaborative framework that interleaves LLM-based FE and Bayesian HPO for robust end-to-end AutoML. CoFEH uses an LLM-driven FE optimizer powered by Tree of Thought (ToT) to explore flexible FE pipelines, a Bayesian optimization (BO) module to solve HPO, and a dynamic optimizer selector that realizes interleaved optimization by adaptively scheduling FE and HPO steps. Crucially, we introduce a mutual conditioning mechanism that shares context between LLM and BO, enabling mutually informed decisions. Experiments show that CoFEH not only outperforms traditional and LLM-based FE baselines, but also achieves superior end-to-end performance under joint optimization.
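The core idea is the interleaved FE/HPO loop with a shared context. The runnable toy sketch below illustrates only that control flow, using deliberate stand-ins for the paper's components: a fixed candidate set of scikit-learn transformers stands in for LLM/ToT-generated pipelines, random search stands in for the BO module, and a simple stall heuristic stands in for the dynamic optimizer selector. All names are illustrative, not CoFEH's actual API.

```python
# Toy interleaved FE/HPO loop. Stand-ins (NOT the paper's implementation):
# - FE_CANDIDATES mimics LLM/ToT-proposed feature-engineering pipelines
# - suggest_hparams() (random search) stands in for the BO module
# - the stall check stands in for the dynamic optimizer selector
# - `history` plays the role of the shared context (mutual conditioning)

import random
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

random.seed(0)
X, y = load_breast_cancer(return_X_y=True)

FE_CANDIDATES = [  # stand-in for free-form, LLM-generated FE operators
    [],
    [StandardScaler()],
    [PolynomialFeatures(degree=2, include_bias=False)],
    [StandardScaler(), PolynomialFeatures(degree=2, include_bias=False)],
]

def suggest_hparams():  # random search as a stand-in for Bayesian optimization
    return {"n_estimators": random.choice([50, 100, 200]),
            "max_depth": random.choice([3, 5, None])}

def evaluate(fe_steps, hparams):
    model = RandomForestClassifier(**hparams, random_state=0)
    return cross_val_score(make_pipeline(*fe_steps, model), X, y, cv=3).mean()

history = []                         # shared context read by both optimizers
fe_steps, hparams = [], suggest_hparams()
best_score, best_cfg = -np.inf, None

for step in range(12):
    # Dynamic selector: take an FE step when recent HPO steps stop improving,
    # otherwise keep tuning hyperparameters.
    recent = [h["score"] for h in history[-3:]]
    if len(recent) == 3 and max(recent) <= best_score:
        fe_steps = random.choice(FE_CANDIDATES)  # "LLM" proposes a new pipeline
    else:
        hparams = suggest_hparams()              # "BO" proposes new hyperparameters

    score = evaluate(fe_steps, hparams)
    history.append({"step": step, "fe": [type(s).__name__ for s in fe_steps],
                    "hparams": hparams, "score": score})
    if score > best_score:
        best_score = score
        best_cfg = ([type(s).__name__ for s in fe_steps], dict(hparams))

print(f"best CV accuracy {best_score:.4f} with FE={best_cfg[0]} and {best_cfg[1]}")
```

In CoFEH itself, the FE proposals come from an LLM exploring a Tree of Thought over pipeline edits and the HPO steps from a surrogate-based BO; the point of the sketch is only that both optimizers read and write the same history, so each step is conditioned on the other's progress.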

Top tags: llm, machine learning, model training
Detailed tags: feature engineering, hyperparameter optimization, automl, bayesian optimization, tree of thought

CoFEH: LLM-driven Feature Engineering Empowered by Collaborative Bayesian Hyperparameter Optimization


1️⃣ One-sentence summary

This paper proposes a collaborative framework called CoFEH, in which LLM-driven feature engineering and Bayesian hyperparameter optimization alternate and share information with each other, achieving better end-to-end AutoML performance than traditional methods.

Source: arXiv: 2602.09851