arXiv submission date: 2026-02-18
📄 Abstract - Causally-Guided Automated Feature Engineering with Multi-Agent Reinforcement Learning

Automated feature engineering (AFE) enables AI systems to autonomously construct high-utility representations from raw tabular data. However, existing AFE methods rely on statistical heuristics, yielding brittle features that fail under distribution shift. We introduce CAFE, a framework that reformulates AFE as a causally-guided sequential decision process, bridging causal discovery with reinforcement learning-driven feature construction. Phase I learns a sparse directed acyclic graph over features and the target to obtain soft causal priors, grouping features as direct, indirect, or other based on their causal influence with respect to the target. Phase II uses a cascading multi-agent deep Q-learning architecture to select causal groups and transformation operators, with hierarchical reward shaping and causal group-level exploration strategies that favor causally plausible transformations while controlling feature complexity. Across 15 public benchmarks (classification with macro-F1; regression with inverse relative absolute error), CAFE achieves up to 7% improvement over strong AFE baselines, reduces episodes-to-convergence, and delivers competitive time-to-target. Under controlled covariate shifts, CAFE reduces performance drop by ~4x relative to a non-causal multi-agent baseline, and produces more compact feature sets with more stable post-hoc attributions. These findings underscore that causal structure, used as a soft inductive prior rather than a rigid constraint, can substantially improve the robustness and efficiency of automated feature engineering.
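The Phase I grouping step can be illustrated with a small sketch. The abstract does not say how the three groups are derived from the learned graph, so the rule used here is an assumption: "direct" means a parent of the target in the DAG, "indirect" means any other ancestor of the target, and "other" covers everything else. The function name `group_features` and the toy column names are hypothetical, and the edge list is written by hand rather than learned by a causal discovery method.

```python
from collections import deque


def group_features(features, edges, target):
    """Partition features by their causal relation to `target` in a DAG.

    Assumed rule (not taken from the paper):
      direct   = parents of the target
      indirect = ancestors of the target that are not parents
      other    = all remaining features
    `edges` is an iterable of (cause, effect) pairs.
    """
    # Map each node to the set of its parents.
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)

    direct = set(parents.get(target, ()))

    # Breadth-first walk backwards from the target to collect all ancestors.
    ancestors = set()
    queue = deque(direct)
    while queue:
        node = queue.popleft()
        if node in ancestors:
            continue
        ancestors.add(node)
        queue.extend(parents.get(node, ()))

    indirect = ancestors - direct
    other = set(features) - ancestors - {target}
    return {"direct": direct, "indirect": indirect, "other": other}


# Toy example with a hand-written graph (hypothetical column names).
features = ["age", "education", "income", "balance", "zip"]
edges = [
    ("age", "income"),
    ("education", "income"),
    ("income", "default"),
    ("balance", "default"),
]
groups = group_features(features, edges, target="default")
for name in ("direct", "indirect", "other"):
    print(name, sorted(groups[name]))
```

In a pipeline like the one described above, these groups would be the units that the Phase II agents select from before choosing a transformation operator. Because the abstract describes the causal prior as soft, a fuller version would presumably attach weights to the groups instead of a hard partition.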

Top-level tags: multi-agents, machine learning, model training
Detailed tags: automated feature engineering, causal discovery, multi-agent reinforcement learning, robustness, distribution shift

Causally-Guided Automated Feature Engineering with Multi-Agent Reinforcement Learning


1️⃣ One-sentence summary

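This paper proposes CAFE, a new framework that combines causal discovery with multi-agent reinforcement learning so that AI systems can automatically construct more robust and more efficient features, maintaining good predictive performance even when the data distribution shifts.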

Source: arXiv: 2602.16435