arXiv submission date: 2026-05-26
📄 Abstract - Symbolic Regression via Latent Iterative Refinement

Symbolic regression (SR) seeks closed-form mathematical expressions that fit observed data. Neural SR methods amortize the search by training an encoder to map observations directly to expressions in a single pass, but this amortized inference leaves a residual amortization gap between its one-shot prediction and the true posterior. We propose Latent Equation Embedding (LEE), a framework that closes this gap through iterative amortized inference in a functionally grounded latent space. LEE learns a shared latent space Z equipped with three components: an encoder f_theta that jointly embeds symbolic tokens and numerical observations into a single latent vector z; an expression decoder g_expr that reconstructs formulas from z; and an evaluation decoder g_eval that predicts function values from z, explicitly grounding the latent space in functional behavior. At inference, LEE performs iterative refinement by re-encoding decoded expressions jointly with observations, progressively improving the latent estimate. LEE uses the encoder itself as a learned inference optimizer: each re-encoding step implicitly computes the mismatch between the candidate and the data. Because g_eval is differentiable in z, we additionally interleave continuous gradient descent with discrete re-encoding, yielding a hybrid iterative and gradient refinement procedure. On SRBench across three noise levels, against 19 baselines spanning genetic programming, symbolic-neural hybrids, and pre-trained Transformers, LEE produces expressions 2--10x simpler than the strongest accuracy-oriented baselines, including Operon, GP-GOMEA, TPSR, RAG-SR, and GenSR, with complexity 8--11 versus 20--90. These results advance the low-complexity region of the accuracy-complexity Pareto frontier and show graceful degradation as noise increases.
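The hybrid refinement loop described in the abstract can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation: the hypothesis class is restricted to `a*x + b`, the latent vector is just the coefficient pair, and `encode`, `g_expr`, `g_eval`, `grad_step`, and `refine` are hypothetical hand-written stand-ins for the learned components. The sketch only shows the control flow: decode a discrete candidate, re-encode it jointly with the observations, then take a few gradient steps on the latent through the differentiable evaluation decoder.

```python
# Toy stand-ins only: LEE's f_theta, g_expr, and g_eval are learned networks;
# here they are hand-written functions over the family a*x + b (an assumption).

def g_eval(z, xs):
    """Evaluation decoder stand-in: predict function values from latent z."""
    a, b = z
    return [a * x + b for x in xs]

def g_expr(z):
    """Expression decoder stand-in: discretize z into a candidate (a, b)."""
    return (round(z[0], 2), round(z[1], 2))

def encode(expr, xs, ys):
    """Encoder stand-in: embed a candidate jointly with the observations.

    The returned latent is the candidate shifted halfway along a residual
    correction, mimicking an encoder that implicitly measures the mismatch
    between candidate and data. Assumes xs is not all zeros.
    """
    a, b = expr if expr is not None else (0.0, 0.0)
    res = [y - (a * x + b) for x, y in zip(xs, ys)]
    n = len(xs)
    mean_xx = sum(x * x for x in xs) / n
    da = sum(r * x for r, x in zip(res, xs)) / n / mean_xx
    db = sum(res) / n
    return [a + 0.5 * da, b + 0.5 * db]

def mse(z, xs, ys):
    """Mean squared error between g_eval(z) and the observations."""
    preds = g_eval(z, xs)
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)

def grad_step(z, xs, ys, lr=0.05):
    """One gradient-descent step on z using the analytic gradient of mse."""
    preds = g_eval(z, xs)
    n = len(xs)
    ga = 2.0 * sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / n
    gb = 2.0 * sum(p - y for p, y in zip(preds, ys)) / n
    return [z[0] - lr * ga, z[1] - lr * gb]

def refine(xs, ys, n_rounds=8, n_grad=3):
    """Interleave discrete re-encoding with continuous gradient refinement."""
    z = encode(None, xs, ys)          # one-shot amortized prediction
    for _ in range(n_rounds):
        expr = g_expr(z)              # decode a discrete candidate
        z = encode(expr, xs, ys)      # re-encode candidate with the data
        for _ in range(n_grad):
            z = grad_step(z, xs, ys)  # gradient refinement through g_eval
    return g_expr(z), z

if __name__ == "__main__":
    xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
    ys = [2.0 * x + 1.0 for x in xs]
    expr, z = refine(xs, ys)
    print("one-shot MSE:", mse(encode(None, xs, ys), xs, ys))
    print("refined MSE:", mse(z, xs, ys))
    print("decoded candidate (a, b):", expr)
```

In this toy setting each round shrinks the gap left by the one-shot prediction, which is the role the abstract assigns to iterative amortized inference. The rounding in `g_expr` stands in for the discreteness of real symbolic decoding.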

Top-level tags: machine learning theory
Detailed tags: symbolic regression, iterative refinement, latent space, amortized inference, complexity

Symbolic Regression via Latent Iterative Refinement


1️⃣ One-sentence summary

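This paper proposes Latent Equation Embedding (LEE), a neural symbolic regression method that combines iterative inference with gradient-based refinement in a functionally grounded latent space; on SRBench it produces markedly simpler expressions than the strongest accuracy-oriented baselines (complexity 8-11 versus 20-90), advancing the low-complexity region of the accuracy-complexity Pareto frontier and degrading gracefully as data noise increases.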

Source: arXiv: 2605.27245