How Good Can Linear Models Be for Time-Series Forecasting?
1️⃣ One-Sentence Summary
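By tuning the preprocessing around Ridge regression (context length, normalization, and regularization, among other choices), this paper finds that simple linear models can outperform complex Transformer, MLP, and CNN models on several standard benchmarks, challenging the conventional view that bigger models perform better.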
Time-series forecasting research has been moving steadily toward larger architectures, from specialized transformers to general-purpose foundation models, on the assumption that capacity is what unlocks accuracy. We take the opposite position: most of the gap can be closed at far lower cost by tuning preprocessing rather than scaling models. We use Ridge regression as the testbed, since it has a closed-form solution and interpretable weights, which let the optimal hyperparameters be read off the search directly. We search over context length, local normalization, regularization, and augmentation on eight standard benchmarks and find three patterns. (1) Optimal lookback is strongly series-specific and often non-monotonic in forecast horizon, with fitted power-law exponents ranging from $+0.46$ on ETTm2 to $-0.19$ on Exchange and Traffic, challenging the convention that longer horizons need longer history. (2) Normalizing over a learned trailing fraction of the context, rather than its entirety, is almost universally preferred. (3) Series within the same dataset often disagree on hyperparameters; the optimal degree of cross-series sharing varies from fully shared to fully per-series. The resulting models beat prior linear forecasters on most dataset-horizon entries and exceed Transformer, MLP, and CNN baselines on six of eight benchmarks. The optimized hyperparameters also serve as a diagnostic on the data itself, revealing structures that larger models absorb silently into their learned parameters.
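The pipeline the abstract describes can be illustrated with a minimal sketch: a closed-form Ridge fit on sliding windows, with each window normalized by statistics taken over a trailing fraction of its context. This is not the authors' implementation. The function names (`make_windows`, `trailing_stats`, `fit`, `forecast`), the use of a z-score with the tail mean and standard deviation, and the synthetic series are all assumptions made for illustration; the paper may define its normalization and search procedure differently.

```python
import numpy as np

def make_windows(series, context, horizon):
    """Slice a 1-D series into (context, horizon) pairs."""
    n = len(series) - context - horizon + 1
    X = np.stack([series[i:i + context] for i in range(n)])
    Y = np.stack([series[i + context:i + context + horizon] for i in range(n)])
    return X, Y

def trailing_stats(X, frac):
    """Mean and std over the last `frac` of each context window (assumed scheme)."""
    k = max(1, int(round(frac * X.shape[1])))
    tail = X[:, -k:]
    mu = tail.mean(axis=1, keepdims=True)
    sigma = tail.std(axis=1, keepdims=True) + 1e-8  # avoid division by zero
    return mu, sigma

def fit(series, context, horizon, frac, lam):
    """Closed-form Ridge: W = (X^T X + lam * I)^{-1} X^T Y on normalized windows."""
    X, Y = make_windows(series, context, horizon)
    mu, sigma = trailing_stats(X, frac)
    Xn, Yn = (X - mu) / sigma, (Y - mu) / sigma  # targets share the context's stats
    return np.linalg.solve(Xn.T @ Xn + lam * np.eye(context), Xn.T @ Yn)

def forecast(W, context_window, frac):
    """Normalize one context window, predict, then undo the normalization."""
    x = context_window[None, :]
    mu, sigma = trailing_stats(x, frac)
    return (((x - mu) / sigma) @ W * sigma + mu)[0]

# Toy usage on a synthetic series (not data from the paper).
t = np.arange(600)
series = 5.0 + np.sin(2 * np.pi * t / 24)
W = fit(series[:500], context=48, horizon=12, frac=0.5, lam=1e-2)
pred = forecast(W, series[452:500], frac=0.5)
```

In this sketch `context`, `frac`, and `lam` are exactly the kind of hyperparameters the abstract says are searched per dataset or per series. Because each fit is a single linear solve, a grid search over them stays cheap.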
Source: arXiv: 2606.27282