arXiv submission date: 2026-06-17
📄 Abstract - Optimal scenario design for climate emulation

As deep learning for physical systems continues to grow in popularity, efforts to improve generalizability have primarily focused on designing architectures that embed physical constraints. However, for machine-learning surrogate climate models (emulators), we show that the low structural diversity in existing scenarios commonly used to generate training data places a ceiling on predictive skill. Here, we examine whether training datasets themselves can be optimized to improve generalization. We introduce a method to create datasets that produce emulators capable of generalizing to new, structurally different scenarios absent from the training data. We use a differentiable Simple Climate Model (SCM) to calculate the sensitivity of emulator loss to perturbations in the training data, iteratively updating the training data to maximize emulator skill. For an SCM, training on one scenario optimized in this fashion outperforms an emulator trained on six standard ScenarioMIP pathways. We achieve this higher predictive skill despite training on a smaller dataset, finding that our emulator successfully isolates distinct physical behaviors of different climate forcing agents (e.g., greenhouse gases vs. aerosols) without single-forcing runs. We then demonstrate that scenarios optimized using an SCM, when used to drive an intermediate-complexity climate model, produce a training dataset that yields a more skillful emulator than training on ScenarioMIP outputs. Our results suggest that, in the compute-constrained environment of running full-scale climate models, generating a small number of dynamically rich scenarios provides greater marginal value for emulation and characterizing system responses than expanding the suite of traditional emissions pathways.
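The loop the abstract describes (compute the sensitivity of emulator loss to the training data, then update the training data) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the one-box energy-balance `scm`, its parameter values, the linear ridge-regression emulator, the three held-out scenarios, and all function names are hypothetical. Central finite differences stand in for the automatic differentiation that a differentiable SCM would provide.

```python
import numpy as np

N_YEARS = 20  # length of each toy scenario (assumed)

def scm(forcing, lam=1.2, heat_cap=8.0, aerosol_efficacy=1.5):
    """Toy one-box energy-balance model; a hypothetical stand-in for an SCM.
    forcing has shape (n_years, 2) with columns [GHG, aerosol]."""
    eff = forcing[:, 0] + aerosol_efficacy * forcing[:, 1]
    temp = np.zeros(len(eff))
    t = 0.0
    for i, f in enumerate(eff):
        t += (f - lam * t) / heat_cap  # explicit Euler step
        temp[i] = t
    return temp

def features(forcing):
    # Emulator inputs: instantaneous and cumulative forcing per agent.
    return np.hstack([forcing, np.cumsum(forcing, axis=0)])

def fit_emulator(forcing, ridge=1e-3):
    # Inner problem: ridge regression from features to SCM temperature.
    X, y = features(forcing), scm(forcing)
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ y)

def held_out_scenarios():
    # Structurally different scenarios never used for fitting (assumed shapes).
    t = np.linspace(0.0, 1.0, N_YEARS)
    step = np.column_stack([np.where(t > 0.5, 3.0, 0.0), np.zeros(N_YEARS)])
    pulse = np.column_stack([np.ones(N_YEARS),
                             np.where((t > 0.3) & (t < 0.6), -1.0, 0.0)])
    ramps = np.column_stack([4.0 * t, -1.0 + t])
    return [step, pulse, ramps]

TEST_SET = held_out_scenarios()

def outer_loss(flat):
    # Outer problem: held-out error of the emulator trained on this scenario.
    w = fit_emulator(flat.reshape(N_YEARS, 2))
    errs = [np.mean((features(s) @ w - scm(s)) ** 2) for s in TEST_SET]
    return float(np.mean(errs))

def fd_grad(f, x, h=1e-5):
    # Central finite differences, used here in place of autodiff.
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def optimize(x0, n_steps=30, lr=1.0):
    # Gradient descent on the training scenario with backtracking:
    # a step is accepted only if the held-out loss strictly decreases.
    x, cur = x0.copy(), outer_loss(x0)
    for _ in range(n_steps):
        g = fd_grad(outer_loss, x)
        step = lr
        while step > 1e-8:
            cand = x - step * g
            new = outer_loss(cand)
            if new < cur:
                x, cur = cand, new
                break
            step *= 0.5
    return x, cur

# Starting scenario: a GHG ramp with a weak aerosol trend (assumed).
t_axis = np.linspace(0.0, 1.0, N_YEARS)
initial = np.column_stack([3.0 * t_axis, -0.3 * t_axis ** 2]).ravel()
initial_loss = outer_loss(initial)
optimized, final_loss = optimize(initial)
print(f"held-out MSE before: {initial_loss:.3e}, after: {final_loss:.3e}")
```

Because each step is accepted only when the held-out loss drops, the final loss in this sketch can never exceed the starting loss. How much it improves depends entirely on the toy choices above and says nothing about the paper's reported results.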

Top-level tags: machine learning climate model training
Detailed tags: scenario design generalization emulators optimization data augmentation

Optimal scenario design for climate emulation


1️⃣ One-sentence summary

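This paper proposes a method for improving the generalization of AI climate emulators by optimizing the training data itself, that is, the climate change scenarios. It finds that an emulator trained on a single carefully optimized scenario predicts better than one trained on six standard scenarios, which points to a new way of building high-quality climate surrogate models efficiently when compute is limited.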

Source: arXiv: 2606.19302