arXiv submission date: 2026-05-19
📄 Abstract - StruMPL: Multi-task Dense Regression under Disjoint Partial Supervision and MNAR Labels

Estimating forest aboveground biomass (AGB) from Earth observation combines two structurally incompatible label sources: spaceborne lidar provides canopy structure at millions of locations but no biomass estimate, and ground-based plots provide biomass at thousands of biased locations but no metrics of structure. No single training sample carries labels for all target variables, plot labels are missing not at random (MNAR), and biomass is linked to the structural variables by known but biome-specific allometric laws. We formalise this as multi-task dense regression under heterogeneous disjoint partial supervision with MNAR labels and inter-task physical constraints, and propose StruMPL to address it jointly. A shared encoder feeds per-variable regression, imputation, and propensity heads for spatial MNAR correction, and a learnable physics module that evaluates the inter-task constraint on the model's own predictions at every pixel. The supervised loss uses an Augmented IPW (AIPW) pseudo-outcome with stop-gradients on the propensity and on the imputation baseline; we show analytically and empirically that both are necessary for joint optimisation to recover IPW-weighted stationary points while keeping the loss bounded. On two ecologically distinct biomes, StruMPL outperforms ablation variants and the closest published method on AGB RMSE and bias, with a stratified analysis showing AIPW reduces high-AGB bias by ~54%.
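As a rough illustration of the AIPW pseudo-outcome mentioned in the abstract, the sketch below uses the textbook doubly robust form, m + r (y - m) / pi, where r marks pixels that carry a plot label, pi is the estimated propensity, and m is the imputation baseline. This is an assumption about the general shape of the estimator, not the paper's exact loss. The function names, the clipping floor `eps`, and the toy numbers are all hypothetical.

```python
from typing import List, Optional, Sequence


def aipw_pseudo_outcome(
    y: Sequence[Optional[float]],
    observed: Sequence[bool],
    propensity: Sequence[float],
    imputed: Sequence[float],
    eps: float = 1e-3,  # hypothetical floor on the propensity, not from the paper
) -> List[float]:
    """Textbook AIPW pseudo-outcome: m + r * (y - m) / pi, per pixel.

    In an autodiff framework, `propensity` and `imputed` would be wrapped in a
    stop-gradient (e.g. jax.lax.stop_gradient or tensor.detach()) so that only
    the regression head is trained through this target. Here they are plain
    floats, so no gradient can flow through them by construction.
    """
    pseudo = []
    for y_i, r_i, pi_i, m_i in zip(y, observed, propensity, imputed):
        if r_i and y_i is not None:
            # Labelled pixel: correct the imputation by the inverse-propensity
            # weighted residual. Clipping pi keeps the weight finite.
            pseudo.append(m_i + (y_i - m_i) / max(pi_i, eps))
        else:
            # Unlabelled pixel: fall back to the imputation baseline.
            pseudo.append(m_i)
    return pseudo


def squared_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between predictions and pseudo-outcomes."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Toy example: three pixels, the middle one has no plot label.
agb_labels = [100.0, None, 300.0]
has_label = [True, False, True]
pi_hat = [0.5, 0.2, 0.25]
m_hat = [80.0, 120.0, 200.0]

targets = aipw_pseudo_outcome(agb_labels, has_label, pi_hat, m_hat)
loss = squared_loss([120.0, 120.0, 600.0], targets)
```

In this toy case the rarely sampled third pixel (propensity 0.25) receives a residual weight of 4, which is how inverse-propensity weighting compensates for under-sampled regions. The clipping floor is one simple way to keep the weights finite; the paper's own boundedness argument relies on the stop-gradients and may differ.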

Top-level tags: machine learning, data, multi-modal
Detailed tags: multi-task regression, partial supervision, MNAR labels, physics-informed model, aboveground biomass estimation

StruMPL: Multi-task Dense Regression under Disjoint Partial Supervision and MNAR Labels


1️⃣ One-sentence summary

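This paper proposes StruMPL, a model that combines a shared encoder, spatial propensity correction, and a learnable physics-constraint module to address, for the first time, the label mismatch and selection bias between two heterogeneous data sources in forest biomass estimation; a stratified analysis shows that its AIPW loss reduces bias in high-AGB regions by about 54%.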

Source: arXiv: 2605.19931