arXiv submission date: 2026-02-26
📄 Abstract - Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms

Coarse data arise when learners observe only partial information about samples; namely, a set containing the sample rather than its exact value. This occurs naturally through measurement rounding, sensor limitations, and lag in economic systems. We study Gaussian mean estimation from coarse data, where each true sample $x$ is drawn from a $d$-dimensional Gaussian distribution with identity covariance, but is revealed only through the set of a partition containing $x$. When the coarse samples, roughly speaking, have "low" information, the mean cannot be uniquely recovered from observed samples (i.e., the problem is not identifiable). Recent work by Fotakis, Kalavasis, Kontonis, and Tzamos [FKKT21] established that sample-efficient mean estimation is possible when the unknown mean is identifiable and the partition consists of only convex sets. Moreover, they showed that without convexity, mean estimation becomes NP-hard. However, two fundamental questions remained open: (1) When is the mean identifiable under convex partitions? (2) Is computationally efficient estimation possible under identifiability and convex partitions? This work resolves both questions. [...]
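
To make the observation model concrete, here is a minimal Python sketch. It is an illustration only, not the paper's algorithm. It assumes a simple convex partition (an axis-aligned grid, which models measurement rounding) and contrasts it with a partition into slabs along the first coordinate, under which the mean is not identifiable because the slab probabilities do not depend on the remaining coordinates. All function names (`coarsen_grid`, `sample_coarse`, `grid_cell_probability`, `slab_probability`) are hypothetical and chosen for this example.

```python
import math
import random


def normal_cdf(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def coarsen_grid(x, width=1.0):
    """Return the index of the grid cell (a convex box) containing x."""
    return tuple(math.floor(xi / width) for xi in x)


def sample_coarse(mean, n, width=1.0, seed=0):
    """Draw n samples from N(mean, I) and reveal only their grid cells."""
    rng = random.Random(seed)
    cells = []
    for _ in range(n):
        x = [rng.gauss(m, 1.0) for m in mean]  # identity covariance
        cells.append(coarsen_grid(x, width))
    return cells


def grid_cell_probability(cell, mean, width=1.0):
    """Probability that a draw from N(mean, I) lands in the given grid cell."""
    p = 1.0
    for k, m in zip(cell, mean):
        lo, hi = k * width, (k + 1) * width
        p *= normal_cdf(hi - m) - normal_cdf(lo - m)
    return p


def slab_probability(k, mean, width=1.0):
    """Probability of the slab {x : k*width <= x[0] < (k+1)*width}.

    Only mean[0] enters the formula, so two means that differ only in
    later coordinates induce the same distribution over slabs.
    """
    lo, hi = k * width, (k + 1) * width
    return normal_cdf(hi - mean[0]) - normal_cdf(lo - mean[0])


if __name__ == "__main__":
    print(sample_coarse([0.3, -1.2], n=5))
    # Same slab probability for two different means: not identifiable.
    print(slab_probability(0, [0.3, 0.0]), slab_probability(0, [0.3, 5.0]))
    # The grid partition does distinguish these two means.
    print(grid_cell_probability((0, 0), [0.3, 0.0]),
          grid_cell_probability((0, 0), [0.3, 5.0]))
```

In the slab example, the observations carry no information about the second coordinate of the mean, which is one simple way a convex partition can fail identifiability; the grid partition does not have this degeneracy.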

Top-level tags: theory, machine learning, model evaluation
Detailed tags: mean estimation, coarse data, identifiability, convex partitions, Gaussian distribution

Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms


1️⃣ One-sentence summary

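This paper resolves two core questions about efficiently estimating the mean of a Gaussian distribution from coarse observations (where only the set containing each sample is observed, not its exact value): it characterizes when the mean is uniquely identifiable, and it gives an efficient estimation algorithm under that condition.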

Source: arXiv: 2602.23341