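Fractionally Supervised Classification with Maxima Nominated Samples
1️⃣ One-sentence summary
This paper proposes a fractionally supervised classification method for data collected by maxima nomination sampling. It introduces latent variables that model the observed maximum and the composition of the set it was drawn from, which corrects the model misspecification that arises when conventional methods ignore the rank information in the sample, and it significantly improves classification performance in settings such as rare-event detection.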
Fractionally supervised classification (FSC) offers a flexible framework for combining labeled and unlabeled data in model-based classification, but existing formulations assume simple random sampling. In many applications, however, the retained observation is an extreme order statistic from a set rather than a randomly selected unit. Such designs are particularly appealing when the target population is rare, since maxima nomination sampling (NS) can enrich the sample with the most informative observations, as in screening, environmental monitoring, repeated testing, and reliability studies. Under such designs, the likelihood function changes fundamentally, and the usual FSC EM construction is no longer valid. We develop FSC for nominated samples by introducing a latent representation that accounts for both the class membership of the observed maximum and the latent composition of the remaining units in the set. The resulting method yields a proper EM algorithm and a coherent weighted-likelihood FSC procedure for NS data. We present the methodology in general form, illustrate it for a rare-event contamination normal mixture model, and show through simulation that it substantially improves on the misspecified alternative that ignores the extra rank information in such data. A real-data analysis demonstrates its practical value.
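To make the change in likelihood concrete, the following is a minimal sketch and not the paper's EM or weighted-likelihood procedure. It relies only on the standard fact that the maximum of k i.i.d. draws with density f and CDF F has density k·F(y)^(k-1)·f(y). The two-component normal mixture with a common standard deviation, the known and equal set size `k`, the fixed component means, the grid search over the rare-class proportion `pi`, and all function names are assumptions made for illustration.

```python
import math
import random


def norm_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def norm_cdf(y, mu, sigma):
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def mix_pdf(y, pi, mu0, mu1, sigma):
    # Two-component normal mixture; pi is the (rare) class-1 proportion.
    return (1.0 - pi) * norm_pdf(y, mu0, sigma) + pi * norm_pdf(y, mu1, sigma)


def mix_cdf(y, pi, mu0, mu1, sigma):
    return (1.0 - pi) * norm_cdf(y, mu0, sigma) + pi * norm_cdf(y, mu1, sigma)


def max_pdf(y, k, pi, mu0, mu1, sigma):
    # Density of the maximum of k i.i.d. mixture draws: k * F(y)^(k-1) * f(y).
    F = mix_cdf(y, pi, mu0, mu1, sigma)
    return k * F ** (k - 1) * mix_pdf(y, pi, mu0, mu1, sigma)


def srs_loglik(ys, pi, mu0, mu1, sigma):
    # Misspecified: treats each retained maximum as a simple random draw.
    return sum(math.log(mix_pdf(y, pi, mu0, mu1, sigma)) for y in ys)


def ns_loglik(ys, k, pi, mu0, mu1, sigma):
    # Accounts for the fact that each observation is the maximum of its set.
    return sum(math.log(max_pdf(y, k, pi, mu0, mu1, sigma)) for y in ys)


def draw_maxima(n, k, pi, mu0, mu1, sigma, rng):
    # Simulate n sets of size k and keep only the maximum of each set.
    maxima = []
    for _ in range(n):
        units = [rng.gauss(mu1 if rng.random() < pi else mu0, sigma)
                 for _ in range(k)]
        maxima.append(max(units))
    return maxima


def grid_mle(loglik, grid):
    # Crude one-parameter maximiser: the grid point with the largest log-likelihood.
    return max(grid, key=loglik)


if __name__ == "__main__":
    rng = random.Random(0)
    k, pi_true, mu0, mu1, sigma = 5, 0.05, 0.0, 3.0, 1.0  # illustrative values
    ys = draw_maxima(2000, k, pi_true, mu0, mu1, sigma, rng)
    grid = [i / 200 for i in range(1, 200)]
    pi_ns = grid_mle(lambda p: ns_loglik(ys, k, p, mu0, mu1, sigma), grid)
    pi_srs = grid_mle(lambda p: srs_loglik(ys, p, mu0, mu1, sigma), grid)
    print("NS-aware estimate of pi:", pi_ns)
    print("SRS (misspecified) estimate of pi:", pi_srs)
```

Because retained maxima are shifted toward the upper tail, one would expect the misspecified fit to overstate the rare-class proportion, while the fit based on the density of the maximum should stay closer to the value used in the simulation. Setting `k = 1` makes the two log-likelihoods coincide, which is a quick sanity check on the sketch.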
Source: arXiv: 2604.25145