arXiv submission date: 2026-06-08
📄 Abstract - MAAM: Anchor-Preserving Compression and Contextual Calibration for Chinese Discriminatory Language Detection

Chinese discriminatory-language detection is challenging because harmful intent is often implicit and context-dependent. We propose MAAM (Myopia–Astigmatism Anchor Mechanism), a lightweight, model-agnostic framework inspired by functional visual blur: rather than preserving every token equally, MAAM retains discrimination-relevant semantic anchors and calibrates them with C–I–S contextual priors (Contextual Tone, Group Identity, and Stance Polarity). We also introduce ChLGBT, to our knowledge the first Chinese LGBT-focused discriminatory-language dataset, with 8,120 manually annotated samples and three ordinal labels: explicit bias, implicit bias, and emotional intensity. Across strong encoder baselines, MAAM improves all three prediction dimensions, with consistent gains in accuracy, F1, Brier score, and expected calibration error. Compared with frontier LLM baselines under zero-shot and few-shot prompting protocols, MAAM remains competitive while offering stronger compactness and stability. These results suggest that interpretable anchor preservation and contextual calibration provide a practical alternative to heavier model scaling for Chinese discriminatory-language assessment.
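The abstract reports Brier score and expected calibration error (ECE) but does not say how they are computed. As a rough illustration only, the sketch below shows one common way to compute both for a single prediction dimension, assuming the model outputs a probability vector over classes for each sample. The function names, the equal-width binning, and the choice of 10 bins are assumptions made for this example, not details taken from the paper.

```python
from typing import Sequence


def brier_score(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Multi-class Brier score: mean squared error between each predicted
    probability vector and the one-hot encoding of the true label.
    (Assumed definition; the paper may use a different variant.)"""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((p_k - (1.0 if k == y else 0.0)) ** 2 for k, p_k in enumerate(p))
    return total / len(labels)


def expected_calibration_error(
    probs: Sequence[Sequence[float]], labels: Sequence[int], n_bins: int = 10
) -> float:
    """ECE with equal-width confidence bins: the sample-weighted average of
    |accuracy - mean confidence| over non-empty bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        pred = max(range(len(p)), key=lambda k: p[k])  # predicted class (argmax)
        conf = p[pred]                                 # confidence of that prediction
        idx = min(int(conf * n_bins), n_bins - 1)      # conf == 1.0 goes in the last bin
        bins[idx].append((conf, 1.0 if pred == y else 0.0))

    n = len(labels)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(a for _, a in b) / len(b)
        ece += (len(b) / n) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # Hypothetical probabilities for three samples over three classes.
    example_probs = [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]
    example_labels = [0, 1, 0]
    print("Brier:", brier_score(example_probs, example_labels))
    print("ECE:  ", expected_calibration_error(example_probs, example_labels))
```

Lower is better for both metrics. Accuracy and F1 only score the predicted class, while these two also penalize confidence that does not match how often the prediction is correct.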

Top-level tags: natural language processing model training data
Detailed tags: bias detection chinese language anchor mechanism contextual calibration lgbt dataset

MAAM: Anchor-Preserving Compression and Contextual Calibration for Chinese Discriminatory Language Detection


1️⃣ One-sentence summary

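This paper proposes MAAM, a lightweight, model-agnostic framework that mimics the mechanism of visual blur to retain discrimination-relevant semantic key information and calibrates it with contextual priors, achieving performance comparable to large language models on Chinese discriminatory-language detection with a smaller model.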

Source: arXiv: 2606.09114