📄 Abstract - Aesthetic Alignment Risks Assimilation: How Image Generation and Reward Models Reinforce Beauty Bias and Ideological "Censorship"
Over-aligning image generation models to a generalized aesthetic preference conflicts with user intent, particularly when "anti-aesthetic" outputs are requested for artistic or critical purposes. Such alignment prioritizes developer-centered values, compromising user autonomy and aesthetic pluralism. We test this bias by constructing a wide-spectrum aesthetics dataset and evaluating state-of-the-art generation and reward models. We find that aesthetically aligned generation models frequently default to conventionally beautiful outputs, failing to respect instructions for low-quality or negative imagery. Crucially, reward models penalize anti-aesthetic images even when they perfectly match the explicit user prompt. We confirm this systemic bias through image-to-image editing and evaluation against real abstract artworks.
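The abstract does not spell out how the reward-model check is run, but the core probe is easy to picture: score a prompt-faithful "ugly" image against an off-prompt "pretty" one under the same anti-aesthetic prompt. Below is a minimal sketch (not the paper's code), assuming the open-source ImageReward package as the reward model; the image file names are hypothetical placeholders.

```python
# Minimal sketch: probing whether a text-to-image reward model penalizes
# prompt-faithful "anti-aesthetic" images. Assumes the open-source
# ImageReward package (pip install image-reward); file paths are placeholders.
import ImageReward as RM

model = RM.load("ImageReward-v1.0")

prompt = "a blurry, poorly composed, washed-out photo of a cluttered desk"
candidates = {
    "faithful_ugly.png": "follows the prompt: blurry, washed out, cluttered",
    "ignored_pretty.png": "ignores the prompt: sharp, well-lit, tidy desk",
}

# If aesthetic bias dominates prompt adherence, the off-prompt "pretty"
# image can out-score the image that actually obeys the instruction.
for path, note in candidates.items():
    score = model.score(prompt, path)
    print(f"{path:>20s}  reward={score:+.3f}  ({note})")
```

A similar comparison can be repeated across a wide-spectrum aesthetics dataset to estimate how often the reward model prefers beauty over instruction-following.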
Aesthetic Alignment Risks Assimilation: How Image Generation and Reward Models Reinforce Beauty Bias and Ideological "Censorship"
1️⃣ One-Sentence Summary
This paper argues that today's popular AI image generation models, and the reward models behind them, over-optimize for conventionally "beautiful" images that match mainstream taste. As a result, when a user asks for "anti-aesthetic" or deliberately low-quality imagery, the AI ignores the instruction and forces out a "pretty picture" anyway. This amounts to a built-in technical bias that restricts users' creative freedom and the diversity of artistic expression.