arXiv submission date: 2026-02-25
📄 Abstract - GeoDiv: Framework For Measuring Geographical Diversity In Text-To-Image Models

Text-to-image (T2I) models are rapidly gaining popularity, yet their outputs often lack geographical diversity, reinforce stereotypes, and misrepresent regions. Given their broad reach, it is critical to rigorously evaluate how these models portray the world. Existing diversity metrics either rely on curated datasets or focus on surface-level visual similarity, limiting interpretability. We introduce GeoDiv, a framework leveraging large language and vision-language models to assess geographical diversity along two complementary axes: the Socio-Economic Visual Index (SEVI), capturing economic and condition-related cues, and the Visual Diversity Index (VDI), measuring variation in primary entities and backgrounds. Applied to images generated by models such as Stable Diffusion and FLUX.1-dev across $10$ entities and $16$ countries, GeoDiv reveals a consistent lack of diversity and identifies fine-grained attributes where models default to biased portrayals. Strikingly, depictions of countries like India, Nigeria, and Colombia are disproportionately impoverished and worn, reflecting underlying socio-economic biases. These results highlight the need for greater geographical nuance in generative models. GeoDiv provides the first systematic, interpretable framework for measuring such biases, marking a step toward fairer and more inclusive generative systems. Project page: this https URL
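The abstract does not say how the Visual Diversity Index is computed. As a rough illustration only, one common way to quantify "variation" in a categorical attribute is normalized Shannon entropy over the labels a vision-language model assigns to a set of generated images. The sketch below assumes that formulation; the function name `normalized_entropy`, the attribute values, and the category count are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def normalized_entropy(labels, num_categories):
    """Shannon entropy of the label distribution, divided by log(num_categories).

    ASSUMPTION: this is a generic diversity measure used for illustration,
    not GeoDiv's published definition of VDI or SEVI.
    Returns a value in [0, 1]: 0 when every image gets the same label,
    1 when labels are spread uniformly over all possible categories.
    """
    if not labels or num_categories < 2:
        return 0.0
    total = len(labels)
    counts = Counter(labels)
    # Sum p * log(1/p) over the labels that were actually observed.
    entropy = sum((c / total) * math.log(total / c) for c in counts.values())
    return entropy / math.log(num_categories)


# Hypothetical labels for a "background" attribute over eight generated images.
collapsed = ["rural road"] * 8
varied = ["rural road", "city street", "market", "coast"] * 2

print(normalized_entropy(collapsed, num_categories=4))
print(normalized_entropy(varied, num_categories=4))
```

Under this assumed measure, a model that always produces the same background for a given country scores at the bottom of the range, which is the kind of "default" portrayal the abstract describes.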

Top-level tags: aigc, model evaluation, multi-modal
Detailed tags: text-to-image, geographical bias, diversity metrics, fairness evaluation, vision-language models

GeoDiv: Framework For Measuring Geographical Diversity In Text-To-Image Models


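1️⃣ One-sentence summary

This paper proposes GeoDiv, a new framework that uses large language models and vision-language models to systematically evaluate bias and lack of diversity in how text-to-image models (such as Stable Diffusion) depict different countries and regions, and it finds that these models tend to produce stereotyped depictions of poverty and disrepair for certain developing countries (such as India and Nigeria).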

Source: arXiv: 2602.22120