arXiv submission date: 2026-03-24
📄 Abstract - PopResume: Causal Fairness Evaluation of LLM/VLM Resume Screeners with Population-Representative Dataset

We present PopResume, a population-representative resume dataset for causal fairness auditing of LLM- and VLM-based resume screening systems. Unlike existing benchmarks that rely on manually injected demographic information and outcome-level disparities, PopResume is grounded in population statistics and preserves natural attribute relationships, enabling path-specific effect (PSE)-based fairness evaluation. We decompose the effect of a protected attribute on resume scores into two paths: the business necessity path, mediated by job-relevant qualifications, and the redlining path, mediated by demographic proxies. This distinction allows auditors to separate legally permissible from impermissible sources of disparity. Evaluating four LLMs and four VLMs on PopResume's 60.8K resumes across five occupations, we identify five representative discrimination patterns that aggregate metrics fail to capture. Our results demonstrate that PSE-based evaluation reveals fairness issues masked by outcome-level measures, underscoring the need for causally-grounded auditing frameworks in AI-assisted hiring.
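The two-path decomposition described in the abstract can be illustrated with a toy example. The sketch below is not the paper's estimator: it assumes a hand-written linear structural model with made-up coefficients, in which a binary protected attribute `A` influences the score only through a qualification mediator `Q` (business necessity path) and a proxy mediator `P` (redlining path). All function names and numbers are hypothetical and chosen only to show how switching the attribute along one path at a time separates the two effects.

```python
# Toy linear structural causal model. Every coefficient here is an
# illustrative assumption, not a value taken from PopResume.
#   A: protected attribute (0 or 1)
#   Q: job-relevant qualification (mediator on the business necessity path)
#   P: demographic proxy (mediator on the redlining path)
#   S: score assigned by a hypothetical screener


def qualification(a: int) -> float:
    """Qualification level as a function of the protected attribute."""
    return 1.0 + 0.5 * a


def proxy(a: int) -> float:
    """Demographic proxy signal as a function of the protected attribute."""
    return 2.0 * a


def score(q: float, p: float) -> float:
    """Screener score as a function of the two mediators only."""
    return 3.0 * q + 0.25 * p


def path_specific_effects(a0: int = 0, a1: int = 1) -> dict:
    """Switch A from a0 to a1 along one path at a time, holding the other at a0."""
    baseline = score(qualification(a0), proxy(a0))
    # Effect transmitted only along A -> Q -> S.
    via_q = score(qualification(a1), proxy(a0)) - baseline
    # Effect transmitted only along A -> P -> S.
    via_p = score(qualification(a0), proxy(a1)) - baseline
    # Total effect when A is switched along both paths at once.
    total = score(qualification(a1), proxy(a1)) - baseline
    return {"business_necessity": via_q, "redlining": via_p, "total": total}


if __name__ == "__main__":
    print(path_specific_effects())
```

In this toy model the two path effects sum to the total effect because the score is linear with no interaction between mediators; with a real LLM or VLM screener that additivity should not be assumed. The point of the sketch is that a single total disparity can hide how much of it flows through the proxy path, which is the kind of detail the abstract says outcome-level measures miss.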

Top-level tags: llm natural language processing model evaluation
Detailed tags: fairness evaluation causal inference resume screening dataset bias auditing

PopResume: Causal Fairness Evaluation of LLM/VLM Resume Screeners with Population-Representative Dataset


1️⃣ One-sentence summary

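This study introduces PopResume, a resume dataset grounded in real population statistics, to evaluate the fairness of AI resume screening systems from a causal perspective; it finds that conventional evaluation metrics can mask certain discrimination patterns and underscores the need for causal analysis when auditing AI-assisted hiring.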

Source: arXiv:2603.22714