arXiv submission date: 2026-01-26
📄 Abstract - From Classification to Ranking: Enhancing LLM Reasoning Capabilities for MBTI Personality Detection

Personality detection aims to measure an individual's corresponding personality traits through their social media posts. The advancements in Large Language Models (LLMs) offer novel perspectives for personality detection tasks. Existing approaches enhance personality trait analysis by leveraging LLMs to extract semantic information from textual posts as prompts, followed by training classifiers for categorization. However, accurately classifying personality traits remains challenging due to the inherent complexity of human personality and subtle inter-trait distinctions. Moreover, prompt-based methods often exhibit excessive dependency on expert-crafted knowledge without autonomous pattern-learning capacity. To address these limitations, we view personality detection as a ranking task rather than a classification and propose a corresponding reinforcement learning training paradigm. First, we employ supervised fine-tuning (SFT) to establish personality trait ranking capabilities while enforcing standardized output formats, creating a robust initialization. Subsequently, we introduce Group Relative Policy Optimization (GRPO) with a specialized ranking-based reward function. Unlike verification tasks with definitive solutions, personality assessment involves subjective interpretations and blurred boundaries between trait categories. Our reward function explicitly addresses this challenge by training LLMs to learn optimal answer rankings. Comprehensive experiments have demonstrated that our method achieves state-of-the-art performance across multiple personality detection benchmarks.
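The abstract names a ranking-based reward and GRPO but does not give their formulas. The sketch below illustrates one way such a pipeline could score outputs, under stated assumptions: the model emits a full ranking of the 16 MBTI types, the reward is the reciprocal rank of the labeled type (a hypothetical choice, not the paper's reward), and advantages are the group-normalized rewards commonly used in GRPO. The names `ranking_reward` and `group_relative_advantages` are illustrative only.

```python
from statistics import mean, pstdev

# The 16 MBTI types, built from the four dichotomies.
MBTI_TYPES = [a + b + c + d for a in "IE" for b in "NS" for c in "TF" for d in "JP"]


def ranking_reward(ranked_types, gold_type):
    """Hypothetical reward: reciprocal rank of the labeled type.

    Returns 0.0 when the output is not a permutation of all 16 types,
    which stands in for a format penalty (an assumption, not the paper's rule).
    """
    if sorted(ranked_types) != sorted(MBTI_TYPES):
        return 0.0
    return 1.0 / (ranked_types.index(gold_type) + 1)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps).

    Uses the population standard deviation; implementations differ on this detail.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    gold = "INTJ"
    # Three sampled outputs for the same post: gold ranked first,
    # gold ranked second, and a malformed (truncated) ranking.
    first = list(MBTI_TYPES)
    second = [MBTI_TYPES[1], MBTI_TYPES[0]] + MBTI_TYPES[2:]
    malformed = MBTI_TYPES[:3]

    rewards = [ranking_reward(r, gold) for r in (first, second, malformed)]
    print(rewards)
    print(group_relative_advantages(rewards))
```

With a graded reward like this, an output that places the labeled type second still receives partial credit, unlike a strict right-or-wrong classification signal. This matches the abstract's motivation of blurred boundaries between trait categories.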

Top-level tags: llm, natural language processing, model training
Detailed tags: personality detection, reinforcement learning, ranking task, supervised fine-tuning, benchmark

From Classification to Ranking: Enhancing LLM Reasoning Capabilities for MBTI Personality Detection


1️⃣ One-sentence summary

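This paper proposes a new method that recasts personality detection as a ranking task rather than a classification task and uses reinforcement learning to train a large language model to learn the relative ordering of candidate answers, so that it identifies personality traits in social media text more accurately.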

Source: arXiv: 2601.18582