📄 Abstract - Future Is Unevenly Distributed: Forecasting Ability of LLMs Depends on What We're Asking

Large Language Models (LLMs) demonstrate partial forecasting competence across social, political, and economic events. Yet their predictive ability varies sharply with domain structure and prompt framing. We investigate how forecasting performance varies across model families on real-world questions about events that occurred after the models' knowledge cutoff dates. We analyze how context, question type, and external knowledge affect accuracy and calibration, and how adding factual news context modifies belief formation and failure modes. Our results show that forecasting ability is highly variable: it depends on what, and how, we ask.
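Accuracy and calibration for probabilistic event forecasts are commonly scored with the Brier score and expected calibration error. The snippet below is a minimal illustrative sketch of those two metrics for binary questions, not the paper's evaluation code; `brier_score` and `expected_calibration_error` are names introduced here for illustration.

```python
# Illustrative scoring of probabilistic forecasts against resolved
# binary outcomes (not taken from the paper).
import numpy as np

def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    return float(np.mean((probs - outcomes) ** 2))

def expected_calibration_error(probs, outcomes, n_bins=10):
    """Bin forecasts by stated probability and compare the mean forecast
    to the empirical event frequency in each bin."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        # Last bin is closed on the right so probability 1.0 is counted.
        mask = (probs >= lo) & ((probs < hi) if hi < 1.0 else (probs <= hi))
        if mask.any():
            ece += mask.mean() * abs(probs[mask].mean() - outcomes[mask].mean())
    return float(ece)

# Example: five forecasts against resolved outcomes.
p = [0.9, 0.2, 0.7, 0.4, 0.85]
y = [1, 0, 1, 0, 1]
print(brier_score(p, y), expected_calibration_error(p, y, n_bins=5))
```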

Top-level tags: llm, model evaluation, agents
Detailed tags: event forecasting, prediction markets, calibration error, news context, failure modes

Evaluating the Real-World Event Forecasting Ability of Large Language Models / Future Is Unevenly Distributed: Forecasting Ability of LLMs Depends on What We're Asking


1️⃣ One-Sentence Summary

This study systematically evaluates the ability of large language models to forecast real-world events, finding that predictive performance varies sharply across domains and prompt framings, and revealing systematic failure modes that emerge once news context is introduced.


2️⃣ Key Innovations

1. Context-dependence analysis of forecasting ability across domains and prompt framings

2. A three-stage data filtering pipeline for the question set

3. A dual-condition evaluation framework comparing forecasts with and without factual news context (see the sketch after this list)

4. Identification and classification of failure modes
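To make the dual-condition framework concrete, here is a minimal sketch, assuming a hypothetical `ask_model` callable that sends a prompt to an LLM and returns a probability. `Question`, `dual_condition_eval`, and the prompt wording are all illustrative assumptions; the paper's actual prompts, models, and answer parsing are not specified here.

```python
# A minimal sketch (not the authors' code) of a dual-condition harness:
# each question is posed once bare and once with factual news context,
# so the two probability estimates can be compared per question.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Question:
    text: str          # e.g. "Will X happen before 2025-01-01?"
    news_context: str  # factual news snippets gathered for this question

def dual_condition_eval(q: Question, ask_model: Callable[[str], float]) -> Dict[str, float]:
    """Run one question under both conditions.

    `ask_model` is a hypothetical callable that sends a prompt to an LLM
    and parses the reply into a probability in [0, 1].
    """
    bare_prompt = f"{q.text}\nAnswer with a probability between 0 and 1."
    context_prompt = f"Relevant news:\n{q.news_context}\n\n{bare_prompt}"
    return {
        "without_context": ask_model(bare_prompt),
        "with_context": ask_model(context_prompt),
    }

# Hypothetical usage with a stub model that always answers 0.5:
q = Question(
    text="Will candidate A win the 2025 election?",
    news_context="Polls published this week show candidate A leading by 5 points.",
)
print(dual_condition_eval(q, ask_model=lambda prompt: 0.5))
```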


3️⃣ Main Results and Value

Result highlights

Forecasting performance varies sharply with domain structure and prompt framing; adding factual news context changes both how models form beliefs and how they fail.

Practical value


4️⃣ Glossary

Event forecasting: predicting whether a specified real-world event will occur by a resolution date, usually stated as a probability.

Prediction markets: platforms where traded prices on event outcomes act as crowd probability estimates; a common source of resolvable forecasting questions.

Calibration error: the gap between a model's stated probabilities and the empirical frequency with which those predictions come true.

News context: factual news snippets supplied in the prompt alongside a question.

Failure modes: recurring, systematic patterns of incorrect prediction or reasoning.

📄 Open original PDF