DataSTORM: Deep Research on Large-Scale Databases using Exploratory Data Analysis and Data Storytelling

📄 Abstract - DataSTORM: Deep Research on Large-Scale Databases using Exploratory Data Analysis and Data Storytelling

Deep research with Large Language Model (LLM) agents is emerging as a powerful paradigm for multi-step information discovery, synthesis, and analysis. However, existing approaches primarily focus on unstructured web data, while the challenges of conducting deep research over large-scale structured databases remain relatively underexplored. Unlike web-based research, effective data-centric research requires more than retrieval and summarization: it demands iterative hypothesis generation, quantitative reasoning over structured schemas, and convergence toward a coherent analytical narrative. In this paper, we present DataSTORM, an LLM-based agentic system capable of autonomously conducting research across both large-scale structured databases and internet sources. Grounded in principles from Exploratory Data Analysis and Data Storytelling, DataSTORM reframes deep research over structured data as a thesis-driven analytical process: discovering candidate theses from data, validating them through iterative cross-source investigation, and developing them into coherent analytical narratives. We evaluate DataSTORM on InsightBench, where it achieves a new state-of-the-art result with a 19.4% relative improvement in insight-level recall and a 7.2% relative improvement in summary-level score. We further introduce a new dataset built on ACLED, a real-world complex database, and demonstrate that DataSTORM outperforms proprietary systems such as ChatGPT Deep Research across both automated metrics and human evaluations.

1️⃣ One-Sentence Summary

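This paper presents DataSTORM, an AI agent system that works the way a researcher would: it autonomously identifies questions, validates hypotheses, and produces coherent data analysis reports from large-scale structured databases and the internet, significantly raising the level of automation in deep data research.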
