arXiv submission date: 2026-04-13
📄 Abstract - Temporal Flattening in LLM-Generated Text: Comparing Human and LLM Writing Trajectories

Large language models (LLMs) are increasingly used in daily applications, from content generation to code writing, where each interaction treats the model as stateless, generating responses independently without memory. Yet human writing is inherently longitudinal: authors' styles and cognitive states evolve across months and years. This raises a central question: can LLMs reproduce such temporal structure across extended time periods? We construct and publicly release a longitudinal dataset of 412 human authors and 6,086 documents spanning 2012–2024 across three domains (academic abstracts, blogs, news) and compare them to trajectories generated by three representative LLMs under standard and history-conditioned generation settings. Using drift and variance-based metrics over semantic, lexical, and cognitive-emotional representations, we find temporal flattening in LLM-generated text. LLMs produce greater lexical diversity but exhibit substantially reduced semantic and cognitive-emotional drift relative to humans. These differences are highly predictive: temporal variability patterns alone achieve 94% accuracy and 98% ROC-AUC in distinguishing human from LLM trajectories. Our results demonstrate that temporal flattening persists regardless of whether LLMs generate independently or with access to incremental history, revealing a fundamental property of current deployment paradigms. This gap has direct implications for applications requiring authentic temporal structure, such as synthetic training data and longitudinal text modeling.

Top-level tags: llm, natural language processing, model evaluation
Detailed tags: temporal analysis, text generation, human vs llm, writing trajectories, longitudinal dataset

Temporal Flattening in LLM-Generated Text: Comparing Human and LLM Writing Trajectories


1️⃣ One-sentence summary

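This paper compares human and AI writing trajectories over long time spans and finds that text generated by current large language models lacks the natural evolution over time seen in human writing, exhibiting a 'temporal flattening' effect; this pattern of temporal variation alone is enough to distinguish human-written from AI-generated content with very high accuracy.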

Source: arXiv: 2604.12097