📄 Abstract - DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

We introduce DeepSeek-V3.2, a model that harmonizes high computational efficiency with superior reasoning and agent performance. The key technical breakthroughs of DeepSeek-V3.2 are as follows: (1) DeepSeek Sparse Attention (DSA): We introduce DSA, an efficient attention mechanism that substantially reduces computational complexity while preserving model performance in long-context scenarios. (2) Scalable Reinforcement Learning Framework: By implementing a robust reinforcement learning protocol and scaling post-training compute, DeepSeek-V3.2 performs comparably to GPT-5. Notably, our high-compute variant, DeepSeek-V3.2-Speciale, surpasses GPT-5 and exhibits reasoning proficiency on par with Gemini-3.0-Pro, achieving gold-medal performance in both the 2025 International Mathematical Olympiad (IMO) and the International Olympiad in Informatics (IOI). (3) Large-Scale Agentic Task Synthesis Pipeline: To integrate reasoning into tool-use scenarios, we developed a novel synthesis pipeline that systematically generates training data at scale. This methodology facilitates scalable agentic post-training, yielding substantial improvements in generalization and instruction-following robustness within complex, interactive environments.
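The abstract does not specify how DeepSeek Sparse Attention (DSA) decides which tokens each query attends to. Purely as a point of reference, below is a minimal NumPy sketch of one common family of sparse attention, per-query top-k key selection; the function `topk_sparse_attention` and its `top_k` parameter are illustrative assumptions, not DeepSeek's actual DSA design, and this naive version still materializes the full score matrix, so it illustrates the sparsification idea rather than the claimed efficiency gains.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def topk_sparse_attention(q, k, v, top_k=64):
    """Generic top-k sparse attention sketch (NOT DeepSeek's DSA):
    each query attends only to its top_k highest-scoring keys, so the
    value aggregation costs O(top_k) instead of O(L_k) per query.

    q: (L_q, d), k: (L_k, d), v: (L_k, d_v)
    """
    L_k, d = k.shape
    top_k = min(top_k, L_k)
    scores = q @ k.T / np.sqrt(d)                      # (L_q, L_k) full scores
    # Threshold = the top_k-th largest score in each row.
    kth = L_k - top_k
    thresh = np.partition(scores, kth, axis=-1)[:, kth:kth + 1]
    masked = np.where(scores >= thresh, scores, -np.inf)  # drop non-selected keys
    weights = softmax(masked, axis=-1)                     # sparse attention weights
    return weights @ v

# Tiny usage example with random tensors.
rng = np.random.default_rng(0)
q = rng.normal(size=(8, 16))
k = rng.normal(size=(128, 16))
v = rng.normal(size=(128, 16))
out = topk_sparse_attention(q, k, v, top_k=32)
print(out.shape)  # (8, 16)
```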

Top-level tags: llm, model training, agents
Detailed tags: sparse attention, reinforcement learning, agent training, reasoning, large language model

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models


1️⃣ One-Sentence Summary

This paper introduces DeepSeek-V3.2, a model that combines an innovative sparse attention mechanism, a scalable reinforcement learning framework, and a large-scale agentic task synthesis pipeline to deliver reasoning and agentic capabilities on par with top closed-source models while maintaining high computational efficiency.


📄 Open the original PDF