📄 Abstract - SciEducator: Scientific Video Understanding and Educating via Deming-Cycle Multi-Agent System

Recent advancements in multimodal large language models (MLLMs) and video agent systems have significantly improved general video understanding. However, when applied to scientific video understanding and educating, a domain that demands external professional knowledge integration and rigorous step-wise reasoning, existing approaches often struggle. To bridge this gap, we propose SciEducator, the first iterative self-evolving multi-agent system for scientific video comprehension and education. Rooted in the classical Deming Cycle from management science, our design reformulates its Plan-Do-Study-Act philosophy into a self-evolving reasoning and feedback mechanism, which facilitates the interpretation of intricate scientific activities in videos. Moreover, SciEducator can produce multimodal educational content tailored to specific scientific processes, including textual instructions, visual guides, audio narrations, and interactive references. To support evaluation, we construct SciVBench, a benchmark consisting of 500 expert-verified and literature-grounded science QA pairs across five categories, covering physical, chemical, and everyday phenomena. Extensive experiments demonstrate that SciEducator substantially outperforms leading closed-source MLLMs (e.g., Gemini, GPT-4o) and state-of-the-art video agents on the benchmark, establishing a new paradigm for the community.
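The abstract describes the Plan-Do-Study-Act philosophy as a self-evolving reasoning and feedback mechanism but gives no implementation details. The following is only a minimal sketch of what such an iterative loop could look like, not the authors' system. Every name in it (`CycleState`, `pdsa_loop`, and the `toy_*` stand-ins for agents) is hypothetical, and the stopping rule (stop when the Study step reports no issues, or after `max_iters` rounds) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CycleState:
    """Hypothetical container for what the loop accumulates across rounds."""
    question: str
    answer: str = ""
    feedback: List[str] = field(default_factory=list)
    iterations: int = 0


def pdsa_loop(
    question: str,
    plan: Callable[[str, List[str]], List[str]],
    do: Callable[[List[str]], str],
    study: Callable[[str, str], List[str]],
    max_iters: int = 3,
) -> CycleState:
    """Run Plan -> Do -> Study -> Act until Study finds no issues (assumed rule)."""
    state = CycleState(question=question)
    for _ in range(max_iters):
        state.iterations += 1
        steps = plan(state.question, state.feedback)   # Plan: draft steps, using past feedback
        state.answer = do(steps)                       # Do: execute the steps
        issues = study(state.question, state.answer)   # Study: critique the result
        if not issues:                                 # Act: accept the answer and stop
            break
        state.feedback.extend(issues)                  # Act: feed issues into the next plan
    return state


# Toy stand-ins for agents (assumptions, purely illustrative).
def toy_plan(question: str, feedback: List[str]) -> List[str]:
    steps = ["describe the visible procedure"]
    if feedback:
        steps.append("consult external domain knowledge")
    return steps


def toy_do(steps: List[str]) -> str:
    return " -> ".join(steps)


def toy_study(question: str, answer: str) -> List[str]:
    return [] if "external domain knowledge" in answer else ["answer lacks grounding"]


result = pdsa_loop("Why does the solution change colour?", toy_plan, toy_do, toy_study)
print(result.iterations, result.answer)
```

In this toy run the first round is rejected by the Study step, the feedback changes the next plan, and the second round is accepted. In a real system the three callables would presumably wrap model or tool calls rather than string operations.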

Top-level tags: multi-agents multi-modal model evaluation
Detailed tags: scientific video understanding multimodal education deming cycle benchmark video comprehension

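📄 Paper Summary

SciEducator: Scientific Video Understanding and Educating via Deming-Cycle Multi-Agent System


1️⃣ One-Sentence Summary

This paper proposes SciEducator, a multi-agent system that uses a self-evolving mechanism based on the Deming Cycle to understand scientific videos in depth and automatically generate multimodal educational content. On a specialized science QA benchmark, it significantly outperforms existing state-of-the-art models.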

