SciDef: Automating Definition Extraction from Academic Literature with Large Language Models

📄 Abstract - SciDef: Automating Definition Extraction from Academic Literature with Large Language Models

Definitions are the foundation of any scientific work, but with a significant increase in publication numbers, gathering definitions relevant to a given keyword has become challenging. We therefore introduce SciDef, an LLM-based pipeline for automated definition extraction. We test SciDef on DefExtra and DefSim, novel datasets of human-extracted definitions and definition-pair similarity, respectively. Evaluating 16 language models across prompting strategies, we demonstrate that multi-step and DSPy-optimized prompting improve extraction performance. To evaluate extraction, we test various metrics and show that an NLI-based method yields the most reliable results. We show that LLMs are largely able to extract definitions from scientific literature (86.4% of definitions from our test set); yet future work should focus not just on finding definitions, but on identifying relevant ones, as models tend to over-generate them. Code and datasets are available at this https URL.


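1️⃣ One-Sentence Summary

This paper presents SciDef, an automated tool that uses large language models to efficiently extract definitions of key terms from large volumes of academic literature; its experiments show that multi-step prompting and prompt optimization significantly improve extraction accuracy, while also noting that models tend to over-extract definitions, so future work should pay more attention to filtering definitions for relevance.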
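The abstract does not spell out how the NLI-based metric is computed, so the following is only a minimal sketch under stated assumptions: an extracted definition counts as matching a human-extracted one when entailment holds in both directions above a threshold. The function names, the 0.5 threshold, and the bidirectional rule are all hypothetical, and `token_overlap` is a toy stand-in for a real NLI model's entailment probability. Reporting precision next to recall is one way to make the over-generation issue visible.

```python
from typing import Callable, List

# Signature assumed for an entailment scorer: (premise, hypothesis) -> score in [0, 1].
EntailFn = Callable[[str, str], float]


def token_overlap(premise: str, hypothesis: str) -> float:
    """Toy stand-in for an NLI model (assumption, not the paper's scorer).

    Returns the fraction of hypothesis tokens that also occur in the premise.
    """
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().split()
    if not hypothesis_tokens:
        return 0.0
    return sum(t in premise_tokens for t in hypothesis_tokens) / len(hypothesis_tokens)


def match_score(a: str, b: str, entail: EntailFn) -> float:
    """Bidirectional score: the weaker of the two entailment directions."""
    return min(entail(a, b), entail(b, a))


def extraction_recall(gold: List[str], predicted: List[str],
                      entail: EntailFn = token_overlap,
                      threshold: float = 0.5) -> float:
    """Share of gold definitions matched by at least one predicted definition."""
    if not gold:
        return 0.0
    found = sum(
        any(match_score(g, p, entail) >= threshold for p in predicted)
        for g in gold
    )
    return found / len(gold)


def extraction_precision(gold: List[str], predicted: List[str],
                         entail: EntailFn = token_overlap,
                         threshold: float = 0.5) -> float:
    """Share of predicted definitions that match some gold definition.

    A low value with high recall would indicate over-generation.
    """
    if not predicted:
        return 0.0
    correct = sum(
        any(match_score(g, p, entail) >= threshold for g in gold)
        for p in predicted
    )
    return correct / len(predicted)


if __name__ == "__main__":
    gold = ["a definition is a statement of meaning"]
    predicted = ["a definition is a statement of meaning",
                 "unrelated text about cats"]
    print(extraction_recall(gold, predicted))
    print(extraction_precision(gold, predicted))
```

In practice `token_overlap` would be replaced by a function that wraps an actual NLI classifier and returns its entailment probability; the rest of the sketch is independent of that choice.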
