arXiv submission date: 2026-05-18
📄 Abstract - Internalizing Tool Knowledge in Small Language Models via QLoRA Fine-Tuning

Large language models are increasingly used as planning components in agentic systems, but current tool-use pipelines often require full tool schemas to be included in every prompt, creating substantial token overhead and limiting the practicality of smaller models. This paper investigates whether tool-use knowledge can be internalized into small language models through parameter-efficient fine-tuning, enabling structured planning without explicit tool descriptions at inference time. Using AssetOpsBench as the primary benchmark, we fine-tune Gemma 4 E4B and Qwen3-4B with 8-bit QLoRA on approximately 1,700 tool-use examples spanning tool knowledge, question-to-plan mappings, and execution-style traces. We evaluate the resulting models under description-free inference, where the prompt omits the tool catalog entirely. The fine-tuned models outperform an informed unfine-tuned baseline that receives full tool descriptions, reducing input length by 82.6% while improving structural and LLM-judge planning scores. In the best Gemma run, the model achieves an AT-F1 of 0.65 and an overall judge score of 3.88, compared with 0.47 and 2.88 for the informed baseline. Qwen3-4B achieves a strong overall judge score of 3.78 while using 62% less memory and running 2.5× faster than Gemma, though it also exhibits greater catastrophic forgetting on general multiple-choice benchmarks. Additional ablations show that LoRA rank controls a quality-retention trade-off, with r = 32 maximizing planning quality and smaller ranks preserving more general knowledge. These results suggest that, for fixed tool catalogs, QLoRA fine-tuning can shift tool knowledge from prompt context into model weights, substantially reducing inference overhead while maintaining or improving tool-planning quality.

Top-level tags: llm, model training
Detailed tags: tool learning, qlora, parameter-efficient fine-tuning, planning, small language model

Internalizing Tool Knowledge in Small Language Models via QLoRA Fine-Tuning


1️⃣ One-sentence summary

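This study uses QLoRA fine-tuning to let small language models retain and apply tool knowledge without receiving full tool descriptions in the prompt, shortening prompt length by 82.6% while improving planning performance and demonstrating that tool knowledge can feasibly be moved from prompt text into model parameters.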

Source: arXiv: 2605.17774