Prior Knowledge or Search? A Study of LLM Agents in Hardware-Aware Code Optimization
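1️⃣ One-Sentence Summary
Through three controlled experiments, this paper finds that large language models in code optimization tasks rely mainly on prior knowledge acquired during pretraining, rather than on feedback supplied in the optimization loop or on agentic structure, and that their performance drops significantly for uncommon parameters or low-density programming languages.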
LLM-based discovery and optimization systems are increasingly applied across domains, implementing a common propose-evaluate-revise loop. Optimization or discovery in such systems progresses by conditioning the model's context on feedback received from an environment. However, as modern LLM agents grow more complex in structure, it becomes difficult to evaluate which components contribute the most, and when and how exploration may fail. We answer these questions through three controlled experiments. Our findings: (1) In pure black-box optimization, LLMs act as greedy optimizers. (2) In zero-shot kernel generation, providing explicit input-size information has no measurable effect: models converge to the same kernel parameters regardless of size or temperature, as though the size instruction were invisible. Moreover, when models are tasked with optimizing kernels for uncommon kernel sizes, performance degrades sharply regardless of the language used. (3) In feedback-loop kernel optimization, CUDA improves monotonically under iterative feedback, while TVM IR actively degrades, which demonstrates that kernel optimization degrades when models operate in a low-density language. We conclude that LLMs in code optimization tasks depend heavily on pretrained priors rather than on provided feedback or agentic structure.
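The propose-evaluate-revise loop mentioned in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: `greedy_proposer`, `optimize`, and `toy_objective` are hypothetical names, and the proposer is a random perturbation standing in for an LLM call, chosen to mimic the greedy behavior the abstract describes.

```python
import random
from typing import Callable, List, Tuple

# (candidate, score) pairs that are fed back to the proposer as context
History = List[Tuple[float, float]]


def greedy_proposer(history: History, rng: random.Random) -> float:
    """Hypothetical stand-in for an LLM call: perturb the best candidate so far."""
    if not history:
        return rng.uniform(-10.0, 10.0)
    best_candidate, _ = min(history, key=lambda pair: pair[1])
    return best_candidate + rng.gauss(0.0, 0.5)


def optimize(
    objective: Callable[[float], float],
    propose: Callable[[History, random.Random], float],
    steps: int = 50,
    seed: int = 0,
) -> Tuple[float, float, History]:
    """Run a propose-evaluate-revise loop; return best candidate, its score, and the history."""
    rng = random.Random(seed)
    history: History = []
    for _ in range(steps):
        candidate = propose(history, rng)   # propose, conditioned on past feedback
        score = objective(candidate)        # evaluate in the environment
        history.append((candidate, score))  # revise: feedback enters the context
    best_candidate, best_score = min(history, key=lambda pair: pair[1])
    return best_candidate, best_score, history


def toy_objective(x: float) -> float:
    """Toy black-box objective with its minimum at x = 3 (lower is better)."""
    return (x - 3.0) ** 2
```

Calling `optimize(toy_objective, greedy_proposer)` runs the loop on the toy objective. In a real system, the proposer would be a model call whose prompt includes the history, and the objective would be a measured quantity such as kernel runtime.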
Source: arXiv: 2605.19782