
arXiv submission date: 2026-03-18
📄 Abstract - scicode-lint: Detecting Methodology Bugs in Scientific Python Code with LLM-Generated Patterns

Methodology bugs in scientific Python code produce plausible but incorrect results that traditional linters and static analysis tools cannot detect. Several research groups have built ML-specific linters, demonstrating that detection is feasible. Yet these tools share a sustainability problem: dependency on specific pylint or Python versions, limited packaging, and reliance on manual engineering for every new pattern. As AI-generated code increases the volume of scientific software, the need for automated methodology checking (such as detecting data leakage, incorrect cross-validation, and missing random seeds) grows. We present scicode-lint, whose two-tier architecture separates pattern design (frontier models at build time) from execution (small local model at runtime). Patterns are generated, not hand-coded; adapting to new library versions costs tokens, not engineering hours. On Kaggle notebooks with human-labeled ground truth, preprocessing leakage detection reaches 65% precision at 100% recall; on 38 published scientific papers applying AI/ML, precision is 62% (LLM-judged) with substantial variation across pattern categories; on a held-out paper set, precision is 54%. On controlled tests, scicode-lint achieves 97.7% accuracy across 66 patterns.
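To make the target concrete, here is a minimal sketch (not taken from the paper's pattern set) of the classic preprocessing-leakage bug the abstract mentions: fitting a scaler on the full dataset before the train/test split lets test-set statistics leak into training, producing plausible but optimistic results that a conventional linter would never flag. The variable names and the scikit-learn workflow are illustrative assumptions, not scicode-lint's own code.

```python
# Illustrative sketch of a preprocessing-leakage bug (an assumption for
# exposition, not code from the scicode-lint paper).
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

X = np.random.RandomState(0).normal(size=(100, 3))
y = (X[:, 0] > 0).astype(int)

# BUGGY: the scaler's mean/std are computed over ALL rows, including
# the rows that will later become the test set -- information leaks.
X_leaky = StandardScaler().fit_transform(X)
X_tr_bad, X_te_bad, _, _ = train_test_split(X_leaky, y, random_state=0)

# CORRECT: split first, then fit the scaler on the training rows only
# and apply the learned transform to the held-out rows.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr_scaled = scaler.transform(X_tr)
X_te_scaled = scaler.transform(X_te)
```

Both versions run without error and produce similarly shaped arrays, which is exactly why such methodology bugs evade syntax-level static analysis and motivate pattern-based semantic checking.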

Top-level tags: llm, model evaluation, systems
Detailed tags: code linting, methodology bugs, scientific software, static analysis, ai-generated code

scicode-lint: Detecting Methodology Bugs in Scientific Python Code with LLM-Generated Patterns


1️⃣ One-sentence summary

This paper introduces a new tool named scicode-lint, which uses large language models to automatically generate detection rules. It efficiently finds "methodology bugs" in scientific Python code that look plausible but are actually wrong, such as data leakage or cross-validation errors, addressing both the difficulty traditional tools have in detecting these bugs and the high cost of maintaining detection rules by hand.

Source: arXiv: 2603.17893