Leveraging BART to Assess CS1 C++ Programming Assignments using Rubric-based Criteria

Abstract

This paper investigates rubric-aware, multitask fine-tuning of transformer models for automated grading of introductory C++ programming assignments, with the goal of producing grade predictions that better reflect instructor grading behavior than general-purpose LLMs. Using multi-semester CS1 data, student submissions are paired with numeric scores, letter-grade buckets, and assignment rubrics, then preprocessed into unified sequences for transformer input. A BART encoder-decoder with LoRA adaptation is trained to jointly predict numeric grades and grade buckets, augmented with a distribution-matching term to align predicted and empirical grade distributions, an evaluation dimension often overlooked in prior work. Experiments compare single-task and multitask training, hard one-hot versus fuzzy and boundary-based soft labels, and rubric versus no-rubric conditions, with additional T5 and pairwise-pretrained variants. Results show that multitask BART with boundary-based soft labels and rubric context achieves lower mean absolute error and stronger grade-distribution alignment than single-task, hard-label, or code-only baselines. Fully fine-tuned T5 further improves distributional fidelity, while pairwise pretraining reduces numeric error at the cost of minority-class sensitivity. Collectively, the findings suggest that calibration-aware, rubric-guided training produces more instructor-like grading behavior than accuracy-optimized alternatives.
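The abstract does not spell out how the boundary-based soft labels or the distribution-matching term are defined, so the following is only a minimal sketch under stated assumptions. It assumes hypothetical letter-grade cutoffs at 60/70/80/90, a hypothetical boundary width of 2 points within which label mass is shared linearly with the adjacent bucket, and a KL divergence between the empirical bucket distribution and the batch-averaged predicted distribution as the distribution-matching term. The names `CUTOFFS`, `boundary_soft_label`, `multitask_loss`, `w_cls`, and `w_dist` are illustrative and do not come from the paper.

```python
import math
from bisect import bisect_right

# Hypothetical cutoffs: F < 60 <= D < 70 <= C < 80 <= B < 90 <= A.
CUTOFFS = [60.0, 70.0, 80.0, 90.0]


def boundary_soft_label(score, cutoffs=CUTOFFS, width=2.0):
    """Soft label over len(cutoffs) + 1 buckets (assumed scheme).

    Scores at least `width` points away from every cutoff get a one-hot
    label. Closer scores share mass with the bucket across the nearest
    cutoff, reaching an even 0.5 / 0.5 split exactly at the cutoff.
    """
    label = [0.0] * (len(cutoffs) + 1)
    bucket = bisect_right(cutoffs, score)  # index of the hard bucket
    nearest = min(range(len(cutoffs)), key=lambda i: abs(score - cutoffs[i]))
    dist = abs(score - cutoffs[nearest])
    if dist >= width:
        label[bucket] = 1.0
        return label
    # Cutoff i separates bucket i (below) from bucket i + 1 (at or above).
    neighbor = nearest if bucket == nearest + 1 else nearest + 1
    own = 0.5 + 0.5 * dist / width
    label[bucket] = own
    label[neighbor] = 1.0 - own
    return label


def multitask_loss(pred_scores, pred_probs, true_scores, soft_labels,
                   empirical_dist, w_cls=1.0, w_dist=0.5, eps=1e-9):
    """MAE + soft-label cross-entropy + KL distribution-matching term.

    The weights and the choice of KL divergence are assumptions made
    for illustration, not values or definitions taken from the paper.
    """
    n = len(true_scores)
    # Regression head: mean absolute error on numeric grades.
    mae = sum(abs(p - t) for p, t in zip(pred_scores, true_scores)) / n
    # Classification head: cross-entropy against soft targets.
    ce = -sum(
        t * math.log(q + eps)
        for probs, target in zip(pred_probs, soft_labels)
        for q, t in zip(probs, target)
    ) / n
    # Distribution matching: KL(empirical || batch-mean prediction).
    k = len(empirical_dist)
    mean_pred = [sum(probs[j] for probs in pred_probs) / n for j in range(k)]
    kl = sum(
        e * math.log((e + eps) / (m + eps))
        for e, m in zip(empirical_dist, mean_pred)
    )
    return mae + w_cls * ce + w_dist * kl
```

Under these assumptions, a score of 89 keeps most of its mass on the B bucket and shares the rest with A, while a score of 75 stays one-hot on C. In an actual training setup the same terms would be computed on tensors with an autodiff framework; plain Python is used here only to keep the arithmetic visible.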


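1️⃣ One-Sentence Summary

This paper proposes a rubric-aware fine-tuning method for BART that uses multitask learning to predict numeric scores and grade buckets jointly, while also optimizing how closely the predicted grade distribution matches the empirical one. As a result, automated grades track instructors' manual grading behavior more closely and significantly outperform traditional single-task or code-only grading approaches.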
