Abstract - SCURank: Ranking Multiple Candidate Summaries with Summary Content Units for Enhanced Summarization
Small language models (SLMs), such as BART, can achieve summarization performance comparable to large language models (LLMs) via distillation. However, existing LLM-based ranking strategies for summary candidates suffer from instability, while classical metrics (e.g., ROUGE) are insufficient to rank high-quality summaries. To address these issues, we introduce SCURank, a framework that enhances summarization by leveraging Summary Content Units (SCUs). Instead of relying on unstable comparisons or surface-level overlap, SCURank evaluates summaries based on the richness and semantic importance of their information content. We investigate the effectiveness of SCURank in distilling summaries from multiple diverse LLMs. Experimental results demonstrate that SCURank outperforms traditional metrics and LLM-based ranking methods across evaluation measures and datasets. Furthermore, our findings show that incorporating diverse LLM summaries enhances model abstractiveness and overall distilled model performance, validating the benefits of information-centric ranking in multi-LLM distillation. The code for SCURank is available at this https URL.
SCURank: Ranking Multiple Candidate Summaries with Summary Content Units for Enhanced Summarization
1️⃣ One-sentence summary
This paper proposes a new framework called SCURank, which ranks candidate summaries by the richness and semantic importance of the "information content units" (SCUs) they contain, replacing surface-level metrics (such as ROUGE) and unstable LLM-based scoring. This yields a more reliable selection of high-quality summaries, which are then used to train small language models that achieve better performance across multiple datasets.
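The ranking idea above can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation: it assumes each SCU comes with an importance weight, and it approximates "the summary contains this SCU" with simple keyword matching, where a real system would use an NLI/entailment model.

```python
# Hypothetical sketch of SCU-based candidate ranking (not the SCURank code).
# Assumptions: SCUs are (text, importance_weight) pairs, and SCU containment
# is approximated by keyword matching instead of an entailment model.

def scu_score(summary, scus):
    """Weighted fraction of total SCU importance covered by the summary."""
    total = sum(weight for _, weight in scus)
    covered = sum(
        weight
        for unit, weight in scus
        # crude containment check: all SCU tokens appear in the summary
        if all(tok in summary.lower() for tok in unit.lower().split())
    )
    return covered / total if total else 0.0

def rank_candidates(candidates, scus):
    """Order candidate summaries by descending SCU coverage score."""
    return sorted(candidates, key=lambda s: scu_score(s, scus), reverse=True)

# Toy example: the second candidate covers both SCUs, so it ranks first.
scus = [("dog rescued river", 0.9), ("firefighters arrived", 0.6)]
candidates = [
    "Firefighters arrived quickly at the scene.",
    "A dog was rescued from the river after firefighters arrived.",
]
best = rank_candidates(candidates, scus)[0]
print(best)  # → "A dog was rescued from the river after firefighters arrived."
```

In a multi-LLM distillation setting, the top-ranked candidate for each document would then serve as the training target for the small model.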