How Far Has AI Come in Liver Fibrosis Staging? A Large-Scale Real-World Dataset and Benchmark

📄 Abstract - How Far Has AI Come in Liver Fibrosis Staging? A Large-Scale Real-World Dataset and Benchmark

Despite years of methodological progress, how far AI has come in liver fibrosis staging has never been systematically evaluated under the heterogeneous, multi-center conditions that define clinical practice. To address this gap, we introduce LiFS, a large-scale dataset and benchmark derived from the MICCAI 2025 CARE-Liver challenge, comprising 610 patients across multiple centers and scanners with multi-sequence MRI. To the best of our knowledge, LiFS is the first benchmark providing complete gadoxetic acid-enhanced sequences with histopathology-confirmed annotations from diverse real-world scanners. Through systematic evaluation of 9 independently developed methods selected from 96 registered teams against in-cohort radiologist reference results, our findings address how far current AI has progressed toward clinical-level liver fibrosis staging from three complementary perspectives. First, against radiologists, the best AI methods were broadly comparable to the senior radiologist and significantly exceeded the junior radiologist in selected settings, while median AI performance generally approached junior-radiologist levels. Second, from a data perspective, cross-center heterogeneity, label imbalance, and contrast-enhanced sequence variability emerge as the dominant challenges for AI methods. Third, from a technical perspective, methodological design choices, including spatial registration, input dimensionality, multi-modal fusion strategy, and backbone architecture, appear to modulate cross-center robustness, although no single choice alone closes the gap. Overall, LiFS provides a rigorous real-world benchmark for positioning the current state of AI in liver fibrosis staging and for enabling future research on the key challenges that limit clinically reliable deployment.

AI在肝纤维化分期中走到了哪一步？一个大规模真实世界数据集与基准 / How Far Has AI Come in Liver Fibrosis Staging? A Large-Scale Real-World Dataset and Benchmark

1️⃣ 一句话总结

本研究构建了首个包含多中心、多设备MRI影像及病理金标准的大规模肝纤维化分期数据集LiFS，系统评估了9种顶尖AI方法的性能，发现最佳AI模型已接近资深放射科医生水平，但跨中心数据差异、标签不均衡和序列多样性仍是制约其临床可靠部署的核心挑战。

← 返回列表

菜单

AI 帮我研读全文

1️⃣ 一句话总结

密码管理

设置密码

修改密码

移除密码

菜单

AI 帮我研读全文

1️⃣ 一句话总结

获取最新论文摘要