arXiv submission date: 2026-05-12
📄 Abstract - CRAFT: Clinical Reward-Aligned Finetuning for Medical Image Synthesis

Foundation diffusion models can generate photorealistic natural images, but adapting them to medical imaging remains challenging. In medical adaptation, limited labeled data can exacerbate hallucination-like and clinically implausible synthesis, while existing metrics such as FID or Inception Score do not quantify per-image alignment with pathology-relevant criteria. We introduce the Clinical Alignment Score (CAS), a foundation-model-based proxy for clinical alignment that evaluates generated images along four complementary dimensions beyond visual fidelity. Building on CAS, we propose Clinical Reward-Aligned Finetuning (CRAFT), a reward-based adaptation framework that transfers medical knowledge from multimodal large language models and vision-language models through label-conditioned prompt enrichment, clinical checklists, and differentiable reward optimization. Across four diverse modalities, CRAFT improves CAS and downstream classification performance over strong adaptation baselines. Beyond average CAS gains, CRAFT reduces the empirical low-alignment tail below a real-image reference threshold by 5.5-34.7 percentage points relative to the strongest baseline, corresponding to a 20.4% average relative reduction across datasets. These results indicate fewer hallucination-like generations under CAS, and are corroborated by evaluation with out-of-family evaluators, structured checklist auditing, memorization analysis, and a blinded physician preference study on CheXpert.
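
The abstract reports the tail reduction in two ways, as an absolute difference in percentage points and as a relative reduction. The sketch below illustrates how the two relate. It is not the authors' evaluation code: it assumes CAS is a single scalar per image and that the reference threshold is a low percentile of real-image scores, neither of which the abstract specifies. The helper names (`percentile`, `tail_fraction`, `reductions`), the 25th-percentile choice, and all scores are hypothetical.

```python
def percentile(values, q):
    """Linear-interpolation percentile of `values`, with q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def tail_fraction(scores, threshold):
    """Fraction of images whose score falls strictly below the threshold."""
    return sum(s < threshold for s in scores) / len(scores)


def reductions(baseline_tail, method_tail):
    """Return (percentage-point reduction, relative reduction in %)."""
    diff = baseline_tail - method_tail
    return diff * 100, diff / baseline_tail * 100


# Toy scores, invented purely for illustration.
real_scores = [0.6, 0.7, 0.8, 0.9, 1.0]
baseline_scores = [0.5, 0.55, 0.65, 0.75, 0.8, 0.85, 0.9, 0.95, 0.6, 0.72]
method_scores = [0.5, 0.65, 0.71, 0.75, 0.8, 0.85, 0.9, 0.95, 0.78, 0.72]

# Assumption: the reference threshold is the 25th percentile of real-image scores.
threshold = percentile(real_scores, 25)

baseline_tail = tail_fraction(baseline_scores, threshold)
method_tail = tail_fraction(method_scores, threshold)
pp, rel = reductions(baseline_tail, method_tail)
print(f"tail: {baseline_tail:.0%} -> {method_tail:.0%} "
      f"({pp:.1f} percentage points, {rel:.1f}% relative)")
```

With these toy numbers the tail shrinks from 4 of 10 images to 2 of 10, so a modest percentage-point gap can correspond to a much larger relative reduction when the baseline tail is small.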

Top-level tags: medical, multi-modal, model training
Detailed tags: diffusion models, reward finetuning, clinical alignment, medical image synthesis, hallucination reduction

CRAFT: Clinical Reward-Aligned Finetuning for Medical Image Synthesis


1️⃣ One-sentence summary

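This paper proposes CRAFT, a finetuning framework that introduces the Clinical Alignment Score (CAS) and draws on knowledge from multimodal large models; guided by reward optimization, it markedly reduces hallucinations and implausible content in generated medical images and improves how well those images align with pathology-relevant criteria.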

Source: arXiv: 2605.12650