Abstract - UniX: Unifying Autoregression and Diffusion for Chest X-Ray Understanding and Generation
Despite recent progress, medical foundation models still struggle to unify visual understanding and generation, as these tasks have inherently conflicting goals: semantic abstraction versus pixel-level reconstruction. Existing approaches, typically based on parameter-shared autoregressive architectures, frequently lead to compromised performance in one or both tasks. To address this, we present UniX, a next-generation unified medical foundation model for chest X-ray understanding and generation. UniX decouples the two tasks into an autoregressive branch for understanding and a diffusion branch for high-fidelity generation. Crucially, a cross-modal self-attention mechanism is introduced to dynamically guide the generation process with understanding features. Coupled with a rigorous data cleaning pipeline and a multi-stage training strategy, this architecture enables synergistic collaboration between tasks while leveraging the strengths of diffusion models for superior generation. On two representative benchmarks, UniX achieves a 46.1% improvement in understanding performance (Micro-F1) and a 24.2% gain in generation quality (FD-RadDino), using only a quarter of the parameters of LLM-CXR. By achieving performance on par with task-specific models, our work establishes a scalable paradigm for synergistic medical image understanding and generation. Code and models are available online.
UniX: Unifying Autoregression and Diffusion for Chest X-Ray Understanding and Generation
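1️⃣ One-Sentence Summary
This paper proposes UniX, a new medical foundation model that assigns image understanding to an autoregressive branch and image generation to a diffusion branch, and lets the two cooperate through a cross-modal attention mechanism. With fewer parameters, it achieves both high-quality understanding and high-quality generation of chest X-rays, performing on par with specialized single-task models.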
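
The abstract does not spell out how the cross-modal self-attention is wired, so the following is only a minimal NumPy sketch of one plausible reading: tokens from the understanding branch and tokens from the generation branch are concatenated into one sequence, and the generation tokens attend over that joint sequence. The function name `cross_modal_self_attention`, the single-head layout, the token counts, and the random projection matrices are all assumptions made for illustration; they are not taken from the UniX code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_self_attention(und_tokens, gen_tokens, w_q, w_k, w_v):
    """Hypothetical single-head attention over [understanding; generation] tokens.

    Queries come from the generation tokens only; keys and values come from
    the joint sequence, so every generation token can read understanding
    features as well as the other generation tokens.
    """
    seq = np.concatenate([und_tokens, gen_tokens], axis=0)  # (n_u + n_g, d)
    q = gen_tokens @ w_q                                    # (n_g, d)
    k = seq @ w_k                                           # (n_u + n_g, d)
    v = seq @ w_v                                           # (n_u + n_g, d)
    scores = q @ k.T / np.sqrt(k.shape[-1])                 # scaled dot product
    weights = softmax(scores, axis=-1)                      # rows sum to 1
    return weights @ v, weights

# Toy inputs (assumed sizes): 5 understanding tokens, 8 generation tokens, width 16.
rng = np.random.default_rng(0)
d = 16
und = rng.normal(size=(5, d))
gen = rng.normal(size=(8, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

out, attn = cross_modal_self_attention(und, gen, w_q, w_k, w_v)
print(out.shape, attn.shape)  # (8, 16) (8, 13)

# Fraction of each generation token's attention that lands on understanding tokens.
und_share = attn[:, :und.shape[0]].sum(axis=1)
```

In this sketch, `und_share` gives a rough sense of how strongly each generation token is guided by understanding features. A real model would use multiple heads, learned projections, and the diffusion timestep as additional conditioning, none of which is shown here.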