arXiv submission date: 2026-01-24

📄 Abstract - C-RADIOv4 (Tech Report)

By leveraging multi-teacher distillation, agglomerative vision backbones provide a unified student model that retains and improves the distinct capabilities of multiple teachers. In this tech report, we describe the most recent release of the C-RADIO family of models, C-RADIOv4, which builds upon AM-RADIO/RADIOv2.5 in design, offering strong improvements on key downstream tasks at the same computational complexity. We release -SO400M (412M params) and -H (631M) model variants, both trained with an updated set of teachers: SigLIP2, DINOv3, and SAM3. In addition to improvements on core metrics and new capabilities from imitating SAM3, the C-RADIOv4 model family further improves any-resolution support, brings back the ViTDet option for drastically enhanced efficiency at high resolution, and comes with a permissive license.

Top tags: computer vision, model training, multi-modal

C-RADIOv4 (Tech Report)

1️⃣ One-sentence summary

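This tech report introduces the C-RADIOv4 model, which combines the strengths of several state-of-the-art teacher models to substantially improve performance on a range of vision tasks (such as image understanding and segmentation) at unchanged computational cost, and improves practical features such as any-resolution support and efficient high-resolution processing.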

Source: arXiv: 2601.17237
