arXiv submission date: 2026-02-10
📄 Abstract - Singpath-VL Technical Report

We present Singpath-VL, a large vision-language model that addresses the lack of an AI assistant for cervical cytology. Recent advances in multi-modal large language models (MLLMs) have significantly propelled the field of computational pathology. However, their application in cytopathology, particularly cervical cytology, remains underexplored, primarily due to the scarcity of large-scale, high-quality annotated datasets. To bridge this gap, we first develop a novel three-stage pipeline to synthesize a million-scale image-description dataset. The pipeline uses multiple general-purpose MLLMs as weak annotators, refines their outputs through consensus fusion and expert knowledge injection, and produces high-fidelity descriptions of cell morphology. Using this dataset, we then fine-tune the Qwen3-VL-4B model via a multi-stage strategy to create a specialized cytopathology MLLM. The resulting model, named Singpath-VL, demonstrates superior performance in fine-grained morphological perception and cell-level diagnostic classification. To advance the field, we will open-source a portion of the synthetic dataset and benchmark.
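
The abstract names the pipeline's ingredients (weak annotators, consensus fusion, expert knowledge injection) but does not specify how they are implemented. The toy sketch below illustrates one plausible reading, under loud assumptions: each weak annotator is assumed to emit attribute/value pairs for a cell image, fusion is assumed to be a simple majority vote, and expert knowledge is assumed to take the form of if-then rules. The function names, the attribute names, the `min_agreement` threshold, and the rule format are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def fuse_by_consensus(annotations, min_agreement=2):
    """Keep, for each attribute, the value that enough annotators agree on.

    `annotations` is a list of dicts, one per (hypothetical) weak annotator.
    Attributes whose top value has fewer than `min_agreement` votes are dropped.
    On a tie, the value seen first wins (Counter.most_common ordering).
    """
    fused = {}
    attributes = {attr for ann in annotations for attr in ann}
    for attr in sorted(attributes):
        votes = Counter(ann[attr] for ann in annotations if attr in ann)
        value, count = votes.most_common(1)[0]
        if count >= min_agreement:
            fused[attr] = value
    return fused


def inject_expert_knowledge(fused, expert_rules):
    """Apply hypothetical expert rules of the form (condition, additions).

    When every key/value pair in `condition` matches the fused attributes,
    the pairs in `additions` are merged into the result.
    """
    result = dict(fused)
    for condition, additions in expert_rules:
        if all(result.get(key) == value for key, value in condition.items()):
            result.update(additions)
    return result


def render_description(attrs):
    """Turn the attribute dict into a single description string."""
    return "; ".join(
        f"{key.replace('_', ' ')}: {value}" for key, value in sorted(attrs.items())
    )


# Illustrative outputs from three made-up weak annotators for one image.
annotations = [
    {"nuclear_size": "enlarged", "chromatin": "coarse"},
    {"nuclear_size": "enlarged", "chromatin": "fine"},
    {"nuclear_size": "enlarged", "chromatin": "coarse", "nucleoli": "prominent"},
]

# Illustrative rule only; it carries no clinical meaning.
expert_rules = [
    (
        {"nuclear_size": "enlarged", "chromatin": "coarse"},
        {"note": "flag for expert review"},
    ),
]

fused = fuse_by_consensus(annotations)
enriched = inject_expert_knowledge(fused, expert_rules)
description = render_description(enriched)
print(description)
```

In this sketch, the `nucleoli` attribute is discarded because only one annotator reported it, which is the kind of noise a consensus step is meant to filter. The actual pipeline may fuse free-text descriptions rather than structured attributes.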

Top-level tags: medical multi-modal model training
Detailed tags: computational pathology, vision-language model, dataset synthesis, fine-tuning, cervical cytology

Singpath-VL Technical Report


1️⃣ One-sentence summary

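This paper presents Singpath-VL, an AI assistant specialized for cervical cytopathology analysis, which is trained on a large-scale synthetic dataset generated by a novel pipeline and performs strongly on cell morphology recognition and diagnostic classification.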

Source: arXiv: 2602.09523