arXiv submission date: 2026-03-18
📄 Abstract - Harnessing the Power of Foundation Models for Accurate Material Classification

Material classification has emerged as a critical task in computer vision and graphics, supporting the assignment of accurate material properties to a wide range of digital and real-world applications. While traditionally framed as an image classification task, this domain faces significant challenges due to the scarcity of annotated data, which limits the accuracy and generalizability of trained models. Recent advances in vision-language foundation models (VLMs) offer promising avenues to address these issues, yet existing solutions leveraging these models still yield unsatisfactory results in material recognition tasks. In this work, we propose a novel framework that effectively harnesses foundation models to overcome data limitations and enhance classification accuracy. Our method integrates two key innovations: (a) a robust image generation and auto-labeling pipeline that creates a diverse and high-quality training dataset of material-centric images, and automatically assigns labels by fusing object semantics and material attributes in text prompts; (b) a prior incorporation strategy to distill information from VLMs, combined with a joint fine-tuning method that optimizes a pre-trained vision foundation model alongside VLM-derived priors, preserving broad generalizability while adapting to the material domain. Experiments demonstrate significant improvements on multiple datasets. We show that our synthetic dataset effectively captures the characteristics of real-world materials, and that the integration of priors from vision-language models significantly enhances the final performance. The source code and dataset will be released.
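To make the auto-labeling idea in (a) more concrete, here is a minimal sketch of how object semantics and material attributes might be fused into text prompts, with the material term reused as the label. The vocabularies, the prompt template, and the function name `build_labeled_prompts` are all assumptions for illustration; the abstract does not specify the paper's actual prompts or pipeline.

```python
from itertools import product

# Hypothetical vocabularies: the paper's real object and material lists
# are not given in the abstract.
OBJECTS = ["chair", "mug", "jacket"]
MATERIALS = ["wood", "ceramic", "leather"]

# Hypothetical prompt template for a text-to-image generator.
TEMPLATE = "a close-up photo of a {obj} made of {mat}"


def build_labeled_prompts(objects, materials, template=TEMPLATE):
    """Pair every object with every material.

    The material word is both part of the prompt and the training label,
    so each generated image would come with its label for free.
    """
    return [
        {"prompt": template.format(obj=obj, mat=mat), "label": mat}
        for obj, mat in product(objects, materials)
    ]


samples = build_labeled_prompts(OBJECTS, MATERIALS)
```

Each entry in `samples` could then be passed to an image generator of choice, with the `label` field stored alongside the resulting image. Whether the paper filters implausible object-material pairs is not stated, so this sketch keeps all combinations.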

Top-level tags: computer vision, model training, multi-modal
Detailed tags: material classification, vision-language models, synthetic data generation, auto-labeling, fine-tuning

Harnessing the Power of Foundation Models for Accurate Material Classification


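1️⃣ One-sentence summary

This paper proposes a new method that automatically generates high-quality training data and incorporates prior knowledge from vision-language models, addressing the data scarcity problem in material classification and significantly improving classification accuracy and generalization.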

Source: arXiv: 2603.17390