arXiv submission date: 2026-03-05
📄 Abstract - NaiLIA: Multimodal Nail Design Retrieval Based on Dense Intent Descriptions and Palette Queries

We focus on the task of retrieving nail design images based on dense intent descriptions, which represent multi-layered user intent for nail designs. This is challenging because such descriptions specify unconstrained painted elements and pre-manufactured embellishments as well as visual characteristics, themes, and overall impressions. In addition to these descriptions, we assume that users provide palette queries by specifying zero or more colors via a color picker, enabling the expression of subtle and continuous color nuances. Existing vision-language foundation models often struggle to incorporate such descriptions and palettes. To address this, we propose NaiLIA, a multimodal retrieval method for nail design images, which comprehensively aligns with dense intent descriptions and palette queries during retrieval. Our approach introduces a relaxed loss based on confidence scores for unlabeled images that can align with the descriptions. To evaluate NaiLIA, we constructed a benchmark consisting of 10,625 images collected from people with diverse cultural backgrounds. The images were annotated with long and dense intent descriptions given by over 200 annotators. Experimental results demonstrate that NaiLIA outperforms standard methods.
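
The abstract does not spell out the relaxed loss, so the following is only a minimal sketch of one way a confidence-based relaxation could work, not the authors' formulation. It assumes an InfoNCE-style objective for a single description, where unlabeled images receive a soft target proportional to a confidence score instead of being treated as hard negatives. The function name `relaxed_contrastive_loss`, the `confidences` input, and the temperature value are all hypothetical.

```python
import math


def relaxed_contrastive_loss(sims, positive_idx, confidences, temperature=0.07):
    """Soft-target contrastive loss for one query (illustrative assumption).

    sims:         similarity between the query and each candidate image.
    positive_idx: index of the labeled positive image.
    confidences:  hypothetical scores in [0, 1] estimating how likely each
                  unlabeled image also matches the description.
    """
    # Full weight on the labeled positive, confidence weight on the rest.
    weights = [1.0 if i == positive_idx else c for i, c in enumerate(confidences)]
    total = sum(weights)
    targets = [w / total for w in weights]

    # Numerically stable log-softmax over temperature-scaled similarities.
    logits = [s / temperature for s in sims]
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
    log_probs = [l - log_z for l in logits]

    # Cross-entropy between the soft targets and the predicted distribution.
    return -sum(t * lp for t, lp in zip(targets, log_probs))
```

With all confidences set to zero this reduces to the usual single-positive contrastive loss. With a high confidence on an unlabeled image, the loss no longer rewards pushing that image away from the description, which is the general behavior a relaxed loss is meant to capture.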

Top-level tags: computer vision, multi-modal, model evaluation
Detailed tags: image retrieval, multimodal alignment, intent understanding, color palette, benchmark dataset

NaiLIA: Multimodal Nail Design Retrieval Based on Dense Intent Descriptions and Palette Queries


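1️⃣ One-sentence summary

This paper proposes NaiLIA, a new method that accurately retrieves nail design images from a large image collection by matching a user's detailed description of the intended design (including painted patterns, embellishments, themes, and overall impression) together with a specified color palette.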

Source: arXiv: 2603.05446