arXiv submission date: 2026-03-25
📄 Abstract - A^3: Towards Advertising Aesthetic Assessment

Advertising images significantly impact commercial conversion rates and brand equity, yet current evaluation methods rely on subjective judgments, lacking scalability, standardized criteria, and interpretability. To address these challenges, we present A^3 (Advertising Aesthetic Assessment), a comprehensive framework encompassing four components: a paradigm (A^3-Law), a dataset (A^3-Dataset), a multimodal large language model (A^3-Align), and a benchmark (A^3-Bench). Central to A^3 is a theory-driven paradigm, A^3-Law, comprising three hierarchical stages: (1) Perceptual Attention, evaluating perceptual image signals for their ability to attract attention; (2) Formal Interest, assessing formal composition of image color and spatial layout in evoking interest; and (3) Desire Impact, measuring desire evocation from images and their persuasive impact. Building on A^3-Law, we construct A^3-Dataset with 120K instruction-response pairs from 30K advertising images, each richly annotated with multi-dimensional labels and Chain-of-Thought (CoT) rationales. We further develop A^3-Align, trained under A^3-Law with CoT-guided learning on A^3-Dataset. Extensive experiments on A^3-Bench demonstrate that A^3-Align achieves superior alignment with A^3-Law compared to existing models, and this alignment generalizes well to quality advertisement selection and prescriptive advertisement critique, indicating its potential for broader deployment. Dataset, code, and models can be found at: this https URL.
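
To make the structure described in the abstract concrete, here is a minimal sketch of how the three A^3-Law stages and one instruction-response record might be represented. The stage names and the 120K/30K counts come from the abstract; the `Sample` class, its field names, and the `pairs_per_image` helper are hypothetical illustrations and are not taken from the paper or its released code.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    """The three hierarchical A^3-Law stages, in the order the abstract lists them."""
    PERCEPTUAL_ATTENTION = 1  # do perceptual image signals attract attention?
    FORMAL_INTEREST = 2       # do color and spatial layout evoke interest?
    DESIRE_IMPACT = 3         # does the image evoke desire and persuade?


@dataclass
class Sample:
    """Hypothetical record layout; every field name here is an assumption."""
    image_id: str
    stage: Stage
    instruction: str
    rationale: str  # Chain-of-Thought text accompanying the answer
    response: str
    labels: dict[str, str] = field(default_factory=dict)  # multi-dimensional labels


def pairs_per_image(num_pairs: int, num_images: int) -> float:
    """Average number of instruction-response pairs per image."""
    return num_pairs / num_images


# Counts stated in the abstract: 120K pairs built from 30K images.
average = pairs_per_image(120_000, 30_000)
```

With the counts from the abstract, this works out to an average of four instruction-response pairs per advertising image. How those pairs are actually distributed across the three stages is not stated in the abstract.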

Top-level tags: multi-modal model evaluation computer vision
Detailed tags: aesthetic assessment advertising images multimodal llm benchmark instruction tuning

A^3: Towards Advertising Aesthetic Assessment


1️⃣ One-sentence summary

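This paper proposes A^3, a comprehensive framework that combines a theory-driven evaluation paradigm, a large-scale dataset, a multimodal large language model, and a benchmark to assess the aesthetic quality of advertising images automatically, objectively, and interpretably, addressing the limitations of current methods that rely on subjective judgment.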

Source: arXiv: 2603.24037