arXiv submission date: 2026-01-09
📄 Abstract - ViTNT-FIQA: Training-Free Face Image Quality Assessment with Vision Transformers

Face Image Quality Assessment (FIQA) is essential for reliable face recognition systems. Current approaches primarily exploit only final-layer representations, while training-free methods require multiple forward passes or backpropagation. We propose ViTNT-FIQA, a training-free approach that measures the stability of patch embedding evolution across intermediate Vision Transformer (ViT) blocks. We demonstrate that high-quality face images exhibit stable feature refinement trajectories across blocks, while degraded images show erratic transformations. Our method computes Euclidean distances between L2-normalized patch embeddings from consecutive transformer blocks and aggregates them into image-level quality scores. We empirically validate this correlation on a quality-labeled synthetic dataset with controlled degradation levels. Unlike existing training-free approaches, ViTNT-FIQA requires only a single forward pass without backpropagation or architectural modifications. Through extensive evaluation on eight benchmarks (LFW, AgeDB-30, CFP-FP, CALFW, Adience, CPLFW, XQLFW, IJB-C), we show that ViTNT-FIQA achieves competitive performance with state-of-the-art methods while maintaining computational efficiency and immediate applicability to any pre-trained ViT-based face recognition model.
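
The scoring step described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes the per-block patch embeddings have already been collected from a single forward pass into an array of shape `(num_blocks, num_patches, dim)`. The function names `l2_normalize` and `stability_score`, the use of a plain mean for aggregation, and the negation (so that a larger score means a more stable trajectory) are all assumptions made for illustration; the abstract does not specify the aggregation.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each embedding vector (last axis) to unit L2 norm."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def stability_score(block_embeddings):
    """Hypothetical image-level score from per-block patch embeddings.

    block_embeddings: array of shape (num_blocks, num_patches, dim),
    one slice per intermediate transformer block.
    """
    z = l2_normalize(np.asarray(block_embeddings, dtype=np.float64))
    if z.ndim != 3 or z.shape[0] < 2:
        raise ValueError("expected (num_blocks >= 2, num_patches, dim)")
    # Euclidean distance between each patch's embedding in block b and b + 1.
    dists = np.linalg.norm(z[1:] - z[:-1], axis=-1)  # (num_blocks - 1, num_patches)
    # Assumption: aggregate with a mean and negate, so higher = more stable.
    return -float(dists.mean())


# Synthetic stand-ins for ViT activations (12 blocks, 49 patches, 16 dims).
rng = np.random.default_rng(0)
base = rng.normal(size=(49, 16))
drift = rng.normal(size=(49, 16))
# "Stable" trajectory: embeddings drift slightly from block to block.
stable = np.stack([base + 0.01 * k * drift for k in range(12)])
# "Erratic" trajectory: embeddings are unrelated from block to block.
erratic = rng.normal(size=(12, 49, 16))

stable_score = stability_score(stable)
erratic_score = stability_score(erratic)
```

With a real model, the embeddings would come from the intermediate blocks of a pre-trained ViT-based face recognition network; how they are extracted depends on the framework and is left out of this sketch.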

Top-level tags: computer vision, model evaluation, systems
Detailed tags: face image quality assessment, vision transformers, training-free, feature stability, face recognition

ViTNT-FIQA: Training-Free Face Image Quality Assessment with Vision Transformers


1️⃣ One-sentence summary

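This paper proposes a face image quality assessment method that requires no additional training: it judges image quality by analyzing how stable the features are across the intermediate blocks of a Vision Transformer, needs only a single forward pass, and achieves results comparable to state-of-the-art methods on eight benchmarks.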

Source: arXiv: 2601.05741