arXiv submission date: 2026-04-07
📄 Abstract - ID-Selection: Importance-Diversity Based Visual Token Selection for Efficient LVLM Inference

Recent advances have explored visual token pruning to accelerate the inference of large vision-language models (LVLMs). However, existing methods often struggle to balance token importance and diversity: importance-based methods tend to retain redundant tokens, whereas diversity-based methods may overlook informative ones. This trade-off becomes especially problematic under high reduction ratios, where preserving only a small subset of visual tokens is critical. To address this issue, we propose ID-Selection, a simple yet effective token selection strategy for efficient LVLM inference. The key idea is to couple importance estimation with diversity-aware iterative selection: each token is first assigned an importance score, after which high-scoring tokens are selected one by one while the scores of similar tokens are progressively suppressed. In this way, ID-Selection preserves informative tokens while reducing redundancy in a unified selection process. Extensive experiments across 5 LVLM backbones and 16 main benchmarks demonstrate that ID-Selection consistently achieves superior performance and efficiency, especially under extreme pruning ratios. For example, on LLaVA-1.5-7B, ID-Selection prunes 97.2% of visual tokens, retaining only 16 tokens, while reducing inference FLOPs by over 97% and preserving 91.8% of the original performance, all without additional training.
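The selection loop described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `suppress` parameter and the cosine-similarity suppression rule are assumptions chosen to show the idea of iteratively picking high-importance tokens while down-weighting near-duplicates.

```python
import numpy as np

def id_selection(tokens, importance, k, suppress=0.9):
    """Greedy importance-diversity token selection (illustrative sketch).

    tokens:     (N, D) array of visual token features
    importance: (N,) importance scores (e.g. attention-derived)
    k:          number of tokens to keep
    suppress:   suppression strength in [0, 1) -- a hypothetical knob,
                not a parameter named in the paper
    """
    # Pairwise cosine similarity between token features
    normed = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    sim = normed @ normed.T

    scores = importance.astype(float).copy()
    selected = []
    for _ in range(k):
        # Pick the highest-scoring remaining token
        i = int(np.argmax(scores))
        selected.append(i)
        # Progressively suppress scores of tokens similar to the pick,
        # so near-duplicates lose out to diverse informative tokens
        scores = scores * (1.0 - suppress * np.clip(sim[i], 0.0, 1.0))
        scores[selected] = -np.inf  # never re-pick a selected token
    return selected
```

With two nearly identical high-importance tokens and one distinct lower-importance token, a pure importance ranking would keep both duplicates; the suppression step instead keeps one duplicate plus the distinct token, matching the redundancy-reduction behavior the abstract describes.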

Top-level tags: llm multi-modal model training
Detailed tags: visual token pruning efficient inference large vision-language models token selection model acceleration

ID-Selection: Importance-Diversity Based Visual Token Selection for Efficient LVLM Inference


1️⃣ One-sentence summary

This paper proposes a new method called ID-Selection, which combines token importance and diversity to compress visual information efficiently, substantially accelerating inference in large vision-language models while preserving most of their performance.

Source: arXiv 2604.05601