AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models

arXiv:2506.09082v3 Announce Type: replace-cross The rise of vision foundation models (VFMs) calls for systematic evaluation. A common approach pairs VFMs with large language models (LLMs) as general-purpose heads, followed by evaluation on broad Visual Question Answering (VQA) benchmarks.