AI RESEARCH
VisualScratchpad: Inference-time Visual Concepts Analysis in Vision Language Models
arXiv CS.AI
arXiv:2603.07335v1 Announce Type: new High-performing vision language models still produce incorrect answers, yet their failure modes are often difficult to explain. To make model internals accessible and enable systematic debugging, we