MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage

arXiv:2603.23501v1 Announce Type: cross Vision-Language Models (VLMs) are increasingly used for tasks such as medical report generation and visual question answering. However, fluent diagnostic text does not guarantee safe visual understanding. In clinical practice, interpretation begins with pre-diagnostic sanity checks: verifying that the input is valid to read (correct modality and anatomy, plausible viewpoint and orientation, and no obvious integrity violations