VeriSim: A Configurable Framework for Evaluating Medical AI Under Realistic Patient Noise

arXiv:2604.10441v1 Announce Type: new

Medical large language models (LLMs) achieve impressive performance on standardized benchmarks, yet these evaluations fail to capture the complexity of real clinical encounters, where patients exhibit memory gaps, limited health literacy, anxiety, and other communication barriers. We