AI RESEARCH

AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs

arXiv CS.AI

arXiv:2509.08031v3. Large Audio Language Models (LALMs) are rapidly advancing, but evaluating them remains challenging due to inefficient and non-standardized toolkits that limit fair comparison and systematic assessment.