AI RESEARCH

When Silence Matters: The Impact of Irrelevant Audio on Text Reasoning in Large Audio-Language Models

arXiv CS.CL

arXiv:2510.00626v2. Large audio-language models (LALMs) unify speech and text processing, but their robustness in noisy real-world settings remains underexplored. We investigate how irrelevant audio, such as silence, synthetic noise, and environmental sounds, affects text reasoning tasks where audio is unnecessary. Across three text-based benchmarks, we find that even non-informative audio reduces accuracy and increases prediction volatility; the severity of interference scales with longer durations, higher amplitudes, and elevated decoding temperatures.
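
To make the setup concrete, the following is a minimal sketch of how irrelevant-audio conditions (silence and synthetic noise) might be generated across a sweep of durations and amplitudes. It is not the authors' code: the function names, the 16 kHz sample rate, the Gaussian noise model, and the specific duration and amplitude values are all assumptions chosen for illustration.

```python
import random

SAMPLE_RATE = 16000  # assumed sample rate; the abstract does not specify one


def make_silence(duration_s, sample_rate=SAMPLE_RATE):
    """Return zero-valued samples lasting duration_s seconds."""
    return [0.0] * int(duration_s * sample_rate)


def make_gaussian_noise(duration_s, amplitude, sample_rate=SAMPLE_RATE, seed=0):
    """Return Gaussian noise clipped to [-amplitude, amplitude] (hypothetical noise model)."""
    rng = random.Random(seed)
    n = int(duration_s * sample_rate)
    # A standard deviation of amplitude / 3 keeps most samples inside the clip range.
    return [
        max(-amplitude, min(amplitude, rng.gauss(0.0, amplitude / 3)))
        for _ in range(n)
    ]


def build_conditions(durations_s, amplitudes):
    """Enumerate (label, waveform) pairs for a duration x amplitude sweep."""
    conditions = []
    for d in durations_s:
        conditions.append((f"silence_{d}s", make_silence(d)))
        for a in amplitudes:
            conditions.append((f"noise_{d}s_amp{a}", make_gaussian_noise(d, a)))
    return conditions


# Illustrative sweep values, not taken from the paper.
conditions = build_conditions(durations_s=[1, 5], amplitudes=[0.01, 0.1])
```

Each waveform would then be paired with the same text-only question and passed to the model under test, so that any change in accuracy can be attributed to the audio input alone.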