AI RESEARCH

In-Context Learning in Speech Language Models: Analyzing the Role of Acoustic Features, Linguistic Structure, and Induction Heads

arXiv cs.CL

arXiv:2604.06356v1 Announce Type: new In-Context Learning (ICL) has been extensively studied in text-only Language Models, but remains largely unexplored in the speech domain. Here, we investigate how linguistic and acoustic features affect ICL in Speech Language Models. We focus on the Text-to-Speech (TTS) task, which allows us to analyze ICL from two angles: (1) how accurately the model infers the task from the demonstrations (i.e., generating the correct spoken content), and (2) to what extent the model mimics the acoustic characteristics of the demonstration speech in its output.