A Component-Based Survey of Interactions between Large Language Models and Multi-Armed Bandits

arXiv:2601.12945v3

Large language models (LLMs) have become powerful and widely used systems for language understanding and generation, while multi-armed bandit (MAB) algorithms provide a principled framework for adaptive decision-making under uncertainty. This survey explores the potential at the intersection of these two fields. To the best of our knowledge, it is the first survey to systematically review the bidirectional interactions between LLMs and MABs at the component level.