AI RESEARCH
STT-Arena: A More Realistic Environment for Tool-Using with Spatio-Temporal Dynamics
arXiv CS.CL
arXiv:2605.18548v1 Announce Type: new Large language models (LLMs) deployed in real-world agentic applications must be capable of replanning and adapting when mid-task disruptions invalidate their prior decisions. Existing dynamic benchmarks primarily measure whether LLMs can detect temporal changes in a timely manner, leaving the complementary challenge of adaptive replanning under spatio-temporal dynamics largely unexplored. We