SkillFlow: Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents

arXiv:2604.17308v1 Announce Type: new As the capability frontier of autonomous agents continues to expand, they are increasingly able to complete specialized tasks through plug-and-play external skills. Yet current benchmarks mostly test whether models can use provided skills, leaving open whether they can discover skills from experience, repair them after failure, and maintain a coherent library over time. We