Skill-Pro: Learning Reusable Skills from Experience via Non-Parametric PPO for LLM Agents

arXiv:2602.01869v2 Announce Type: replace LLM-driven agents demonstrate strong performance in sequential decision-making but often rely on on-the-fly reasoning, re-deriving solutions even in recurring scenarios. This insufficient experience reuse leads to computational redundancy and execution instability. To bridge this gap, we propose Skill-Pro, a framework that enables agents to autonomously learn reusable procedural skills from interaction experiences without parameter updates.