Learning to Hand Off: Provably Convergent Workflow Learning under Interface Constraints

arXiv:2605.19140v1

We study workflow learning in a setting where specialized agents hand off control through a shared artifact. Each agent observes only a local function of that artifact together with its own private state, and no centralized learner has access to joint trajectories. This is the operating regime of multi-agent LLM pipelines that span organizational, vendor, or trust boundaries.
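
To make the information structure of this setting concrete, the following is a minimal sketch, not the paper's method. It only illustrates the interface constraints described above: a shared artifact passed between agents, a per-agent local view of that artifact, and private state that never leaves the agent. All names (`Agent`, `view`, `private_state`, `run_pipeline`) are hypothetical, and no learning rule is shown, since the abstract does not specify one.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: the shared artifact is modeled as a flat string-to-string mapping.
Artifact = Dict[str, str]


@dataclass
class Agent:
    name: str
    # Local function of the artifact: the only part of it this agent may observe.
    view: Callable[[Artifact], Dict[str, str]]
    # Private state: kept inside the agent, never written to the artifact.
    private_state: List[str] = field(default_factory=list)

    def act(self, artifact: Artifact) -> Artifact:
        observation = self.view(artifact)  # the agent sees only its local view
        # Record which fields were visible; this stays private to the agent.
        self.private_state.append(",".join(sorted(observation)))
        updated = dict(artifact)  # copy so the caller's artifact is not mutated
        updated[self.name] = f"done-by-{self.name}"
        return updated


def run_pipeline(agents: List[Agent], artifact: Artifact) -> Artifact:
    # Control is handed off sequentially through the shared artifact.
    # No component here ever sees the agents' private states jointly.
    for agent in agents:
        artifact = agent.act(artifact)
    return artifact


if __name__ == "__main__":
    drafter = Agent("drafter", view=lambda a: {k: v for k, v in a.items() if k == "task"})
    reviewer = Agent("reviewer", view=lambda a: {k: v for k, v in a.items() if k == "drafter"})
    print(run_pipeline([drafter, reviewer], {"task": "summarize"}))
```

In this sketch the reviewer never sees the `task` field and the drafter never sees the reviewer's output, which mirrors the constraint that each agent works from a local view rather than the joint trajectory.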