Three Roles, One Model: Role Orchestration at Inference Time to Close the Performance Gap Between Small and Large Agents

arXiv:2604.11465v1

Large language model (LLM) agents show promise on realistic tool-use tasks, but deploying capable agents on modest hardware remains challenging. We study whether inference-time scaffolding alone, without any additional