Progressive Autonomy as Preference Learning: A Formalization of Trust Calibration for Agentic Tool Use

arXiv:2605.19151v1 Announce Type: new

We formalize trust calibration for agentic tool use (deciding when an automated agent's proposed action may execute autonomously versus require human approval) as a preference-learning problem. A policy gateway maintains a Gaussian-process posterior over a latent human risk-tolerance function, observed through a probit likelihood on binary approve/deny feedback, and escalates to the human exactly where the approval outcome is most uncertain.
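
As a rough illustration of the escalation rule described above, the sketch below assumes the gateway already has a Gaussian posterior N(mean, variance) over the latent risk-tolerance value for a proposed action. Under a probit likelihood, the predictive approval probability then has the closed form Phi(mean / sqrt(1 + variance)). The function names, the use of binary entropy as the uncertainty measure, and the 0.9-bit threshold are illustrative assumptions, not details taken from the paper.

```python
import math


def probit_approval_probability(mean: float, variance: float) -> float:
    """Predictive P(approve) when the latent value is N(mean, variance)
    and feedback follows a probit likelihood: Phi(mean / sqrt(1 + variance))."""
    z = mean / math.sqrt(1.0 + variance)
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def outcome_entropy(p: float) -> float:
    """Binary entropy (in bits) of the approve/deny outcome."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def should_escalate(mean: float, variance: float,
                    entropy_threshold: float = 0.9) -> bool:
    """Hypothetical rule: ask the human when the predicted outcome is
    close to a coin flip, i.e. its entropy reaches the assumed threshold."""
    p = probit_approval_probability(mean, variance)
    return outcome_entropy(p) >= entropy_threshold


if __name__ == "__main__":
    # Hypothetical posterior summaries (mean, variance) for three actions.
    for mean, variance in [(3.0, 0.1), (0.0, 0.5), (1.0, 99.0)]:
        p = probit_approval_probability(mean, variance)
        action = "escalate" if should_escalate(mean, variance) else "decide autonomously"
        print(f"mean={mean:+.1f} var={variance:5.1f} P(approve)={p:.3f} -> {action}")
```

In this sketch, a large posterior variance pulls the approval probability toward 0.5, so an action whose latent mean is favorable can still be escalated when the gateway has seen little relevant feedback.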