AI RESEARCH

BEAVER: An Efficient Deterministic LLM Verifier

arXiv CS.AI

arXiv:2512.05439v2 Announce Type: replace

As large language models (LLMs) transition from research prototypes to production systems, practitioners often need reliable methods to verify model outputs and characterize tail risk for safe deployment. While sampling-based estimates provide an ad-hoc intuition of model behavior, they offer no sound guarantees. We present BEAVER, the first practical framework for computing deterministic, sound probability bounds on LLM satisfaction of safety properties.