AI RESEARCH
HINTBench: Horizon-agent Intrinsic Non-attack Trajectory Benchmark
arXiv CS.AI
arXiv:2604.13954v1 Announce Type: cross Existing agent-safety evaluation has focused mainly on externally induced risks. Yet agents may still enter unsafe trajectories under benign conditions. We study this complementary but underexplored setting through the lens of intrinsic risk, where intrinsic failures remain latent, propagate across long-horizon execution, and eventually lead to high-consequence outcomes. To evaluate this setting, we