The Geometric Canary: Predicting Steerability and Detecting Drift via Representational Stability

arXiv:2604.17698v1 Announce Type: new Reliable deployment of language models requires two capabilities that appear distinct but share a common geometric foundation: predicting whether a model will accept targeted behavioral control, and detecting when its internal structure degrades. We show that geometric stability, the consistency of a representation's pairwise distance structure, addresses both.
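
To make "consistency of a representation's pairwise distance structure" concrete, here is a minimal sketch of one way such a score could be computed. The abstract does not specify the metric, so everything here is an assumption for illustration: the function names (`pairwise_distances`, `pearson`, `stability_score`), the use of Euclidean distance, and the use of Pearson correlation between the two sets of pairwise distances.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def stability_score(reps_a, reps_b):
    """Assumed stability measure: correlation of the two pairwise-distance lists.

    reps_a and reps_b hold representations of the same inputs, in the same order.
    """
    return pearson(pairwise_distances(reps_a), pairwise_distances(reps_b))


# Toy data (hypothetical): four 2-D "representations" of the same four inputs.
base = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]

# A rotation changes every coordinate but preserves all pairwise distances.
theta = 0.7
cos_t, sin_t = math.cos(theta), math.sin(theta)
rotated = [(x * cos_t - y * sin_t, x * sin_t + y * cos_t) for x, y in base]

# Collapsing the second axis distorts the distance structure.
collapsed = [(x, 0.0) for x, _ in base]

rotated_score = stability_score(base, rotated)
collapsed_score = stability_score(base, collapsed)
```

Under these assumptions, a score near 1 means the two representations agree on which inputs are close and which are far apart, even if the coordinates themselves differ, while a lower score signals that the distance structure has changed.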