The Ghost in the Grammar: Methodological Anthropomorphism in AI Safety Evaluations

arXiv:2603.13255v1

This essay offers a philosophical analysis of the field of AI safety based on recent technical reports, with particular focus on Anthropic's study of "agentic misalignment" in frontier language models. It examines the recurring anthropomorphism in the field: the tendency of researchers and developers to project categories such as "intention," "persona," and even "feelings" onto AI systems without adequate conceptual problematization.