AI RESEARCH
PRISM Risk Signal Framework: Hierarchy-Based Red Lines for AI Behavioral Risk
arXiv CS.AI
arXiv:2604.11070v1 Announce Type: new
Current approaches to AI safety define red lines at the case level: specific prompts, specific outputs, specific harms. This paper argues that red lines can be set fundamentally -- at the level of the value, evidence, and source hierarchies that govern AI reasoning. Using the PRISM (Profile-based Reasoning Integrity Stack Measurement) framework, we define a taxonomy of 27 behavioral risk signals derived from structural anomalies in how AI systems prioritize values (L4), weight evidence types (L3), and trust information sources (L2).