Hierarchical Alignment: Enforcing Hierarchical Instruction-Following in LLMs through Logical Consistency

arXiv:2604.09075v1 Announce Type: new Large language models increasingly operate under multiple instructions from heterogeneous sources with different authority levels, including system policies, user requests, tool outputs, and retrieved context. While prior work on instruction hierarchy highlights the importance of respecting instruction priorities, it mainly focuses on adversarial attacks and overlooks the benign but common instruction conflicts that arise in real-world applications.