Value Alignment Tax: Measuring Value Trade-offs in LLM Alignment

arXiv:2602.12134v2 Announce Type: replace Existing work on value alignment typically characterizes value relations statically, ignoring how alignment interventions, such as prompting, fine-tuning, or preference optimization, reshape the broader value system. In practice, aligning a target value can implicitly shift other values, creating value trade-offs that remain largely unmeasured. We