AI RESEARCH
Fairness Evaluation and Inference Level Mitigation in LLMs
arXiv CS.AI
arXiv:2510.18914v3 Announce Type: replace-cross Large language models often display undesirable behaviors embedded in their internal representations, which undermine fairness and lead to inconsistency drift, amplification of harmful content, and the propagation of unwanted patterns during extended dialogues. Although