APPA: Adaptive Preference Pluralistic Alignment for Fair Federated RLHF of LLMs

arXiv:2604.04261v1 Announce Type: cross Aligning large language models (LLMs) with diverse human preferences requires pluralistic alignment, where a single model must respect the values of multiple distinct groups simultaneously. In federated reinforcement learning from human feedback (FedRLHF), these groups align a shared policy without centralizing preference data, which makes fair reward aggregation essential.