Training LLMs with Reinforcement Learning for Intent-Aware Personalized Question Answering

arXiv:2605.12645v1 Announce Type: cross Effective personalized question answering (PQA) in language models requires grounding responses in the user's underlying intent, where intent refers to the implicit "why" behind a query beyond its explicit wording. However, existing approaches to intent-aware personalization rely on multi-turn conversational context or rich user profiles, and do not explicitly model user intent during the reasoning process.