AI RESEARCH

Response-Aware User Memory Selection for LLM Personalization

arXiv CS.AI

arXiv:2604.14473v1 Announce Type: new

A common approach to personalization in large language models (LLMs) is to incorporate a subset of the user memory into the prompt at inference time to guide the model's generation. Existing methods select these subsets primarily using similarity between user memory items and input queries, ignoring how features actually affect the model's response distribution.
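As a rough illustration of the similarity-based selection the abstract attributes to existing methods (not the response-aware method the paper proposes), the sketch below ranks memory items by cosine similarity to the query and places the top k in the prompt. The bag-of-words vectors stand in for a real embedding model, and the names `select_memory` and `build_prompt` are hypothetical, chosen only for this example.

```python
import math
from collections import Counter


def bow(text: str) -> Counter:
    """Lowercased bag-of-words counts (assumption: a stand-in for a real embedding)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_memory(query: str, memory: list[str], k: int = 2) -> list[str]:
    """Return the k memory items most similar to the query (hypothetical helper)."""
    q = bow(query)
    ranked = sorted(memory, key=lambda item: cosine(q, bow(item)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, memory: list[str], k: int = 2) -> str:
    """Prepend the selected memory items to the query (hypothetical prompt layout)."""
    selected = select_memory(query, memory, k)
    lines = ["User memory:"] + [f"- {item}" for item in selected] + ["", f"Query: {query}"]
    return "\n".join(lines)


if __name__ == "__main__":
    memory = [
        "likes vegetarian food",
        "lives in Berlin",
        "prefers short answers",
    ]
    print(build_prompt("suggest vegetarian food for dinner", memory, k=1))
```

Note that the ranking depends only on the query and the memory text. An item such as "prefers short answers" would never be selected for this query, even though it could change the response considerably, which is the gap the abstract points to.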