Agentic Learner with Grow-and-Refine Multimodal Semantic Memory

arXiv:2511.21678v2 Announce Type: replace-cross MLLMs exhibit strong reasoning on isolated queries, yet they operate de novo -- solving each problem independently and often repeating the same mistakes. Existing memory-augmented agents mainly store past trajectories for reuse. However, trajectory-based memory suffers from brevity bias, gradually losing essential domain knowledge. More critically, even in truly multimodal problem-solving settings, it records only a single-modality trace of past behavior, failing to preserve how visual attention and logical reasoning jointly contributed to the solution.