AI RESEARCH
LLMpedia: A Transparent Framework to Materialize an LLM's Encyclopedic Knowledge at Scale
arXiv CS.CL
arXiv:2603.24080v1 Announce Type: new
Benchmarks such as MMLU suggest that flagship language models approach factuality saturation, with scores above 90%. We show this picture is incomplete. LLMpedia generates encyclopedic articles entirely from parametric memory, producing ~1M articles across three model families without retrieval. For gpt-5-mini, the verifiable true rate on Wikipedia-covered subjects is only 74.7%, more than 15 percentage points below the benchmark-based picture, consistent with the availability bias of fixed-question evaluation.