AI RESEARCH
ReFACT: A Benchmark for Scientific Confabulation Detection with Positional Error Annotations
arXiv CS.CL
arXiv:2509.25868v3 Announce Type: replace The mechanisms underlying scientific confabulation in Large Language Models (LLMs) remain poorly understood. We