AI RESEARCH

NovBench: Evaluating Large Language Models on Academic Paper Novelty Assessment

arXiv CS.AI

arXiv:2604.11543v1 Announce Type: cross Novelty is a core requirement in academic publishing and a central focus of peer review, yet the growing volume of submissions has placed increasing pressure on human reviewers. While large language models (LLMs), including those fine-tuned on peer review data, have shown promise in generating review comments, the absence of a dedicated benchmark has limited systematic evaluation of their ability to assess research novelty. To address this gap, we.