LongSumEval: Question-Answering Based Evaluation and Feedback-Driven Refinement for Long Document Summarization

arXiv:2604.25130v1 Announce Type: new Evaluating long document summaries remains the primary bottleneck in summarization research. Existing metrics correlate weakly with human judgments and produce aggregate scores without explaining deficiencies or guiding improvement, preventing effective refinement in applications requiring verifiable accuracy. We