SurGE: A Benchmark and Evaluation Framework for Scientific Survey Generation

arXiv:2508.15658v5 Announce Type: replace

The rapid growth of academic literature makes the manual creation of scientific surveys increasingly infeasible. While large language models show promise for automating this process, progress in this area is hindered by the absence of standardized benchmarks and evaluation protocols. To bridge this critical gap, we