AI RESEARCH
REFLEX: Reference-Free Evaluation of Log Summarization via Large Language Model Judgment
arXiv cs.LG
arXiv:2511.07458v2 Announce Type: replace-cross Evaluating log summarization systems is challenging due to the lack of high-quality reference summaries and the limitations of existing metrics such as ROUGE and BLEU, which depend on surface-level lexical overlap. We