REFLEX: Reference-Free Evaluation of Log Summarization via Large Language Model Judgment
arXiv:2511.07458v2 Announce Type: replace
Abstract: Evaluating log summarization systems is challenging due to the lack of high-quality reference summaries and the limitations of existing metrics like ROUGE and BLEU, which depend on surface-level lexi…