Per https://paperreview.ai/tech-overview, the score correlation between two human reviewers is about 0.41 for ICLR 2025, but in my current project I am seeing a much lower correlation for ICLR 2026. So I ran the metrics for both 2025 and 2026, and the result is crazy. I used two metrics: one-vs-rest correlation and half-half split correlation. All data were fetched from OpenReview. I do know that top-conference reviews are basically a lottery now for most papers, but I never thought it was this bad.

2025: avg-score SD 1.253, mean within-paper human SD 1.186
2026: avg-score SD 1.162, mean within-paper human SD 1.523
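For anyone who wants to reproduce this kind of check, here is a minimal sketch of how the two metrics and the two SD figures could be computed. The post does not spell out the exact definitions, so everything here is an assumption: one-vs-rest pairs each reviewer's score with the mean of the other reviewers on the same paper, half-half randomly splits each paper's reviewers into two groups and correlates the group means (averaged over many random splits), and the SDs use the sample formula (`ddof=1`). The function names and the `papers` input format (one list of scores per paper) are hypothetical, and fetching from OpenReview is left out.

```python
import numpy as np


def one_vs_rest_corr(papers):
    """Pearson r between each review score and the mean of the other
    scores on the same paper, pooled over all papers (assumed definition)."""
    xs, ys = [], []
    for scores in papers:
        if len(scores) < 2:
            continue  # a single review has no "rest" to compare against
        total = sum(scores)
        for s in scores:
            xs.append(s)
            ys.append((total - s) / (len(scores) - 1))
    return float(np.corrcoef(xs, ys)[0, 1])


def split_half_corr(papers, n_splits=200, seed=0):
    """Mean Pearson r between the averages of two random halves of each
    paper's reviewers, over n_splits random splits (assumed definition)."""
    rng = np.random.default_rng(seed)
    eligible = [np.asarray(p, dtype=float) for p in papers if len(p) >= 2]
    corrs = []
    for _ in range(n_splits):
        a, b = [], []
        for scores in eligible:
            perm = rng.permutation(scores)  # shuffled copy of the scores
            half = len(perm) // 2
            a.append(perm[:half].mean())
            b.append(perm[half:].mean())
        corrs.append(np.corrcoef(a, b)[0, 1])
    return float(np.mean(corrs))


def spread_stats(papers):
    """Return (SD of per-paper average scores, mean within-paper SD).
    Sample SD (ddof=1) is an assumption; the post does not say which was used."""
    eligible = [np.asarray(p, dtype=float) for p in papers if len(p) >= 2]
    avg_score_sd = float(np.std([p.mean() for p in eligible], ddof=1))
    within_sd = float(np.mean([p.std(ddof=1) for p in eligible]))
    return avg_score_sd, within_sd


if __name__ == "__main__":
    # Made-up toy scores, NOT real ICLR data.
    toy = [[8, 8, 6], [3, 5, 3], [6, 6, 8, 6], [1, 3, 3]]
    print("one-vs-rest r:", one_vs_rest_corr(toy))
    print("split-half r:", split_half_corr(toy))
    print("avg-score SD, mean within-paper SD:", spread_stats(toy))
```

Under these definitions, the two SD figures are directly comparable: when the mean within-paper SD is larger than the SD of the per-paper averages, reviewers on the same paper disagree with each other more than papers differ from one another, which would drag both correlations down.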