What Makes a Good AI Review? Concern-Level Diagnostics for AI Peer Review
arXiv:2604.19998v1
Abstract: Evaluating AI-generated reviews by verdict agreement is widely recognized as insufficient, yet current alternatives rarely audit which concerns a system identifies, how it prioritizes them, or whether th…