AI RESEARCH
What Makes a Good AI Review? Concern-Level Diagnostics for AI Peer Review
arXiv CS.AI
arXiv:2604.19998v1 Announce Type: new
Evaluating AI-generated reviews by verdict agreement is widely recognized as insufficient, yet current alternatives rarely audit which concerns a system identifies, how it prioritizes them, or whether those priorities align with the review rationale that shaped the final assessment. We propose concern alignment, a diagnostic framework that evaluates AI reviews at the concern level rather than only at the verdict level.