Researchers found that AI-powered feedback significantly improved academic peer review quality in a landmark trial involving over 20,000 reviews at the prestigious ICLR 2025 conference. The randomized study showed that 27% of reviewers who received suggestions from a large language model revised their reviews, producing more detailed, informative critiques that sparked increased dialogue between authors and reviewers.
The breakthrough experiment, detailed in Nature Machine Intelligence, represents the first randomized controlled trial of its kind in academic publishing. The research team deployed a “Review Feedback Agent” powered by Claude 3.5 Sonnet from Anthropic to provide automated, private suggestions to reviewers within hours of their initial submissions, according to the ICLR Blog.
The AI system was designed to flag three critical issues: vague or unsubstantiated claims, possible misunderstandings of the paper, and unprofessional tone. Crucially, reviewers retained complete control over whether to incorporate the feedback, with the suggestions remaining invisible to authors and conference organizers to avoid influencing acceptance decisions.
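The three-category design described above can be sketched as a prompt template. The function names and wording below are illustrative assumptions only, not the published implementation (which the Zou Group has released separately):

```python
# Hypothetical sketch of how a review-feedback prompt might be assembled.
# The real Review Feedback Agent's prompts live in the Zou Group's
# open-source repository; everything below is an assumption for illustration.

FLAG_CATEGORIES = [
    "vague or unsubstantiated claims",
    "possible misunderstandings of the paper",
    "unprofessional tone",
]

def build_feedback_prompt(review_text: str) -> str:
    """Assemble an LLM prompt asking for private, actionable suggestions."""
    issues = "\n".join(f"- {c}" for c in FLAG_CATEGORIES)
    return (
        "You are assisting a peer reviewer. Read the review below and flag "
        "only the following issues, suggesting concrete rewrites:\n"
        f"{issues}\n\n"
        "Your suggestions are private to the reviewer and must not comment "
        "on the paper's merits or its chance of acceptance.\n\n"
        f"Review:\n{review_text}"
    )

prompt = build_feedback_prompt("The method seems bad and the writing is sloppy.")
```

The key design point mirrored here is scope restriction: the model is asked only to critique the review itself, never the paper, which is what keeps the suggestions from leaking into acceptance decisions.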
Measurable Impact on Review Quality
Beyond the headline adoption rate, the intervention produced concrete improvements in review depth and engagement. Reviewers who incorporated AI suggestions added an average of 80 words to their original reviews, creating more substantive critiques. The ripple effects extended throughout the review process: author responses grew 6% longer in the treatment group, while subsequent reviewer replies increased by 5.5%, indicating more productive academic dialogue.
In blinded evaluations, reviews revised with AI assistance were consistently rated as more “informative” than those in the control group. The system processed more than 12,000 suggestions that reviewers chose to incorporate into their final submissions.
The software behind the trial has been made open source on GitHub by the Zou Group, allowing other conferences and journals to implement similar systems. The intervention model emphasized augmentation rather than replacement of human expertise, with the AI serving strictly as an assistant that could be overruled or ignored entirely.
This large-scale validation arrives as academic publishing faces mounting pressure from exponential growth in submissions. Major conferences like ICLR receive thousands of papers annually, straining the volunteer peer review system that underpins scientific progress. The success of this trial suggests AI tools could help maintain review quality even as submission volumes continue climbing.
Sources
- Nature Machine Intelligence
- ICLR Blog