GLM 5.1 sits alongside frontier models in my social reasoning benchmark

By /u/cjami / April 12, 2026

I still need more matches for reliable data, but GLM 5.1 looks very competitive with other frontier models.

This uses a benchmark I made that pits LLMs against each other in autonomous games of Blood on the Clocktower (a complex social deduction game). The last screenshot shows GLM 5.1 playing as the evil team (red).

For contrast,
Claude Opus 4.6 costs $3.69 per game.
GLM 5.1 costs $0.92 per game.

It did this with a 0% tool error rate.

Very impressive.
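
To put the two per-game costs side by side, here is a quick back-of-the-envelope sketch. It only uses the two figures quoted above; the variable names are mine, and it says nothing about how the benchmark itself tracks cost.

```python
# Per-game costs quoted in the post (USD).
cost_per_game = {"Claude Opus 4.6": 3.69, "GLM 5.1": 0.92}

opus = cost_per_game["Claude Opus 4.6"]
glm = cost_per_game["GLM 5.1"]

# How many times more expensive Opus is per game.
ratio = opus / glm

# Fraction of the Opus per-game cost saved by running GLM 5.1 instead.
savings = (opus - glm) / opus

print(f"Opus costs {ratio:.2f}x as much per game")   # 4.01x
print(f"GLM 5.1 is {savings:.1%} cheaper per game")  # 75.1%
```

So roughly a 4x difference per game, with the caveat that these are averages over a still-small number of matches.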
