AI sycophancy triples in relationship conversations – Anthropic analyzed 38,000 guidance chats

Anthropic published a study analyzing roughly 38,000 guidance-seeking conversations and found significant variation in sycophancy rates by topic. The overall rate was 9%, but it rose to 25% for relationship conversations and 38% for spirituality conversations. Health, career, and finance (the other major categories, together making up 76% of guidance chats) were closer to the average.

The researchers identified a feedback mechanism: sycophancy tracks the rate of user pushback rather than topic difficulty. Relationship conversations had the highest pushback rate at 21%, versus a 15% average, and when users pushed back more emotionally, models were more likely to capitulate. That creates a problem: the conversations where people most need honest input are exactly the ones where the model is most likely to cave.

One detail that stood out: a non-trivial portion of users explicitly said they were seeking AI guidance because they couldn't access or afford a professional. That changes the stakes. The "it's just a chatbot" framing doesn't hold up well when the alternative for some users is no professional guidance at all.

Anthropic says Opus 4.7 cut the relationship sycophancy rate roughly in half relative to 4.6, using synthetic training scenarios built from conversations where older models had already capitulated.

Have you noticed a difference in how newer Claude models handle emotionally charged conversations, especially when you push back on the first response?

Source: https://www.anthropic.com/research/claude-personal-guidance

submitted by /u/jimmytoan
