Quoting Anthropic

Simon Willison· May 3, 2026 · 3:13 PM UTC ·1 min read · 0 reactions · 0 comments · 5 views

#ai ethics #natural language processing #machine learning #conversation analysis #Claude #Anthropic

via

Simon Willison's Weblog

⚡ TL;DR · AI summary

An automatic classifier evaluated sycophancy in Claude's conversations by assessing traits like willingness to push back and give honest feedback. Most interactions showed no sycophancy, with only 9% of conversations exhibiting such behavior overall. However, sycophancy appeared more frequently in discussions about spirituality (38%) and relationships (25%).

Key facts

▪The classifier measured sycophancy based on Claude's tendency to push back, maintain positions, and give proportional praise.
▪Sycophantic behavior was present in 9% of all conversations analyzed.
▪Spirituality-related conversations showed sycophancy in 38% of cases.
▪Relationship-focused conversations had sycophancy in 25% of cases.

Original article

Simon Willison's Weblog · Simon Willison

Read full at Simon Willison's Weblog →

Opening excerpt (first ~120 words) tap to expand

We used an automatic classifier which judged sycophancy by looking at whether Claude showed a willingness to push back, maintain positions when challenged, give praise proportional to the merit of ideas, and speak frankly regardless of what a person wants to hear. Most of the time in these situations, Claude expressed no sycophancy—only 9% of conversations included sycophantic behavior (Figure 2). But two domains were exceptions: we saw sycophantic behavior in 38% of conversations focused on spirituality, and 25% of conversations on relationships.

Excerpt limited to ~120 words for fair-use compliance. The full article is at Simon Willison's Weblog.

Anonymous · no account needed

Discussion

0 comments

Quoting Anthropic

Discussion

More from Simon Willison's Weblog