Eqbench: Emotional Intelligence Benchmarks for LLMs
Eqbench has introduced EQ-Bench 3, a benchmark designed to measure emotional intelligence in language models through challenging roleplays. It evaluates models on eight core dimensions of emotional intelligence, including empathy and social dexterity. Each model's ranking is an Elo score derived from pairwise comparisons of model responses.
- EQ-Bench 3 benchmarks emotional intelligence in language models.
- The evaluation covers dimensions such as empathy, social IQ, and emotional reasoning.
- Scores are calculated with an Elo system based on pairwise model comparisons.
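To make the Elo idea concrete, here is a minimal sketch of the textbook sequential Elo update applied to pairwise preferences. It is an illustration only: the source does not describe how Eqbench actually fits its Elo scores, and the K-factor, starting rating, model names, and outcomes below are all hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, comparisons, k=32.0, initial=1000.0):
    """Apply sequential Elo updates and return a new ratings dict.

    comparisons: iterable of (model_a, model_b, outcome), where outcome is
    1.0 if A's response was preferred, 0.0 if B's was, and 0.5 for a tie.
    k and initial are assumed values, not taken from the benchmark.
    """
    ratings = dict(ratings)  # copy so the caller's dict is not mutated
    for a, b, outcome in comparisons:
        ra = ratings.get(a, initial)
        rb = ratings.get(b, initial)
        # Shift both ratings by how far the result deviated from expectation.
        delta = k * (outcome - expected_score(ra, rb))
        ratings[a] = ra + delta
        ratings[b] = rb - delta
    return ratings


# Hypothetical judge outcomes for three made-up models.
comparisons = [
    ("model_x", "model_y", 1.0),
    ("model_y", "model_z", 1.0),
    ("model_x", "model_z", 1.0),
]
ratings = update_elo({}, comparisons)
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.1f}")
```

Because each update adds to one rating exactly what it subtracts from the other, the total rating mass stays constant. The scores are therefore only meaningful relative to the other models in the same pool.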
Opening excerpt (first ~120 words)
EQ-Bench 3: Emotional Intelligence Benchmarks for LLMs. A benchmark measuring emotional intelligence in challenging roleplays. Note: Ability scores shown in the heatmap do not contribute to the Elo score. They are "higher is higher", not "higher is better".
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at Eqbench.