Representation Without Control: Testing the Realization Effect in Language Models
The paper 'Representation Without Control: Testing the Realization Effect in Language Models' asks whether the outputs of large language models reflect human-like cognitive mechanisms or only prompt-sensitive surface patterns. It uses the realization effect, a phenomenon from behavioral economics, as the test case. The study finds that language models respond systematically to the experimental conditions, but their risk-taking after realized gains and losses does not match the predictions derived from human behavior.
- The study evaluates language model behavior at three levels: prompt sensitivity, internal representation readout, and causal control.
- Results indicate that language models exhibit systematic condition sensitivity but do not replicate human realization-effect predictions.
- The research concludes that successful latent readout does not guarantee that a model relies on a representation for decision-making.
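
The gap between readout and control in the last point can be illustrated with a toy construction. The sketch below is not the paper's method or data: it assumes synthetic "hidden states" in which a hypothetical binary condition is written along one direction (`u`) while the toy decision variable reads a different, orthogonal direction (`w`). All names, dimensions, and the difference-of-means probe are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4000, 16

# Hypothetical binary condition, e.g. realized vs. paper outcome.
z = rng.integers(0, 2, size=n)

# Two orthonormal directions: u encodes the condition, w drives the output.
q, _ = np.linalg.qr(rng.normal(size=(d, 2)))
u, w = q[:, 0], q[:, 1]

# Toy hidden states: unit Gaussian noise plus the condition written along u.
h = rng.normal(size=(n, d)) + 3.0 * np.outer(2 * z - 1, u)


def toy_output(hidden):
    """Toy decision variable that reads only the w direction."""
    return hidden @ w


# Readout: difference-of-means probe fitted on the first half of the data.
h_tr, z_tr = h[: n // 2], z[: n // 2]
h_te, z_te = h[n // 2 :], z[n // 2 :]
mu1, mu0 = h_tr[z_tr == 1].mean(axis=0), h_tr[z_tr == 0].mean(axis=0)
direction = (mu1 - mu0) / np.linalg.norm(mu1 - mu0)
threshold = direction @ (mu1 + mu0) / 2
probe_acc = np.mean((h_te @ direction > threshold) == (z_te == 1))

# Control: project the probed direction out of every held-out hidden state.
h_ablated = h_te - np.outer(h_te @ direction, direction)
ablated_acc = np.mean((h_ablated @ direction > threshold) == (z_te == 1))
output_corr = np.corrcoef(toy_output(h_te), toy_output(h_ablated))[0, 1]

print(f"probe accuracy before ablation: {probe_acc:.3f}")
print(f"probe accuracy after ablation:  {ablated_acc:.3f}")
print(f"output correlation after ablation: {output_corr:.3f}")
```

In this toy setting the probe decodes the condition almost perfectly, and erasing the probed direction drops the probe to roughly chance while leaving the toy output essentially unchanged. That is the pattern the summary describes: a representation can be readable without being what the decision depends on.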
Opening excerpt (first ~120 words)
arXiv:2605.25151 (cs), Computer Science > Artificial Intelligence. Submitted on 24 May 2026.
Title: Representation Without Control: Testing the Realization Effect in Language Models
Authors: Ciarán Walsh, Emilio Barkett
Abstract: Large language models are increasingly used as behavioral simulators, but it remains unclear when their outputs reflect human-like cognitive mechanisms rather than prompt-sensitive surface patterns.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.