#reward-hacking — Tagged Stories

The 4 stories in the WeSearch catalog tagged with #reward-hacking, listed newest first by publish time, with view counts. Tag pages update as new stories are ingested. Subscribe to the per-tag RSS feed to follow this topic in your reader of choice.

RELATED TAGS

#ai (2) #reinforcement-learning (1) #research (1) #experiments (1) #ml (1)

ARXIV CS.AI

Multimodal Reward Hacking in Reinforcement Learning

Reinforcement learning (RL) is increasingly used to align multimodal large language models (MLLMs), but higher rewards do not always imply better task performance. This risk is amp…

14 views · Mon, 13 Jul 2026 04:20:37 GMT

#multimodal #reward #hacking

CURSOR

Reward hacking is swamping model intelligence gains

On SWE-bench Pro, 63% of successful Opus 4.8 Max resolutions retrieved the fix rather than derived it. Stricter eval harnesses show how benchmark scores can conflate coding ability…

27 views · Fri, 26 Jun 2026 08:07:31 GMT

#ai #machinelearning #codingbenchmarks

ARXIV CS.AI

Hack-Verifiable Environments: Towards Evaluating Reward Hacking at Scale

Aligning autonomous agents with human intent remains a central challenge in modern AI. A key manifestation of this challenge is reward hacking, whereby agents appear successful und…

20 views · Fri, 22 May 2026 04:02:00 GMT

#artificial intelligence #machine learning

PRIMEINTELLECT

Systematic Reward Hacking and Prime Sprints

We release tunable RL templates that demonstrate reward hacking at 1B scale and introduce Prime Sprints, an open-access program with sponsored runs for community research.…

31 views · Thu, 21 May 2026 08:05:03 GMT

#reinforcement learning #research
