Will It Mythos?
The article examines whether Mythos, a tool for finding security bugs, is as effective as claimed, and how it compares to other models. The author built a benchmark suite to measure how well different models identify bugs. The results show that, although some models performed better than others, all of them performed worse than expected at finding the bugs.
- Mythos is billed as a powerful tool for finding security bugs, but the author questions whether it is as effective as claimed.
- The author built a benchmark suite to compare how well different models identify bugs.
- In the benchmark, every model performed worse than expected at finding the bugs, though some did better than others.
Opening excerpt (first ~120 words)
May 30, 2026
OK, so Mythos finds really challenging security bugs, right? That’s why it’s cordoned off from the hoi polloi, to protect the world from such a powerful finder of exploits.
I am skeptical of the reasons given publicly, I suspect it’s really just so much more expensive to operate than their current models that they don’t want to offer it broadly, yet, given the difficulty they’ve had growing capacity to keep up with use. But, are they telling the truth about how good it is at finding security vulnerabilities or is it just more hype?
A while back, I built a tool to automate bug hunting in my own projects called Nelson, and I’d already noticed there are surprising differences in the various models and how effectively they identify bugs. But, I wanted hard numbers.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at "I've done some things."