
Will It Mythos?

I've done some things · 10 min read
#security #ai #benchmarking
⚡ TL;DR · AI summary

The article discusses the effectiveness of Mythos, a tool for finding security bugs, and its comparison to other models. The author created a benchmark suite to test the abilities of different models in identifying bugs. The results show that while some models performed better than others, all of them did worse than expected in finding the bugs.

Original article

May 30, 2026

OK, so Mythos finds really challenging security bugs, right? That’s why it’s cordoned off from the hoi polloi, to protect the world from such a powerful finder of exploits.

I am skeptical of the reasons given publicly. I suspect it’s really just so much more expensive to operate than their current models that they don’t want to offer it broadly yet, given the difficulty they’ve had growing capacity to keep up with use. But are they telling the truth about how good it is at finding security vulnerabilities, or is it just more hype?

A while back, I built a tool called Nelson to automate bug hunting in my own projects, and I’d already noticed surprising differences in how effectively the various models identify bugs. But I wanted hard numbers.

Excerpt limited to ~120 words for fair-use compliance. The full article is at I've done some things.
