Which LLM is the best at finding real vulnerabilities?

Jeremie A <lp1> · 4 min read
#technology #security #artificial intelligence
⚡ TL;DR · AI summary

A recent evaluation tested several LLMs on their ability to find vulnerabilities in code. Models were scored on accuracy and report quality, with GPT-OSS and Gemma performing particularly well. The results exposed each model's strengths and weaknesses, especially in precision and in how often a model reported the same vulnerability more than once.
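
To make the precision and duplicate criteria concrete, here is a minimal Python sketch of how a model's report could be scored against a checklist of known vulnerabilities. It is not the article's scoring method: the function name `score_report`, the vulnerability labels, and the idea of matching findings by a plain string ID are all assumptions made for illustration.

```python
from collections import Counter


def score_report(expected, reported):
    """Score one report against a ground-truth checklist (hypothetical helper).

    expected: set of vulnerability IDs that should be found.
    reported: list of IDs the model reported, possibly with repeats.
    """
    counts = Counter(reported)
    unique = set(counts)
    true_positives = unique & expected
    # A finding reported n times contributes n - 1 duplicates.
    duplicates = sum(n - 1 for n in counts.values())
    precision = len(true_positives) / len(unique) if unique else 0.0
    recall = len(true_positives) / len(expected) if expected else 0.0
    return {"precision": precision, "recall": recall, "duplicates": duplicates}


# Illustrative data only; these labels are not taken from the article.
expected = {"sqli", "xss", "idor", "csrf"}
reported = ["sqli", "xss", "xss", "open-redirect"]
result = score_report(expected, reported)
print(result)
```

Counting duplicates separately from precision keeps a model that repeats one correct finding from looking either better or worse on accuracy than it really is.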

Original article
Medium · Jeremie A <lp1>
Opening excerpt (first ~120 words)

Which LLM is the best at finding real vulnerabilities (Part 1)?

A few weeks ago, I built a framework that lets me automatically decompile apps and binaries and audit their code. I used it to find 500 actual vulns in public apps (that I'm not even sure what to do with), and now I'm using this toolset to try to find the most cost-effective LLM for vulnerability research.

I was teaching a class in Paris when I created this exercise (https://github.com/lp1dev/Mybank_WebSec_Exercise/). The assignment is simple: run and audit the application, write a penetration testing report, and send it to me!

The app has a list of 13 vulnerabilities that must absolutely be reported; they are the ones that should (in…

Excerpt limited to ~120 words for fair-use compliance. The full article is at Medium.
