Meanwhile, on GitHub, Claude Code has over 5k bug reports currently open.
LLMs generate the 0-days, then LLMs remove the 0-days. They will never run out of work!
Makes sense. Trained on software engineers working that pattern for decades.
Some good debunking here: https://www.flyingpenguin.com/the-boy-that-cried-mythos-verification-is-collapsing-trust-in-anthropic/
Ironically, the blog post sounds like it was written by AI.
LOL, Anthropic does rhyme with Titanic. cOiNcIdEnCe?
I got distracted and started searching for pictures of Hawaii beaches.
This fluff piece has quite the pie-in-the-sky attitude toward the blue-teaming applications of AI.
Some commentators predict that future AI models will unearth entirely new forms of vulnerabilities that defy our current comprehension, but we don’t think so.
How reassuring.
The defects are finite, and we are entering a world where we can finally find them all.
Could've said the same thing when enterprise anti-malware came onto the scene decades ago, but the reality was it was just another vector for the arms race between the red team and the blue team. The author seems to put a lot of stock in the whole "the blue team has access to these AI tools that the red team doesn't currently have access to" argument, which kinda ignores the fact that that reality is simply not going to last.
I could be wrong, but any article suggesting "zero-days are numbered" doesn't pass the smell test.
I agree that 0-days aren't numbered. There are so many layers on which tech can be exploited that this is a difficult claim to make.
On the other hand, there are two different kinds of exploits: clear holes in the logic, a situation or code path not considered by the coder, and the much harder-to-catch, extremely creative ways to make a program do things it was never designed to do.
I have never seen LLMs do creative things, so I doubt they would catch the second category. But sure, they can be helpful for catching some logic holes.
The author seems to put a lot of stock in the whole “the blue team has access to these AI tools that the red team doesn’t currently have access to” argument
I didn't read it like that. I think the point was that the red team had an edge over the blue team (by being able to spend a lot of effort on a single exploit), so when both teams have access to these same tools, it'll be more of an equal fight.
Perhaps I misunderstood the author's intent. Though even if their position is that the red team and blue team will be on a more even playing field when both have access to AI tools, I'm not sure I can agree with that assessment. The asymmetrical nature of offense and defense isn't fundamentally changed by the advent of AI tools. While the current slate of AI tools may be uniquely more useful for finding and patching bugs, I can't imagine a future in which AI tools aren't also being tailored for exploiting and penetrating. The red team isn't just going to sit around and not adapt the available toolset to favor their use cases as well.
Much like the arms race between anti-virus development and virus development, there will be defensive AI development and offensive AI development. Similar to what we've already seen with the arms race between LLMs and software that can detect if something was written by an LLM.
I could be wrong, but any article suggesting “zero-days are numbered” doesn’t pass the smell test.
Yeah, you're right.
The real story is that it is a bit better at finding bugs. Calling them zero-days and implying there's some major security implications is just to build hype.
It was able to chain a few of the bugs together to create an RCE exploit in a weakened browser. It's interesting, but don't head to your fallout shelter just yet.
Defenders finally have a chance to win, decisively
I'm curious how this will turn out in the long term. Are we going to have safer software? Not only will defenders have a powerful tool, but attackers will too. But at the same time, the number of bugs is finite... Can we, in theory, one day achieve literally zero bugs in a codebase?
It does seem advantageous to the defender.
Another factor Mozilla didn’t mention (and that Anthropic wouldn’t like to emphasize) is that major LLMs are pretty similar. And their development is way more conservative than you’d think. They use similar architectures and formats, train from the same data, distill each other, further pollute the internet with the same output and so on. So if (for example) Mozilla red teams with Mythos, I’d posit it’s likely that attacker LLMs would find the same already-patched bugs, instead of something new.
…So yeah. I’d wager Mozilla’s sentiment is correct.
Add to that that AI is pretty good at copying from pre-existing knowledge (like a database of known vulnerabilities) and not good at generating novel ideas (like discovering a new vulnerability), and the scales are further tilted in the defenders' favor.
Eh, I don’t totally agree. AI can discover novel exploits that aren’t already in some database, and likely has in this case.
I’m just saying the operating patterns between different LLMs are more similar than you’d expect, like similar tools from the same factory.
You can achieve zero bugs through liberal use of rm.
Some LLMs will agree with you
You can achieve the same effect with a hammer
Are we going to have safer software? Because not only defenders will have a powerful tool, but attackers too.
Probably not safer software, but the window of time for a bug being known and exploitable will be shortened greatly. Instead of 0-days, we might have 0-minutes.
That's assuming deployments of these ridiculous AI systems roll out that fast, so maybe that idea's nonsense.
Cyber security in general is going to get interesting. Breaking into protected systems often requires more patience than expertise. Attackers often get detected when they take shortcuts out of laziness and overconfidence. AI agents have unfathomable patience and attention to detail.
AI will be good at scanning for known vulnerabilities, but patience and attention to detail? Not in my experience. I use coding agents for work, and they are getting better, but they still regularly get stuck in a loop: hitting a bug when running tests, attempting to fix it in a stupid way, still erroring, trying another stupid fix, trying the first stupid fix again, and so on until a human intervenes. They may be patient (as long as you pay for more tokens), but they aren't using their time wisely.
AI tends to use the "throw shit at the wall and see what sticks" approach. It's getting better at writing maintainable code, but it still will generate more-or-less spaghetti code with random unused or deprecated variables, crazy unnecessary functions, poor organization, etc... and requires lots of testing before producing something functional. Which is fine in an environment where you can iterate and clean things up. But as an attack vector, if you need 58 attempts to fully realize a vulnerability, in most secure environments you're going to get detected and blocked before you finish.
I don't disagree on the current state. However, it's not hard to foresee that attack tools will be developed that can maintain "attention" on an attack for days or weeks at a time with privately run agents. I'm sure they are out there already to some degree.
I don't really agree with the attention to detail part from my experience. AI agents love to take shortcuts from what I've seen, and you have to pay a lot of attention to what they're doing to make sure they do the right thing.
They have attention to detail, just not the right details. It’s super easy for them to get lost in a never ending train of tangents.
It is theoretically possible by using formal verification, which is getting easier thanks to Lean. But it's still impractical.
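For a sense of what formal verification means here, a toy Lean 4 sketch (the theorem name is illustrative): proving that an index reduced with `% 10` is always below 10, the kind of machine-checked guarantee that rules out an entire class of out-of-bounds bugs rather than hunting for individual instances.

```lean
-- Toy sketch: any `n % 10` is provably < 10, so using it to
-- address a 10-element buffer can never go out of bounds.
theorem mod_index_in_bounds (n : Nat) : n % 10 < 10 :=
  Nat.mod_lt n (by decide)  -- `decide` discharges the side goal 10 > 0
```

Scaling proofs like this from toy lemmas to real codebases is exactly the "still impractical" part.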
Not zero bugs, but it should help. A benefit for defenders is that they can run AI review on code before they make it public or ship it in a stable release.
We’ve led the industry in building and adopting Rust
Yeah, then you fired the team to pay the CEO a few million more.
How many vulnerabilities would’ve been found if we had spent several million dollars on human security researchers though?
That doesn't make sense. Don't the attackers have the same tools?
Not right now, that's the whole thing.
Mythos Preview is better at finding real vulnerabilities than existing public models and, for now, only a few have access to it.
I'm aware (unfortunately) of the marketing claims, and even if they might be true, as you say it's only "for now". So if the advantage in that arms race is only temporary, especially when held by a company that leaked its own code just days ago, then I have a hard time seeing why "zero-days are numbered", because that title claims the dynamic itself is gone. That's not my understanding, especially if other models are only marginally worse (which is hard to prove, given how difficult it is to find proper metrics for models).
See the comment that shared https://techcrunch.com/2026/04/21/unauthorized-group-has-gained-access-to-anthropics-exclusive-cyber-tool-mythos-report-claims just a few hours ago, and that's not even sophisticated.
Anthropic and OpenAI have used this arms-race rhetoric multiple times before, and it worked. Their models are supposedly "too dangerous" to be released, and thus consequently they have to control access.
It might be true, but so far what we have witnessed is that roughly equivalent models get released by others merely weeks or maybe months later, sometimes open, and the "moat" never lasted long. So I'm questioning why it would be different this time.
Actually untrue. The only thing Mythos added was an automatic way to exploit vulns that other models also find. I read a good article on Mastodon about it. I posted it elsewhere in the thread, but also here: https://www.flyingpenguin.com/the-boy-that-cried-mythos-verification-is-collapsing-trust-in-anthropic/
for now
bro 3 hours wtf