this post was submitted on 20 Feb 2026
302 points (98.1% liked)

Technology

[–] MagicShel@lemmy.zip 68 points 7 hours ago* (last edited 7 hours ago) (5 children)

At work today we had a little presentation about Claude Cowork. And I learned someone used it to write a C (maybe C++?) compiler in Rust in two weeks at a cost of $20k and it passed 99% of whatever hell test suite they use for evaluating compilers. And I had a few thoughts.

  • 99% pass rate? Maybe that's super impressive because it's a stress test, but if 1% of my code fails to compile I think I'd be in deep shit.
  • $20k in two weeks is a heavy burn. Imagine if what it wrote was... garbage.
  • "Write a compiler" is a complete project plan in three words. Find a business project that is that simple and I'll show you software that is cheaper to buy than build. We are currently working on an authentication broker service at work and we've been doing architecture and trying to get everyone to agree on a design for 2 months. There are thousands of words devoted to just the high level stuff, plus complex flow diagrams.
  • A compiler might be somewhat unique in the sense that there are literally thousands of test cases available: download a FOSS project and try to compile it. If it fails, figure out the bug and fix it. Repeat. The ERP that your boss wants you to stand up in a month has zero test coverage and is going to be chock full of bugs, if for no other reason than that you haven't thought through every single edge case and neither has the AI, because lots of times those are business questions.
  • There is not a single person who knows the code base well enough to troubleshoot any weird bugs and transient errors.

I think this is a cool thing in the abstract. But in reality, they cherry picked the best possible use case in the world and anyone expecting their custom project is going to go like this will be lighting huge piles of money on fire.
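The "compile a FOSS project, note what fails, fix, repeat" loop from the bullet list above can be sketched roughly like this. This is only an illustration: the function names (`compiles`, `triage`, `pass_rate`) are made up for the example, and it assumes the compiler under test accepts GCC-style `-c` and `-o` flags, which may not be true of any particular compiler.

```python
import os
import subprocess
from typing import Callable, Iterable, List, Tuple


def compiles(compiler: str, source: str) -> bool:
    """Return True if the compiler exits with status 0 on one source file.

    Assumption: the compiler takes GCC-style `-c` (compile only) and
    `-o` (output path) flags. The object file is discarded.
    """
    result = subprocess.run(
        [compiler, "-c", source, "-o", os.devnull],
        capture_output=True,
    )
    return result.returncode == 0


def triage(
    sources: Iterable[str], check: Callable[[str], bool]
) -> Tuple[List[str], List[str]]:
    """Split sources into (passed, failed) according to `check`."""
    passed: List[str] = []
    failed: List[str] = []
    for src in sources:
        (passed if check(src) else failed).append(src)
    return passed, failed


def pass_rate(passed: List[str], failed: List[str]) -> float:
    """Fraction of sources that passed; 0.0 for an empty corpus."""
    total = len(passed) + len(failed)
    return len(passed) / total if total else 0.0
```

In use, you would call something like `triage(files, lambda f: compiles("cc", f))` on a project's `.c` files, fix whatever shows up in `failed`, and rerun. The point of the bullet stands: this loop only exists because the corpus and the pass/fail signal come for free, which is not the case for a typical business project.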

[–] Evotech@lemmy.world 3 points 1 hour ago

I also often get assigned projects where all the tests are written out beforehand and I can look at an existing implementation while I work…

[–] slaacaa@lemmy.world 7 points 2 hours ago* (last edited 55 minutes ago)

Also, software development is already the best possible use case for LLMs: you need to build something abiding by a set of rules (as in a literal language, lmao), and you can immediately test if it works.

In e.g. a legal use case instead, you can jerk off to the confident sounding text you generated, then you get chewed out by the judge for having hallucinated references. Even if you have a set of rules (laws) as guardrails, you cannot immediately test what the AI generated - and if an expert needs to read and check everything in detail, then why not just do it themselves in the same amount of time?

We can go on to business, where the rules the AI can work inside are much looser, or healthcare, where the cost of failure is extremely high. And we are not even talking about responsibilities, official accountability for decisions.

I just don’t think what is claimed for AI is there. Maybe it will be, but I don’t see it as an organic continuation of the path we’re on. We might have another dot com bust when investors realize this - LLMs are here to stay (same as the internet), but they will not become AGI.

[–] MBM@lemmings.world 6 points 3 hours ago

Don't forget that there are tons of C compilers in the dataset already

[–] grue@lemmy.world 13 points 5 hours ago

A C compiler in two weeks is a difficult, but doable, grad school class project (especially if you use lex and yacc instead of hand-coding the parser). And I guarantee 80 hours of grad student time costs less than $20k.

Frankly, I'm not impressed with the presentation in your anecdote at all.
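For scale, here is the break-even arithmetic behind that comparison, taking the $20k and 80 hours figures from the comments above as given (neither is independently verified here):

```python
# Figures as quoted in the thread, not independently verified.
budget_usd = 20_000   # reported cost of the two-week run
student_hours = 80    # two weeks of full-time work, per the comment

# Hourly rate at which 80 hours of human time would cost the same.
break_even_rate = budget_usd / student_hours
print(break_even_rate)  # 250.0
```

So the human option only costs more if the person is billed above $250 an hour.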

[–] pulsewidth@lemmy.world 26 points 7 hours ago (1 children)

Agree with all points. Additionally, C and C++ are incredibly well specified via ISO standards etc., and there are multiple open source compiler codebases available, e.g. GCC, which is available in multiple builds and implementations for different versions of C and C++, and DQNEO/cc.go.

So there are many fully-functional and complete sources that Claude Cowork would have pulled routines and code from.

[–] xep@discuss.online 11 points 5 hours ago* (last edited 5 hours ago) (1 children)

The vibe coded compiler is likely unmaintainable, so it can't be updated when the spec changes even assuming it did work and was real. So you'd have to redo the entire thing. It's silly.

[–] exu@feditown.com 3 points 2 hours ago

Updates? You just vibecode a new compiler that follows the new spec