this post was submitted on 20 Feb 2026

648 points (97.9% liked)

Jack Dorsey's New Company Falling Apart as It Forces Employees to Use AI (futurism.com)

submitted 3 weeks ago by throws_lemy@lemmy.nz to c/technology@lemmy.world

[–] MagicShel@lemmy.zip 128 points 3 weeks ago* (last edited 3 weeks ago) (10 children)

At work today we had a little presentation about Claude Cowork. And I learned someone used it to write a C (maybe C++?) compiler in Rust in two weeks at a cost of $20k and it passed 99% of whatever hell test suite they use for evaluating compilers. And I had a few thoughts.

99% pass rate? Maybe that's super impressive because it's a stress test, but if 1% of my code fails to compile I think I'd be in deep shit.
$20k in two weeks is a heavy burn. Imagine if what it wrote was... garbage.
"Write a compiler" is a complete project plan in three words. Find a business project that is that simple and I'll show you software that is cheaper to buy than build. We are currently working on an authentication broker service at work and we've been doing architecture and trying to get everyone to agree on a design for 2 months. There are thousands of words devoted to just the high level stuff, plus complex flow diagrams.
A compiler might be somewhat unique in the sense that there are literally thousands of test cases available - download a foss project and try to compile it. If it fails, figure out the bug and fix it. Repeat. The ERP that your boss wants you to stand up in a month has zero test coverage and is going to be chock full of bugs — if for no other reason than you haven't thought through every single edge case and neither has the AI because lots of times those are business questions.
There is not a single person who knows the code base well enough to troubleshoot any weird bugs and transient errors.

I think this is a cool thing in the abstract. But in reality, they cherry picked the best possible use case in the world and anyone expecting their custom project is going to go like this will be lighting huge piles of money on fire.

[–] pulsewidth@lemmy.world 37 points 3 weeks ago (1 children)

Agree with all points. Additionally, the languages involved are incredibly well specified via ISO standards etc., and multiple open source compiler codebases are available, e.g. GCC, which is available in multiple builds and implementations for different versions of C and C++, and DQNEO/cc.go.

So there are many fully-functional and complete sources that Claude Cowork would have pulled routines and code from.

[–] xep@discuss.online 22 points 3 weeks ago* (last edited 3 weeks ago) (2 children)

The vibe coded compiler is likely unmaintainable, so it can't be updated when the spec changes even assuming it did work and was real. So you'd have to redo the entire thing. It's silly.

[–] exu@feditown.com 12 points 3 weeks ago (1 children)

Updates? You just vibecode a new compiler that follows the new spec

[–] MagicShel@lemmy.zip 5 points 3 weeks ago (1 children)

"I want to add a command line option that auto generates helloworld.exe"

"That'll be $21,000."

[–] teslekova@sh.itjust.works 5 points 3 weeks ago

Ah, that's the problem, we've been getting all these chatbots to generate "hellworld.exe".

[–] killabeezio@lemmy.world 1 points 3 weeks ago

Nah bro. Just tell the agent to be a super duper distinguished software developer and write no bugs and keep the code maintainable /s

[–] grue@lemmy.world 28 points 3 weeks ago (1 children)

A C compiler in two weeks is a difficult, but doable, grad school class project (especially if you use lex and yacc instead of hand-coding the parser). And I guarantee 80 hours of grad student time costs less than $20k.

Frankly, I'm not impressed with the presentation in your anecdote at all.
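
For what it's worth, the break-even arithmetic behind that is easy to check. A minimal sketch, assuming "80 hours" means two 40-hour weeks; the function name is made up for illustration.

```python
def break_even_hourly_rate(total_cost: float, hours: float) -> float:
    """Hourly rate at which human labor costs the same as the given total."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return total_cost / hours


# Figures from the comment above: $20k, and 80 hours of grad student time.
rate = break_even_hourly_rate(20_000, 80)
print(f"Break-even rate: ${rate:.2f}/hour")  # prints "Break-even rate: $250.00/hour"
```

So the grad student only loses on price if their time is billed above that rate.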

[–] MagicShel@lemmy.zip 6 points 3 weeks ago (1 children)

Here is the original cite that my company pulled that from if you want more details.

I've never written a compiler, nor in Rust, so I have no idea the effort involved. I'm just boggling over the price tag. I'll bet that's the cost of an entire offshore team.

[–] sukhmel@programming.dev 4 points 3 weeks ago

Yeah, the thing also has limited scope and requires some meddling to point it to the necessary includes, as evidenced by the first issue, afair. And the code produced is subpar, I heard.

[–] Aceticon@lemmy.dbzer0.com 22 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

It's even simpler than that: using an LLM to write a C compiler is the same as downloading an existing open source implementation of a C compiler from the Internet, but with extra steps, as the LLM was actually fed with that code and is just re-assembling it back together but with extra bugs - plagiarism hidden behind an automated text parrot interface.

A human can beat the LLM at that by simply finding and downloading an implementation of that more-than-solved problem from the Internet, which at worst will take maybe 1h.

The LLM can "solve" simple and well defined problems because it's basically plagiarizing existing code that solves those problems.

[–] MagicShel@lemmy.zip 11 points 3 weeks ago (1 children)

Hey, so I started this comment to disagree with you and correct some common misunderstandings that I've been fighting against for years. Instead, as I was formulating my response, I realized you're substantially right and I've been wrong — or at least my thinking was incomplete. I figured I'd mention because the common perception is arguing with strangers on the internet never accomplishes anything.

LLMs are not fundamentally the plagiarism machines everyone claims they are. If a model reproduces any substantial text verbatim, it's because the LLM is overtrained on too small of a data set and the solution is, somewhat paradoxically, to feed it more relevant text. That has been the crux of my argument for years.

That being said, what Anthropic and OpenAI offer isn't just an LLM. Their models are backed by RAG pipelines, which insert verbatim text into the context when it is relevant to the task at hand. And that fact had been escaping my consideration until now. Thank you.
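
To make the RAG point concrete, here is a toy sketch of retrieval-augmented prompting: the most relevant stored text is pasted verbatim into the context ahead of the task. Everything in it is an assumption for illustration (the function names, the word-overlap scoring, the two-entry corpus); real pipelines typically use embeddings and a vector store, and this says nothing about what any vendor actually runs.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word set; a crude stand-in for a real embedding model."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Return the k corpus entries sharing the most words with the query."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda doc: len(q & tokenize(doc)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, corpus: list[str], k: int = 1) -> str:
    """Paste the retrieved text verbatim into the context, then add the task."""
    context = "\n".join(retrieve(query, corpus, k))
    return f"Context:\n{context}\n\nTask: {query}"


# Hypothetical toy corpus, not real training or retrieval data.
corpus = [
    "int parse_expr(Lexer *lx) { /* recursive descent parser */ }",
    "def login(user, password): return check_hash(user, password)",
]
prompt = build_prompt("write a recursive descent parser for C expressions", corpus)
print(prompt)
```

The parser snippet ends up in the prompt character for character, which is the part that matters for the plagiarism argument.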

[–] Aceticon@lemmy.dbzer0.com 4 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

Even the LLM part might be considered Plagiarism.

Basically, unlike humans, it cannot assemble an output based on logical principles (i.e. assemble a logical model of the flows in a piece of code and then translate it to code), it can only produce text based on an N-space of probabilities derived from the works of others it has "read" (i.e. fed to it during training).

That text assembling could be the machine equivalent of Inspiration (such as how most programmers will include elements they've seen from others in their code) but it could also be Plagiarism.

Ultimately it boils down to where the boundary between Inspiration and Plagiarism stands.

As I see it, if for specific tasks there is overwhelming dominance of trained weights from a handful of works (which, one would expect, would probably be the case for a C-compiler coded in Rust), then that's a lot more towards the Plagiarism side than the Inspiration side.

Granted, it's not the verbatim copying of an entire codebase that would legally be deemed Plagiarism, but if it's almost entirely a montage made up of pieces from a handful of codebases, could it not be considered a variant of Plagiarism that is incredibly hard for humans to pull off but not so for an automated system?

Note that obviously the LLM has no "intention to copy", since it has no will or cognition at all. What I'm saying is that the people who made it have intentionally built an automated system that copies elements of existing works. Normally it assembles its results from very small textual elements (same as a person who has learned how letters and words work can create a unique work from letters and words). But they built it with the awareness that in some situations it can produce output based on so few sources that, even though it's assembling the output token by token, it's pretty much just copying whole blocks from those sources, same as a human manually copying text from one document to another would.

In summary, IMHO LLMs don't always plagiarize, but can sometimes do it when the number of sources that ended up creating the volume of the N-dimensional probabilistic space the LLM is following for that output is very low.
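
One crude way to probe that "very few sources" case is to measure how many of the output's word n-grams show up verbatim in a candidate source. This is only a sketch under my own assumptions: the function names and the toy strings are made up, and a high score shows copying of surface text, nothing about intent or legality.

```python
def ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    """All contiguous runs of n tokens, as a set."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def verbatim_overlap(output: str, source: str, n: int = 3) -> float:
    """Fraction of the output's word n-grams that also occur in the source."""
    out = ngrams(output.split(), n)
    if not out:
        return 0.0
    return len(out & ngrams(source.split(), n)) / len(out)


# Hypothetical strings for illustration only.
source = "while ( tok != EOF ) { tok = next_token ( lx ) ; }"
copied = "while ( tok != EOF ) { tok = next_token ( lx ) ; }"
reworded = "loop until the lexer reports end of input"

print(verbatim_overlap(copied, source))    # 1.0
print(verbatim_overlap(reworded, source))  # 0.0
```

Where the line between Inspiration and Plagiarism sits on that 0-to-1 scale is exactly the open question.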

[–] MagicShel@lemmy.zip 6 points 3 weeks ago

I agree with you on a technical level. I still think LLMs are transformative of the original text and if

when the number of sources that's what ultimately created the volume of the N-dimensional probabilistic space they're following is very low.

then the solution is to feed it even more relevant data. But I appreciate your perspective. I still disagree, but I respect your point of view.

I'll give what you've written some more thought and maybe respond in greater depth later but I'm getting pulled away. Just wanted to say thanks for the detailed and thorough response.

[–] SMillerNL@piefed.social 17 points 3 weeks ago (2 children)

https://harshanu.space/en/tech/ccc-vs-gcc/ has a good overview of how bad it really is

[–] MagicShel@lemmy.zip 6 points 3 weeks ago* (last edited 3 weeks ago)

Thank you. Great addition. That was a very interesting read, though I need to be more awake for reading technical writing like that 🥱.

My point about spending $20k to produce garbage, then, was actually realized in this "perfect" use case.

[–] winkerjadams@lemmy.dbzer0.com 5 points 3 weeks ago

That was interesting to read, if a bit jargon heavy. Thanks for sharing

[–] slaacaa@lemmy.world 13 points 3 weeks ago* (last edited 3 weeks ago)

Also, software development is already the best possible use case for LLMs: you need to build something abiding by a set of rules (as in a literal language, lmao), and you can immediately test if it works.

In e.g. a legal use case instead, you can jerk off to the confident sounding text you generated, then you get chewed out by the judge for having hallucinated references. Even if you have a set of rules (laws) as guardrails, you cannot immediately test what the AI generated - and if an expert needs to read and check everything in detail, then why not just do it themselves in the same amount of time.

We can go on to business, where the rules the AI can work inside are much looser, or healthcare, where the cost of failure is extremely high. And we are not even talking about responsibilities, official accountability for decisions.

I just don’t think what is claimed for AI is there. Maybe it will be, but I don’t see it as an organic continuation of the path we’re on. We might have another dot-com bust when investors realize this - LLMs will be here to stay (same as the internet was), but they will not become AGI.

[–] Evotech@lemmy.world 10 points 3 weeks ago

I also often get assigned projects where all the tests are written out beforehand and I can look at an existing implementation while I work…

[–] MBM@lemmings.world 9 points 3 weeks ago

Don't forget that there are tons of C compilers in the dataset already

[–] AeonFelis@lemmy.world 6 points 3 weeks ago (1 children)

99% pass rate? Maybe that’s super impressive because it’s a stress test, but if 1% of my code fails to compile I think I’d be in deep shit.

Also - one of the main arguments of vibe coding advocates is that you just need to check the result several times and tell the AI assistant what needs fixing. Isn't a compiler test suite ideal for such a workflow? Why couldn't they just feed the test failures back to the model and tell it to fix them, iterating again and again until they get it to work 100%?

[–] dev_null@lemmy.ml 7 points 3 weeks ago

Maybe they did, that's how they got to 99%. The remaining issues are so intricate/complex the LLM just can't solve them no matter how many test cases you give it.
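
The feed-failures-back loop and the plateau can be sketched like this. It's a toy under stated assumptions: `attempt_fix` is a hypothetical stand-in for "ask the model to fix this test", and the 100 test names with one unfixable case are invented to mirror the 99% figure, not taken from the real project.

```python
from typing import Callable


def iterate_until_plateau(
    failing: set[str],
    attempt_fix: Callable[[str], bool],
    max_rounds: int = 10,
) -> tuple[set[str], int]:
    """Retry every failing test each round; stop when a round fixes nothing.

    Returns the tests still failing and the number of rounds run,
    including the final round that made no progress.
    """
    rounds = 0
    while failing and rounds < max_rounds:
        rounds += 1
        fixed = {test for test in failing if attempt_fix(test)}
        if not fixed:
            break  # plateau: more iterations would not help
        failing = failing - fixed
    return failing, rounds


# Hypothetical suite of 100 tests where one is beyond the "model".
all_tests = {f"test_{i}" for i in range(100)}
remaining, rounds = iterate_until_plateau(all_tests, lambda t: t != "test_99")
pass_rate = 1 - len(remaining) / len(all_tests)
print(f"{pass_rate:.0%} passing after {rounds} rounds")  # prints "99% passing after 2 rounds"
```

Once a round fixes nothing, extra iterations only add cost.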

[–] pipe01@programming.dev 6 points 3 weeks ago (2 children)

I would be interested in knowing what language it was for sure, as there is a huge difference between a C and a C++ compiler in terms of complexity

[–] HereIAm@lemmy.world 6 points 3 weeks ago

I think this is the reported project: https://github.com/anthropics/claudes-c-compiler

And here's a pretty good article about it https://arstechnica.com/ai/2026/02/sixteen-claude-ai-agents-working-together-created-a-new-c-compiler/

[–] MagicShel@lemmy.zip 3 points 3 weeks ago

I just posted where I found the source in another comment. It probably has the information you're interested in.

[–] CaptPretentious@lemmy.world 4 points 3 weeks ago (1 children)

I wanna make sure I got this right. They used $20,000 in fees in 2 weeks to make a compiler? Also, to what end? Like what's the expected ROI on that?

[–] MagicShel@lemmy.zip 6 points 3 weeks ago (1 children)

Well it's Anthropic, creators of Claude. It's a way to show off and convince people AI can do it. $20k is what it would cost you or me, but it's just free for them.

I don't even hate AI but it's kinda sickening the way they overstate the capabilities. But let me tell you how excited the top leadership at my company is about this...

[–] AeonFelis@lemmy.world 1 points 3 weeks ago (1 children)

$20k is what it would cost you or me, but it’s just free for them.

No it isn't. This is not regular software where the bulk of the price is the licensing. With slop-as-a-service, the bulk of the price is the data center operation cost - which Anthropic is certainly not getting for free.

[–] MagicShel@lemmy.zip 1 points 3 weeks ago (1 children)

I mean there is a cost associated with it, just like there is a cost associated with having free soda in the break room, but it was free for the person doing the project. It's absorbed into operational costs.

[–] AeonFelis@lemmy.world 1 points 3 weeks ago

Considering how these companies are losing money because they subsidize these tokens - I doubt that cost is really absorbed.