At work today we had a little presentation about Claude Cowork. And I learned someone used it to write a C (maybe C++?) compiler in Rust in two weeks at a cost of $20k and it passed 99% of whatever hell test suite they use for evaluating compilers. And I had a few thoughts.
- 99% pass rate? Maybe that's super impressive because it's a stress test, but if 1% of my code fails to compile I think I'd be in deep shit.
- $20k in two weeks is a heavy burn. Imagine if what it wrote was... garbage.
- "Write a compiler" is a complete project plan in three words. Find a business project that is that simple to specify and I'll show you software that is cheaper to buy than build. We are currently working on an authentication broker service at work, and we've spent two months on architecture and on getting everyone to agree on a design. There are thousands of words devoted to just the high-level stuff, plus complex flow diagrams.
- A compiler might be close to unique in that there are literally thousands of test cases available: download a FOSS project and try to compile it. If it fails, figure out the bug and fix it. Repeat. The ERP that your boss wants you to stand up in a month has zero test coverage and is going to be chock full of bugs, if for no other reason than that you haven't thought through every single edge case and neither has the AI, because a lot of the time those are business questions.
- Nobody knows the resulting code base well enough to troubleshoot weird bugs and transient errors.
I think this is a cool thing in the abstract. But in reality, they cherry-picked the best possible use case in the world, and anyone expecting their custom project to go like this will be lighting huge piles of money on fire.