this post was submitted on 19 Jul 2025

266 points (96.5% liked)

Exhausted man defeats AI model in world coding championship (arstechnica.com)

submitted 3 months ago by kebab@endlesstalk.org to c/nottheonion@lemmy.world

all 21 comments

[–] saltesc@lemmy.world 67 points 3 months ago* (last edited 3 months ago) (1 children)

> A Polish programmer running on fumes recently accomplished what may soon become impossible: beating an advanced AI model from OpenAI

No 🤣

So long as the protocol is predictive or algorithm based, it won't ever be better at the job.

Oml. As someone that can't wait for AI to advance well enough so I don't have to fucking code anymore, I'm so sick of this bullshit narrative.

The reality headline is "Barely functional expert outperforms extremely fast junior hire still in probation period."

We already have AI that can beat this guy, but it takes so much work that it's still years and years and years from being able to cover all the things we consider everyday tasks. Let alone the complexities of coding, which require the fundamentals of "intelligence", independent planning and decision making.

It's like that other recent news when media was flabbergasted that basic coding in a 1970s Atari chess game bested an LLM. "Yeah, no shit. It's an LLM. Do you even know AI is an acronym, or is it just synonymous with magic at this point?"

Common sense is as valuable as a computer science degree on this level.

Edit: OpenAI's route to doing what they say they can do is using what they can currently do to assist the work into actually getting there... BECAUSE IT'S ALWAYS BEEN A BIG FUCKING JOB. We don't have any massive breakthroughs, at best we've gotten shortcuts, but until we can physically overcome processing limitations, figure out quantum processing or similar, NO. Just, no. Ffs it takes climate-destroying levels of computing power to just be at this level of absolute shit we have rn.

I am SO sick of this general public narrative because they've just been given food, that's always been there, on a plate and have nfi about how the food got onto the plate. If they did, they'd realise there's no magic. It's the same it's always been but someone decided to improve it a little for the consumer.

/rant

[–] kautau@lemmy.world 22 points 3 months ago (1 children)

> so I don’t have to fucking code anymore

I too, am excited for the mines

[–] saltesc@lemmy.world 1 points 3 months ago

I expect by then, there's more to it than scraping off StackOverflow comments grounded in a "this is the best source, therefore the output is the best" fallacy.

The knowledge and logic components of intelligence would be nice-to-haves in the artificial version too.

[–] scintilla@lemmy.blahaj.zone 38 points 3 months ago* (last edited 3 months ago) (1 children)

Did I miss where they talk about how the AI "coder" worked? Because based on what I've heard from programmers it just would lie the first few times. Was a human allowed to fix mistakes that the AI made?

Seems like the model was specifically tuned for this maybe?

I feel like I'm missing information as to whether or not I should be impressed by the AI's performance.

[–] derpgon@programming.dev 7 points 3 months ago* (last edited 3 months ago)

From experience: Junie, an AI agent based on Sonnet 4, performs quite well. It can even write tests and fix them if they are failing.

Not saying the quality is great, but it's good enough to eventually work and to pass as junior code.

Not sure how good OpenAI's agent is, or whether they used their coding agent Codex, and if they did, was it as-is or with some tuning? Not sure, they write it was "custom agent based on o3".

They write that all the contestants have the same hardware, but did the agent run on the given machine, or in the cloud? A human brain is like 20-40W, so let's say the upper limit, given he has to move his hands. Did the AI agent get the same wattage? I don't think so.

[–] vane@lemmy.world 16 points 3 months ago* (last edited 3 months ago)

fucking articles these days, no link to nothing, just bunch of copy paste text hype from twitter
here's link to problem: https://img.atcoder.jp/awtf2025heuristic/en.pdf
here's link to live stream: https://www.youtube.com/watch?v=TG3ChQH61vE

edit: from stream 39:00, first solution by openai was after 15 minutes, first human solution was after 38 minutes and it was 2x slower than initial openai solution

edit2: from stream 7:20:12, winner is telling live that he slept 8-10h over last 72h, openai models are crap given he doesn't know what he's doing there, his code is crap and he doesn't know why it's working so well

[–] Apeman42@lemmy.world 11 points 3 months ago

Then he laid down his ~~hammer~~ keyboard and died.

[–] KingOfSleep@lemmy.ca 9 points 3 months ago (2 children)

The last human code master.

[–] lordnikon@lemmy.world 21 points 3 months ago

John Henry beat that infernal machine

[–] pastermil@sh.itjust.works 2 points 3 months ago

He's gonna save us all!

[–] Sekoia@lemmy.blahaj.zone 8 points 3 months ago

Having just read the problem, I'm curious how o3 solved it (and the human too tbh). My experience with LLMs says they'd be absolute complete crap at this, it's a very hard and open-ended problem. Intuitively I'd say it would just end up doing random changes trying to improve its score.

I think I could write the "trivial" solution but anything beyond seems... difficult. Congrats to the winner!

[–] yarr@feddit.nl 4 points 3 months ago

I don't read this as a win. One man finished in front of OpenAI and many, many, many finished behind OpenAI. If this is the future of coding, it's bleak indeed.

The top 1% of developers will probably be OK no matter what; it's the rest of the crowd, who aren't award-winning developers, that are probably in trouble.

[–] FaceDeer@fedia.io 4 points 3 months ago

Getting Kasparov v. Deep Blue vibes here.

[–] applebusch@lemmy.blahaj.zone 3 points 3 months ago (3 children)

How do you win at coding?

[–] FaceDeer@fedia.io 13 points 3 months ago (1 children)

Did you read the article? It says:

> The competition required contestants to solve a single complex optimization problem over 600 minutes.

[–] Witchfire@lemmy.world 17 points 3 months ago (1 children)

They had to optimize their own 5-year-old code

[–] saltesc@lemmy.world 6 points 3 months ago (1 children)

Omg. The cringe levels of looking at some awful code to realise it's mine from two years ago...

[–] XTL@sopuli.xyz 2 points 3 months ago

Good time to rewrite it in Rust.

[–] charade_you_are@sh.itjust.works 3 points 3 months ago

you code like you've never coded before

[–] not_IO@lemmy.blahaj.zone 1 points 3 months ago