this post was submitted on 09 Jun 2025

823 points (91.9% liked)

ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic (www.tomshardware.com)

submitted 4 months ago by Lifecoach5000@lemmy.world to c/technology@lemmy.world

[–] FMT99@lemmy.world 289 points 4 months ago (12 children)

Did the author think ChatGPT is in fact an AGI? It's a chatbot. Why would it be good at chess? It's like saying an Atari 2600 running a dedicated chess program can beat Google Maps at chess.

[–] spankmonkey@lemmy.world 230 points 4 months ago (15 children)

AI including ChatGPT is being marketed as super awesome at everything, which is why that and similar AI is being forced into absolutely everything and being sold as a replacement for people.

Something marketed as AGI should be treated as AGI when proving it isn't AGI.

[–] pelespirit@sh.itjust.works 14 points 4 months ago (11 children)

Not to help the AI companies, but why don't they program them to look up math programs and outsource chess to other programs when they're asked for that stuff? It's obvious they're shit at it, why do they answer anyway? It's because they're programmed by know-it-all programmers, isn't it.

[–] rebelsimile@sh.itjust.works 29 points 4 months ago (1 children)

Because they’re fucking terrible at designing tools to solve problems, they are obviously less and less good at pretending this is an omnitool that can do everything with perfect coherency (and if it isn’t working right it’s because you’re not believing or paying hard enough)

[–] MrJgyFly@lemmy.world 7 points 4 months ago

Or they keep telling you that you just have to wait it out. It’s going to get better and better!

[–] ImplyingImplications@lemmy.ca 26 points 4 months ago

why don't they program them

AI models aren't programmed traditionally. They're generated by machine learning. Essentially the model is given test prompts and then given a rating on its answer. The model's calculations will be adjusted so that its answer to the test prompt will be closer to the expected answer. You repeat this a few billion times with a few billion prompts and you will have generated a model that scores very high on all test prompts.

Then someone asks it how many R's are in strawberry and it gets the wrong answer. The only way to fix this is to add that as a test prompt and redo the machine learning process which takes an enormous amount of time and computational power each time it's done, only for people to once again quickly find some kind of prompt it doesn't answer well.

There are already AI models that play chess incredibly well. Using machine learning to solve a complex problem isn't the issue. It's trying to get one model to be good at absolutely everything.
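The adjust-and-repeat loop described above can be sketched in a few lines. This is a toy illustration only, assuming a hypothetical one-weight "model" and made-up numeric examples; real LLM training adjusts billions of parameters over text, not a single multiplier.

```python
# Toy stand-in for "adjust the model so its answer moves closer to the expected answer".
# The one-weight model and the example data are assumptions for illustration.

def train(examples, steps=200, learning_rate=0.01):
    weight = 0.0  # the model's only parameter
    for _ in range(steps):
        for prompt, expected in examples:
            answer = weight * prompt   # the model's current answer
            error = answer - expected  # how far off the answer is
            # Gradient of the squared error with respect to the weight is 2 * error * prompt;
            # step against it so the next answer lands closer to the expected one.
            weight -= learning_rate * 2 * error * prompt
    return weight

# Every example follows "expected = 2 * prompt", so the weight should approach 2.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
weight = train(examples)
print(round(weight, 3))
```

The model only gets good at what the examples cover, which is the limitation the comment is pointing at.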

[–] PixelatedSaturn@lemmy.world 7 points 4 months ago

...or a simple counter to count the r's in strawberry. Because that's more difficult than one might think, and they are starting to do this now.
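For comparison, the counter itself is trivial in ordinary code. The hard part is on the model's side, since an LLM works on tokens rather than individual letters. A minimal sketch, with `count_letter` as a hypothetical helper name:

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))  # 3
```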

[–] NobodyElse@sh.itjust.works 5 points 4 months ago

Because the LLMs are now being used to vibe code themselves.

[–] fmstrat@lemmy.nowsci.com 4 points 4 months ago

This is where MCP comes in. It's a protocol for LLMs to call standard tools. Basically the LLM would figure out the tool to use from the context, then figure out the order of parameters from those the MCP server says are available, send the JSON, and parse the response.
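A rough sketch of that round trip, assuming MCP's JSON-RPC 2.0 framing with a `tools/call` method as I understand the specification. The tool name `best_chess_move`, its `fen` argument, and the canned `fake_server` reply are all hypothetical stand-ins, not a real server.

```python
import json

def build_tool_call(request_id, tool_name, arguments):
    """Serialize a tool call as a JSON-RPC 2.0 request."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

def fake_server(raw_request):
    """Hypothetical stand-in for an MCP server: always answers with a canned move."""
    request = json.loads(raw_request)
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": "e2e4"}]},
    })

def parse_tool_result(raw_response):
    """Join the text parts of the tool result into one string."""
    response = json.loads(raw_response)
    parts = response["result"]["content"]
    return "".join(part["text"] for part in parts if part["type"] == "text")

start_position = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
request = build_tool_call(1, "best_chess_move", {"fen": start_position})
move = parse_tool_result(fake_server(request))
print(move)
```

In a real setup the LLM would pick the tool and fill in the arguments, and the chess engine behind the server would do the actual thinking.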

[–] four@lemmy.zip 4 points 4 months ago

I think they're trying to do that. But AI can still fail at that lol

[–] driving_crooner@lemmy.eco.br 4 points 4 months ago (1 children)

If you pay for ChatGPT you can connect it with Wolfram Alpha and it relays the maths to it

[–] Pamasich@kbin.earth 1 points 4 months ago

I don't pay for ChatGPT and just used the Wolfram GPT. They made the custom GPTs non-paid at some point.

[–] veroxii@aussie.zone 4 points 4 months ago

They are starting to do this. Most new models support function calling and can generate code to come up with math answers etc

[–] CileTheSane@lemmy.ca 3 points 4 months ago

why don't they program them to look up math programs and outsource chess to other programs when they're asked for that stuff?

Because the AI doesn't know what it's being asked, it's just an algorithm guessing what the next word in a reply is. It has no understanding of what the words mean.

"Why doesn't the man in the Chinese room just use a calculator for math questions?"
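To make "guessing the next word" concrete, here is a deliberately crude sketch: a bigram counter that picks whichever word most often followed the previous one in its training text. This is an assumption-laden toy, far simpler than a real LLM, and the corpus and function names are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Record how often each word is followed by each other word."""
    follow = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follow[current][nxt] += 1
    return follow

def guess_next(follow, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = follow.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

corpus = "the knight takes the pawn and the knight takes the rook"
model = train_bigrams(corpus)
print(guess_next(model, "knight"))
```

Nothing in it knows what a knight is or whether the move is legal. It only knows which word tended to come next.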

[–] Pamasich@kbin.earth 1 points 4 months ago

why don't they program them to look up math programs and outsource chess to other programs when they're asked for that stuff?

They will, when it makes sense for what the AI is designed to do. For example, ChatGPT can outsource image generation to an AI dedicated to that. It also used to calculate math using Python for me, but that doesn't seem to happen anymore, probably due to security issues with letting the AI run arbitrary Python code.

ChatGPT however was not designed to play chess, so I don't see why OpenAI should invest resources into connecting it to a chess API.

I think especially since adding custom GPTs, adding this kind of stuff has become kind of unnecessary for base ChatGPT. If you want a chess engine, get a GPT which implements a Stockfish API (there seem to be several GPTs that do). For math, get the Wolfram GPT which uses Wolfram Alpha's API, or a different powerful math GPT.

[–] suburban_hillbilly@lemmy.ml 30 points 4 months ago (2 children)

Most people do. It's just called AI in the media everywhere and marketing works. I think online folks forget that something as simple as getting a Lemmy account by yourself puts you into the top quintile of tech literacy.

[–] malwieder@feddit.org 27 points 4 months ago (1 children)

Google Maps doesn't pretend to be good at chess. ChatGPT does.

[–] whaleross@lemmy.world 6 points 4 months ago (1 children)

A toddler can pretend to be good at chess but anybody with reasonable expectations knows that they are not.

[–] MelodiousFunk@startrek.website 20 points 4 months ago (1 children)

Plot twist: the toddler has a multi-year marketing push worth tens if not hundreds of millions, which convinced a lot of people who don't know the first thing about chess that it really is very impressive, and all those chess-types are just jealous.

[–] xavier666@lemm.ee 5 points 4 months ago (1 children)

Have you tried feeding the toddler gallons of baby-food? Maybe then it can play chess

[–] baggachipz@sh.itjust.works 4 points 4 months ago (1 children)

They’ve been feeding the toddler everybody else’s baby food and claiming they have the right to.

[–] xavier666@lemm.ee 4 points 4 months ago

"If we have to ask every time before stealing a little baby food, our morbidly obese toddler cannot survive"

[–] iAvicenna@lemmy.world 16 points 4 months ago (1 children)

well so much hype has been generated around chatgpt being close to AGI that now it makes sense to ask questions like "can chatgpt prove the Riemann hypothesis"

[–] Broken@lemmy.ml 10 points 4 months ago (2 children)

I agree with your general statement, but in theory since all ChatGPT does is regurgitate information back and a lot of chess is memorization of historical games and types, it might actually perform well. No, it can't think, but it can remember everything so at some point that might tip the results in its favor.

[–] Eagle0110@lemmy.world 3 points 4 months ago* (last edited 4 months ago) (1 children)

Regurgitating an impression of, not regurgitating verbatim, that's the problem here.

Chess is 100% deterministic, so it falls flat.

[–] raltoid@lemmy.world 5 points 4 months ago* (last edited 4 months ago)

I'm guessing it's not even hard to get it to "confidently" violate the rules.

[–] FMT99@lemmy.world 1 points 4 months ago

I mean it may be possible but the complexity would be so many orders of magnitude greater. It'd be like learning chess by just memorizing all the moves great players made but without any context or understanding of the underlying strategy.

[–] TowardsTheFuture@lemmy.zip 7 points 4 months ago (1 children)

I think that’s generally the point: most people think ChatGPT is this sentient thing that knows everything and… no.

[–] PixelatedSaturn@lemmy.world 3 points 4 months ago (1 children)

Do they though? No one I talked to, not my coworkers that use it for work, not my friends, not my 72-year-old mother, thinks it is sentient.

[–] adhdplantdev@lemm.ee 6 points 4 months ago (1 children)

Articles like this are good because they expose the flaws of the AI and show that it can't be trusted with complex multi-step tasks.

They help people who think AI is close to a human see that it's not, and that it's missing critical functionality.

[–] FMT99@lemmy.world 4 points 4 months ago (1 children)

The problem is though that this perpetuates the idea that ChatGPT is actually an AI.

[–] adhdplantdev@lemm.ee 1 points 4 months ago

People already think ChatGPT is a general AI. We need more articles like this showing its ineffectiveness at being intelligent. Besides, it helps find the limitations of this technology so that we can hopefully use it to argue against every single place

[–] merdaverse@lemm.ee 5 points 4 months ago* (last edited 4 months ago) (1 children)

OpenAI has been talking about AGI for years, implying that they are getting closer to it with their products.

https://openai.com/index/planning-for-agi-and-beyond/

https://openai.com/index/elon-musk-wanted-an-openai-for-profit/

Not to even mention all the hype created by the techbros around it.

[–] x00z@lemmy.world 5 points 4 months ago (2 children)

In all fairness, machine learning in chess engines is actually pretty strong.

AlphaZero was developed by the artificial intelligence and research company DeepMind, which was acquired by Google. It is a computer program that reached a virtually unthinkable level of play using only reinforcement learning and self-play in order to train its neural networks. In other words, it was only given the rules of the game and then played against itself many millions of times (44 million games in the first nine hours, according to DeepMind).

https://www.chess.com/terms/alphazero-chess-engine

[–] jeeva@lemmy.world 2 points 4 months ago

Sure, but machine learning like that is very different to how LLMs are trained and their output.

[–] FMT99@lemmy.world 1 points 4 months ago

Oh absolutely you can apply machine learning to game strategy. But you can't expect a generalized chatbot to do well at strategic decision making for a specific game.

[–] Empricorn@feddit.nl 5 points 4 months ago (1 children)

You're not wrong, but keep in mind ChatGPT advocates, including the company itself, are referring to it as AI, including in marketing. They're saying it's a complete, self-learning, constantly-evolving Artificial Intelligence that has been improving itself since release... And it loses to a 4KB video game program from 1979 that can only "think" 2 moves ahead.

[–] FMT99@lemmy.world 2 points 4 months ago

That's totally fair, the company is obviously lying, excuse me "marketing", to promote their product, that's absolutely true.

[–] saltesc@lemmy.world 3 points 4 months ago

I like referring to LLMs as VI (Virtual Intelligence from Mass Effect) since they merely give the impression of intelligence but are little more than search engines. In the end all one is doing is displaying expected results based on a popularity algorithm. However they do this inconsistently due to bad data in and limited caching.

[–] FartMaster69@lemmy.dbzer0.com 2 points 4 months ago

I mean, OpenAI seems to forget it isn’t.