this post was submitted on 08 Jun 2025

835 points (95.4% liked)

Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well. (archive.is)

submitted 8 months ago* (last edited 8 months ago) by Allah@lemm.ee to c/technology@lemmy.world

344 comments

LOOK MAA I AM ON FRONT PAGE

top 50 comments

[–] Nanook@lemm.ee 231 points 8 months ago (55 children)

lol is this news? I mean we call it AI, but it’s just LLMs and variants; it doesn’t think.

[–] MNByChoice@midwest.social 77 points 8 months ago (1 children)

The "Apple" part. CEOs only care what companies say.

[–] kadup@lemmy.world 51 points 8 months ago (5 children)

Apple is significantly behind and arrived late to the whole AI hype, so of course it's in their absolute best interest to keep showing how LLMs aren't special or amazingly revolutionary.

They're not wrong, but the motivation is also pretty clear.

[–] homesweethomeMrL@lemmy.world 29 points 8 months ago

“Late to the hype” is actually a good thing. Gen AI is a scam wrapped in idiocy wrapped in a joke. That Apple is slow to ape the idiocy of Microsoft is just fine.

[–] Clent@lemmy.dbzer0.com 19 points 8 months ago (3 children)

Proving it matters. Science is constantly proving things that people believe are obvious, because people have an uncanny ability to believe things that are false. Some people will believe things long after science has proven them false.

[–] SoftestSapphic@lemmy.world 98 points 8 months ago (8 children)

Wow it's almost like the computer scientists were saying this from the start but were shouted over by marketing teams.

[–] zbk@lemmy.ca 22 points 8 months ago

This! Capitalism is going to be the end of us all. OpenAI has gotten away with IP theft, disinformation regarding AI and maybe even murder of their whistleblower.

[–] minoscopede@lemmy.world 69 points 8 months ago* (last edited 8 months ago) (17 children)

I see a lot of misunderstandings in the comments 🫤

This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.

[–] Knock_Knock_Lemmy_In@lemmy.world 17 points 8 months ago (5 children)

When given explicit instructions to follow, models failed because they had not seen similar instructions before.

This paper shows that there is no reasoning in LLMs at all, just extended pattern matching.

[–] theherk@lemmy.world 15 points 8 months ago

Yeah these comments have the three hallmarks of Lemmy:

AI is just autocomplete mantras.
Apple is always synonymous with bad and dumb.
Rare pockets of really thoughtful comments.

Thanks for being at least the latter.

[–] mavu@discuss.tchncs.de 58 points 8 months ago

No way!

Statistical Language models don't reason?

But OpenAI, robots taking over!

[–] sev@nullterra.org 49 points 8 months ago (35 children)

Just fancy Markov chains with the ability to link bigger and bigger token sets. It can only ever kick off processing as a response and can never initiate any line of reasoning. This, along with the fact that its working set of data can never be updated moment-to-moment, means that it would be a physical impossibility for any LLM to achieve any real "reasoning" processes.
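
For readers unfamiliar with the comparison, here is a minimal sketch of a word-level Markov chain text generator. It is illustrative only: the names `build_chain` and `generate` are hypothetical, and this is not how an LLM is implemented. It only shows the "pick the next token from what followed the last few tokens" idea the comment alludes to.

```python
import random
from collections import defaultdict


def build_chain(tokens, order=2):
    """Map each run of `order` tokens to the list of tokens seen right after it."""
    chain = defaultdict(list)
    for i in range(len(tokens) - order):
        context = tuple(tokens[i:i + order])
        chain[context].append(tokens[i + order])
    return chain


def generate(chain, seed, length=10, rng=None):
    """Extend `seed` by up to `length` tokens sampled from the chain.

    `seed` must contain exactly as many tokens as the chain's order.
    """
    rng = rng or random.Random(0)  # fixed default seed, for repeatable output
    out = list(seed)
    order = len(seed)
    for _ in range(length):
        followers = chain.get(tuple(out[-order:]))
        if not followers:
            break  # context never seen in the training text: nothing to predict
        out.append(rng.choice(followers))
    return out
```

Note that the generator only ever runs in response to a seed and that its table is fixed once built, which are the two limitations the comment attributes to LLMs.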

[–] kescusay@lemmy.world 18 points 8 months ago (3 children)

I can envision a system where an LLM becomes one part of a reasoning AI, acting as a kind of fuzzy "dataset" that a proper neural network incorporates and reasons with, and the LLM could be kept real-time updated (sort of) with MCP servers that incorporate anything new it learns.

But I don't think we're anywhere near there yet.

[–] billwashere@lemmy.world 49 points 8 months ago (13 children)

When are people going to realize that, in its current state, an LLM is not intelligent? It doesn’t reason. It does not have intuition. It’s a word predictor.

[–] Jhex@lemmy.world 49 points 8 months ago (1 children)

this is so Apple, claiming to invent or discover something "first" 3 years later than the rest of the market

[–] brsrklf@jlai.lu 45 points 8 months ago (2 children)

You know, despite not really believing LLM "intelligence" works anything like real intelligence, I kind of thought maybe being good at recognizing patterns was a way to emulate it to a point...

But that study seems to prove they're still not even good at that. At first I was wondering how hard the puzzles must have been, and then there's a bit about LLMs finishing 100-move Towers of Hanoi (on which they were trained) and failing 4-move river crossings. Logically, those problems are very similar... Also, failing to apply a step-by-step solution they were given.

[–] auraithx@lemmy.dbzer0.com 39 points 8 months ago

This paper doesn’t prove that LLMs aren’t good at pattern recognition; it demonstrates the limits of what pattern recognition alone can achieve, especially for compositional, symbolic reasoning.

[–] technocrit@lemmy.dbzer0.com 16 points 8 months ago* (last edited 8 months ago)

Computers are awesome at "recognizing patterns" as long as the pattern is a statistical average of some possibly worthless data set. And it really helps if the computer is set up ahead of time to recognize pre-determined patterns.

[–] Mniot@programming.dev 42 points 8 months ago

I don't think the article summarizes the research paper well. The researchers gave the AI models simple-but-large (which they confusingly called "complex") puzzles. Like Towers of Hanoi but with 25 discs.

The solution to these puzzles is nothing but patterns. You can write code that will solve the Tower puzzle for any size n and the whole program is less than a screen.
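
As a rough illustration of that point, the standard recursive solution can be sketched in a few lines of Python. This is a minimal sketch, not code from the paper or the article; the function names `hanoi` and `is_valid_solution` and the peg labels "A", "B", "C" are assumptions made for the example.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Yield (disc, from_peg, to_peg) moves that solve an n-disc tower."""
    if n == 0:
        return
    # Move the n-1 smaller discs out of the way, onto the spare peg.
    yield from hanoi(n - 1, source, spare, target)
    # Move the largest disc directly to the target peg.
    yield (n, source, target)
    # Move the n-1 smaller discs from the spare peg onto the largest disc.
    yield from hanoi(n - 1, spare, target, source)


def is_valid_solution(n, moves):
    """Replay moves on pegs A, B, C and check every rule is respected."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for disc, src, dst in moves:
        if not pegs[src] or pegs[src][-1] != disc:
            return False  # the disc is not on top of the source peg
        if pegs[dst] and pegs[dst][-1] < disc:
            return False  # a larger disc may not sit on a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs["C"] == list(range(n, 0, -1))
```

The pattern is short, but the output is not: an n-disc tower needs 2^n - 1 moves, so 25 discs means 33,554,431 moves, which is why the puzzles are better described as large than as complex.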

The problem the researchers see is that on these long, pattern-based solutions, the models follow a bad path and then just give up long before they hit their limit on tokens. The researchers don't have an answer for why this is, but they suspect that the reasoning doesn't scale.

[–] reksas@sopuli.xyz 37 points 8 months ago (4 children)

does ANY model reason at all?

[–] 4am@lemm.ee 34 points 8 months ago (3 children)

No, and to make that work using the current structures we use for creating AI models we’d probably need all the collective computing power on earth at once.

[–] bjoern_tantau@swg-empire.de 36 points 8 months ago* (last edited 8 months ago)

[–] technocrit@lemmy.dbzer0.com 29 points 8 months ago* (last edited 8 months ago) (3 children)

Peak pseudo-science. The burden of evidence is on the grifters who claim "reason". But neither side has any objective definition of what "reason" means. It's pseudo-science against pseudo-science in a fierce battle.

[–] skisnow@lemmy.ca 26 points 8 months ago (1 children)

What's hilarious/sad is the response to this article over on reddit's "singularity" sub, in which all the top comments are people who've obviously never got all the way through a research paper in their lives all trashing Apple and claiming their researchers don't understand AI or "reasoning". It's a weird cult.

[–] vala@lemmy.world 25 points 8 months ago

No shit

[–] SplashJackson@lemmy.ca 24 points 8 months ago (1 children)

Just like me

[–] technocrit@lemmy.dbzer0.com 23 points 8 months ago* (last edited 8 months ago) (5 children)

Why would they "prove" something that's completely obvious?

The burden of proof is on the grifters who have overwhelmingly been making false claims and distorting language for decades.

[–] TheRealKuni@midwest.social 33 points 8 months ago (2 children)

Why would they "prove" something that's completely obvious?

I don’t want to be critical, but I think if you step back a bit and look at what you’re saying, you’re asking why we would bother to experiment and prove what we think we know.

That’s a perfectly normal and reasonable scientific pursuit. Yes, in a rational society the burden of proof would be on the grifters, but that’s never how it actually works. It’s always the doctors disproving the cure-all, not the snake oil salesmen failing to prove their own product.

There is value in this research, even if it fits what you already believe on the subject. I would think you would be thrilled to have your hypothesis confirmed.

[–] yeahiknow3@lemmings.world 23 points 8 months ago* (last edited 8 months ago) (1 children)

They’re just using the terminology that’s widespread in the field. In a sense, the paper’s purpose is to prove that this terminology is unsuitable.

[–] tauonite@lemmy.world 16 points 8 months ago

That's called science

[–] GaMEChld@lemmy.world 22 points 8 months ago (8 children)

Most humans don't reason. They just parrot shit too. The design is very human.

[–] elbarto777@lemmy.world 26 points 8 months ago (5 children)

LLMs deal with tokens. Essentially, predicting a series of bytes.

Humans do much, much, much, much, much, much, much more than that.

[–] FreakinSteve@lemmy.world 20 points 8 months ago (4 children)

NOOOOOOOOO

SHIIIIIIIIIITT

SHEEERRRLOOOOOOCK

[–] RampantParanoia2365@lemmy.world 18 points 8 months ago* (last edited 8 months ago) (2 children)

Fucking obviously. Until Data's positronic brain becomes reality, AI is not actual intelligence.

AI is not A I. I should make that a T-shirt.

[–] sp3ctr4l@lemmy.dbzer0.com 17 points 8 months ago* (last edited 8 months ago) (2 children)

This has been known for years; this is the default assumption of how these models work.

You would have to prove that some kind of actual reasoning capacity has arisen as... some kind of emergent complexity phenomenon.... not the other way around.

Corpos have just marketed/gaslit us/themselves so hard that they apparently forgot this.

[–] flandish@lemmy.world 17 points 8 months ago

stochastic parrots. all of them. just upgraded “soundex” models.

this should be no surprise, of course!

[–] Auli@lemmy.ca 15 points 8 months ago

No shit. This isn't new.
