Has Generative AI Already Peaked? - Computerphile (www.youtube.com)

submitted 1 year ago by barsoap@lemm.ee to c/technology@lemmy.world

A new paper suggests diminishing returns from larger and larger generative AI models. Dr Mike Pound discusses.

The Paper (No "Zero-Shot" Without Exponential Data): https://arxiv.org/abs/2404.04125

[–] bamboo@lemm.ee 23 points 1 year ago* (last edited 1 year ago) (3 children)

I think it’s incredibly naïve to think that, because we’ve hit a boundary on one particular aspect of LLMs, the technology has peaked as a whole. There are lots of ways to improve LLMs that aren’t just increasing the parameter size. For example, there’s been an uptick in smaller models that are optimized to run on client devices without large GPUs. There is probably a future where we have small 3-7B models that are competitive with today’s best 70B models but can run in real time on any smartphone. We’ll have larger context windows, allowing LLMs to work on larger problems. And we’ll have better techniques for getting high-quality information out of LLMs: there are already adversarial methods, where two LLMs hold a debate on a subject, that have been shown to give more accurate and comprehensive answers. They’ll also continue to be embedded into different places in software that make them more useful, not just as a chatbot that lives in its own world.

[–] barsoap@lemm.ee 35 points 1 year ago

There are lots of ways to improve LLMs that aren’t just increasing the parameter size

The paper isn't about parameter size but the need for exponentially more training data to get a mere linear increase in output performance.
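To make that relationship concrete, here is a minimal sketch assuming a log-linear model, performance = A + B * log10(n). The coefficients `A` and `B` and both function names are made up for illustration; none of the numbers come from the paper.

```python
import math

# Hypothetical coefficients, chosen only for illustration (not from the paper).
A = 0.0   # assumed performance with a single training example
B = 0.1   # assumed gain in performance per 10x increase in data


def performance(n_examples: float) -> float:
    """Assumed log-linear model: performance grows with log10 of data size."""
    return A + B * math.log10(n_examples)


def examples_needed(target: float) -> float:
    """Invert the model: the data needed grows exponentially with the target."""
    return 10 ** ((target - A) / B)


if __name__ == "__main__":
    for target in (0.3, 0.4, 0.5, 0.6):
        print(f"target {target:.1f} -> about {examples_needed(target):.0e} examples")
```

Under these assumed coefficients, every additional 0.1 of performance costs ten times more data, which is the "exponential data for linear gains" shape being described.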

[–] magic_lobster_party@kbin.run 8 points 1 year ago

Improvements are made all the time. You can’t feed a very large SVM the same data as transformer networks and expect it to perform the same. Transformers are used because they can more easily learn complicated patterns with less data.

I think I’ve read somewhere that neural networks with only one hidden layer can theoretically approximate any continuous function (if the hidden layer is large enough), but an incredible amount of data is required to get there, so it’s not practical.
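A rough illustration of the "large enough hidden layer" part: the sketch below sets the weights of a one-hidden-layer ReLU network by hand so that it linearly interpolates a function, and the error shrinks as the layer gets wider. This is an assumption-laden toy. The weights are constructed rather than trained, it only covers a continuous function on an interval, it says nothing about how much data training would need, and the names `build_network` and `max_error` are hypothetical.

```python
import math


def relu(z: float) -> float:
    return max(0.0, z)


def build_network(f, lo: float, hi: float, width: int):
    """Hand-set weights for a one-hidden-layer ReLU net that linearly
    interpolates f at width + 1 evenly spaced knots on [lo, hi]."""
    step = (hi - lo) / width
    knots = [lo + i * step for i in range(width + 1)]
    slopes = [(f(knots[i + 1]) - f(knots[i])) / step for i in range(width)]
    # The output weight of hidden unit i is the change in slope at knot i.
    out_weights = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, width)]
    bias = f(lo)

    def net(x: float) -> float:
        # Hidden unit i computes relu(x - knots[i]); the output is a weighted sum.
        return bias + sum(w * relu(x - k) for w, k in zip(out_weights, knots))

    return net


def max_error(f, net, lo: float, hi: float, samples: int = 1000) -> float:
    """Largest absolute gap between f and the network on a sample grid."""
    xs = [lo + (hi - lo) * i / samples for i in range(samples + 1)]
    return max(abs(f(x) - net(x)) for x in xs)


if __name__ == "__main__":
    for width in (4, 16, 64):
        net = build_network(math.sin, 0.0, 2 * math.pi, width)
        print(width, max_error(math.sin, net, 0.0, 2 * math.pi))
```

Each quadrupling of the hidden width cuts the worst-case error on sin(x) substantially, but the network only gets there by having more and more units.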

Over time other models will be discovered that can make better use of the training data.

[–] Murvel@lemm.ee 7 points 1 year ago* (last edited 1 year ago) (2 children)

What you mentioned is already assumed in the video and the paper in question.

The main argument is that no matter our computational techniques, the diminishing returns in predictive precision is reached far sooner than we achieve general intelligence.

[–] boyi@lemmy.sdf.org 2 points 1 year ago* (last edited 1 year ago) (2 children)

no matter our computational techniques, the diminishing returns in predictive precision is reached far sooner than we achieve general intelligence

That's a very bold presumption. How can they be so sure that no future model can tackle the issue? Have they got proof or something?

[–] Murvel@lemm.ee 2 points 1 year ago

No, they just extrapolate with an increased size of the training set. It's not that complicated. That's a fair presumption, as that is how we've increased predictive precision so far.

[–] technocrit@lemmy.dbzer0.com 1 points 1 year ago* (last edited 1 year ago)

It seems far more bold to presume that general intelligence will be created any time soon when current machine learning is nowhere close.

[–] Womble@lemmy.world 2 points 1 year ago (1 children)

No, the argument is that current techniques give logarithmic returns in data size, which is bad. But it said nothing about other potential techniques, nor did it suggest that this was a general result.

[–] Murvel@lemm.ee 3 points 1 year ago

Well, obviously they cannot rule out techniques no one has thought of, but likewise they obviously accounted for what they deemed to be within the realm of possibility.