Technology

84733 readers

4037 users here now

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related news or articles.
Be excellent to each other!
Mod approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts, OK to post as comments.
Only approved bots from the list below, this includes using AI responses and summaries. To ask if your bot can be added please contact a mod.
Check for duplicates before posting, duplicates may be removed
Accounts 7 days and younger will have their posts automatically removed.

Approved Bots

founded 2 years ago

MODERATORS

L3s@lemmy.world

enu@lemmy.world

technopagan@lemmy.world

L4s@lemmy.world

L3s@hackingne.ws

361

Generative AI hype is ending – and now the technology might actually become useful (theconversation.com)

submitted 2 years ago by Ilandar@aussie.zone to c/technology@lemmy.world

106 comments fedilink hide all child comments

you are viewing a single comment's thread
view the rest of the comments

[–] Wirlocke@lemmy.blahaj.zone 4 points 2 years ago* (last edited 2 years ago) (2 children)

Theoretically we could slow down training and coast on fine-tuning existing models. Once the AI's trained they don't take that much energy to run.

Everyone was racing towards "bigger is better" because it worked up to GPT4, but word on the street is that raw training is giving diminishing returns so the massive spending on compute is just a waste now.

[–] ZILtoid1991@lemmy.world 2 points 2 years ago

Issue is, we're reaching the limits of what GPT technologies can do, so we have to retrain them for the new ones, and currently available data have been already poisoned by AI generated garbage, which will make the adaptation of new technologies harder.

[–] areyouevenreal@lemm.ee 2 points 2 years ago

It's a bit more complicated than that.

New models are sometimes targeting architecture improvements instead of pure size increases. Any truly new model still needs training time, it's just that the training time isn't going up as much as it used to. This means that open weights and open source models can start to catch up to large proprietary models like ChatGPT.

From my understanding GPT 4 is still a huge model and the best performing. The other models are starting to get close though, and can already exceed GPT 3.5 Turbo which was the previous standard to beat and is still what a lot of free chatbots are using. Some of these models are still absolutely huge though, even if not quite as big as GPT 4. For example Goliath is 120 billion parameters. Still pretty chonky and intensive to run even if it's not quite GPT 4 sized. Not that anyone actually knows how big GPT 4 is. Word on the street is it's a MoE model like Mixtral which run faster than a normal model for their size, but again no one outside Open AI actually can say with certainty.

You generally find that Open AI models are larger and slower. Wheras the other models focus more on giving the best performance at a given size as training and using huge models is much more demanding. So far the larger Open AI models have done better, but this could change as open source models see a faster improvement in the techniques they use. You could say open weights models rely on cunning architectures and fine tuning versus Open AI uses brute strength.