this post was submitted on 29 Jan 2025
Distilling OpenAI and Llama models probably also helped quite a bit.
Although I must admit that the architectural changes are pretty cool.
But I have to add that I only started reading up on the topic a few weeks ago and don't really have any practical experience, apart from checking out some Hugging Face docs I got linked yesterday (stupid me hadn't thought of looking there...).
So everything I say is probably bullshit o:-)
Sure, it made the training process faster, but this still takes a fraction of the energy to generate a single output compared to other LLMs like ChatGPT or Llama. Plus, it's open source. You can't discredit a technological advancement for building upon previous advancements, especially when it does so with transparency.
As I said, the architectural changes are quite cool.
As far as I've understood, it mostly comes down to splitting the model into multiple experts (a mixture-of-experts setup), so you don't need to activate the complete system with every request.
But I've only scratched the surface...
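To make the "don't activate the complete system" idea a bit more concrete, here is a minimal sketch of top-k expert routing in plain Python. Everything in it is an assumption for illustration: the names (`make_expert`, `moe_forward`), the sizes, and the random weights are all made up, and it is not how DeepSeek or any real model implements it. It only shows that a router scores all experts and then just the best few actually run.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(dim, rng):
    """Hypothetical 'expert': a random dim x dim linear layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


def moe_forward(x, experts, router_weights, top_k=2):
    """Route one input vector through only the top_k highest-scoring experts."""
    # The router is cheap: one dot product per expert.
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    # Keep the indices of the top_k experts by router score.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Turn the chosen scores into mixing weights that sum to 1.
    gates = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = experts[idx](x)  # only the chosen experts are evaluated
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen


rng = random.Random(0)  # fixed seed so the sketch is repeatable
dim, num_experts = 4, 8
experts = [make_expert(dim, rng) for _ in range(num_experts)]
router_weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]

token = [0.5, -1.0, 0.25, 2.0]
output, used = moe_forward(token, experts, router_weights, top_k=2)
print(f"ran {len(used)} of {num_experts} experts: {used}")
```

With `top_k=2` and 8 experts, only a quarter of the expert weights are touched for this input, which is roughly where the savings are supposed to come from. In real models the routing happens per token inside each layer, not once per request.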
Also, open source... The weights are made publicly available.
None of the training data or training systems are.
Edit: regarding "open source":
Meta's Llama is also on Hugging Face, just like DeepSeek. I still wouldn't talk about transparency here.