this post was submitted on 06 Sep 2024
Technology
We don't know exactly how they source their data (and that is definitely shady), but if I can access a movie legally, I don't see why I shouldn't be able to gather statistics from it. That includes running a speech-to-text model to caption it, then counting how often certain words are used and which words follow them. This is an oversimplified explanation of what an LLM does, but it's the fairest I can come up with, and it would be legal to do so. The models are always orders of magnitude smaller than the data they are trained on.
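To make the "count which words follow which" idea concrete, here's a rough sketch in Python. The function name `bigram_counts` and the sample caption text are made up for illustration, and a real LLM learns far more than raw pair counts, so treat this as a toy version of the statistics I'm describing, not how any actual model is built.

```python
from collections import Counter, defaultdict


def bigram_counts(text):
    """Count, for each word, how often each other word directly follows it."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    # Pair every word with its immediate successor.
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


# Hypothetical caption text standing in for a speech-to-text transcript.
captions = "the cat sat on the mat and the cat slept"
stats = bigram_counts(captions)

print(stats["the"].most_common(1))  # [('cat', 2)]
```

Note that the table of counts keeps no copy of the transcript itself, only how often each pair occurred, which is the point about the statistics being much smaller than the source.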
That said, I don't mean to imply that I'm happy with the state of big tech companies, the AI hype, the energy consumption, or the impact on ordinary people. But I've put a lot of thought into this (and into learning about machine learning for real), and I think this is not an ML problem, but a problem in the economic, legal, and political system. AI hype is just a symptom.