this post was submitted on 06 Sep 2024
Technology
I have no personal interest in the matter, tbh. But I want people to actually understand what they're advocating for and what the downstream effects would inevitably be. Model training is not inherently infringing activity under current IP law. It just isn't. Neither the law, legislative or judicial, nor the actual engineering and operation of current models supports a finding of infringement. Effectively, this means new legislation would be needed to handle the issue. Most people are effectively advocating for an entirely new IP right, a "right to learn from", which further assetizes ideas and intangibles and shuffles us further into end-stage capitalism, something most of those advocates are presumably also against.
I'm pretty sure most people are just mad that this is basically "rules for thee but not for me". Why should a company be free to pirate when I can't? Case in point: the Internet Archive losing its case against a publisher. That's the crux of the issue.
I get that that's how it feels given how it's being reported, but because of the way this sort of ML works, what the Internet Archive does and what an arbitrary GPT does are completely different. The former is an explicit and straightforward copy relying on a fair use defense; the latter is the industrialized version of taking intensive notes in a notebook full of such notes while reading a book. That the outputs of such models are totally devoid of IP protections actually makes a pretty big difference, imo, in their usefulness to the entities we're most concerned about. That certainly doesn't address the economic dilemma of putting an entire sector of labor at risk in narrow areas, though.