this post was submitted on 08 Jan 2024
334 points (96.1% liked)
Technology
59605 readers
3415 users here now
This is a most excellent place for technology news and articles.
Our Rules
- Follow the lemmy.world rules.
- Only tech related content.
- Be excellent to each another!
- Mod approved content bots can post up to 10 articles per day.
- Threads asking for personal tech support may be deleted.
- Politics threads may be removed.
- No memes allowed as posts, OK to post as comments.
- Only approved bots from the list below, to ask if your bot can be added please contact us.
- Check for duplicates before posting, duplicates may be removed
Approved Bots
founded 1 year ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
I'm not a huge fan of Microsoft or even OpenAI by any means, but all these lawsuits just seem so... lazy and greedy?
It isn't like ChatGPT is just spewing out the entirety of their works in a single chat. In that context, I fail to see how seeing snippets of said work returned in a Google summary is any different than ChatGPT or any other LLM doing the same.
Should OpenAI and other LLM creators use ethically sourced data in the future? Absolutely. They should've been doing so all along. But to me, these rich chumps like George R. R. Martin complaining that they felt their data was stolen without their knowledge and profited off of just feels a little ironic.
Welcome to the rest of the 6+ billion people on the Internet who've been spied on, data mined, and profited off of by large corps for the last two decades. Where's my god damn check? Maybe regulators should've put tougher laws and regulations in place long ago to protect all of us against this sort of shit, not just businesses and wealthy folk able to afford launching civil suits and shakey grounds. It's not like deep learning models are anything new.
Edit:
Already seeing people come in to defend these suits. I just see it like this: AI is a tool, much like a computer or a pencil are tools. You can use a computer to copyright infringe all day, just like a pencil can. To me, an AI is only going to be plagiarizing or infringing if you tell it to. How often does AI plagiarize without a user purposefully trying to get it to do so? That's a genuine question.
Regardless, the cat's out of the bag. Multiple LLMs are already out in the wild and more variations are made each week, and there's no way in hell they're all going to be reigned in. I'd rather AI not exist, personally, as I don't see protections coming for normal workers over the next decade or two against further evolutions of the technology. But, regardless, good luck to these companies fighting the new Pirate Bay-esque legal wars for the next couple of decades.
If I want to be able to argue that having any copyleft stuff in the training dataset makes all the output copyleft -- and I do -- then I necessarily have to also side with the rich chumps as a matter of consistency. It's not ideal, but it can't be helped. ¯\_(ツ)_/¯
In your mind are the publishers the rich chumps, or Microsoft?
For copyleft to work, copyright needs to be strong.
I was just repeating the language the parent commenter used (probably should've quoted it in retrospect). In this case, "rich chumps" are George R.R. Martin and other authors suing Microsoft.