this post was submitted on 08 Jan 2024
405 points (96.1% liked)
Technology
The problem is not that it's regurgitating. The problem is that it was trained on NYT articles and other data in violation of copyright law. Regurgitation is just evidence of that.
I've seen and heard this argument before, not just for LLMs but also for text-to-image models. My counterpoint is that humans learn in a very similar way to these programs: we take things we've seen or read and develop a style inspired by them. Nor do these models simply recite texts from memory; they generate new ones based on the probabilities of certain words and phrases occurring in the parts of their training data related to the prompt. In an oversimplified but accurate enough comparison, saying these programs violate copyright law is like saying every cosmic horror writer is plagiarising Lovecraft, or that every surrealist painter is copying Dalí.
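To make the "probabilities of certain words and phrases" point concrete, here is a toy sketch of autoregressive generation. Everything in it is hypothetical (the hard-coded `next_token_distribution` table stands in for a real model's neural network), but the loop structure is the relevant part: the program samples each next token from a conditional distribution rather than retrieving a stored document.

```python
import random

# Toy stand-in for a language model's next-token distribution.
# A real LLM computes this with a neural network conditioned on the
# full context; this hypothetical table just covers a few bigrams.
def next_token_distribution(tokens):
    table = {
        ("the", "old"): {"gods": 0.5, "house": 0.3, "sea": 0.2},
        ("old", "gods"): {"whisper": 0.6, "sleep": 0.4},
    }
    return table.get(tuple(tokens[-2:]), {"<eos>": 1.0})

def generate(prompt, max_tokens=10):
    tokens = prompt.split()
    for _ in range(max_tokens):
        dist = next_token_distribution(tokens)
        choices, weights = zip(*dist.items())
        # Sample the next token proportionally to its probability,
        # so repeated runs can produce different continuations.
        token = random.choices(choices, weights=weights)[0]
        if token == "<eos>":
            break
        tokens.append(token)
    return " ".join(tokens)

print(generate("the old"))  # e.g. "the old gods whisper"
```

The output varies from run to run because each step is a random draw from a distribution, which is the sense in which generated text is "new" rather than recited.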
Machines aren’t people and it’s fine and reasonable to have different standards for each.
But is it reasonable to have different standards for someone creating a picture with a paintbrush as opposed to someone creating the same picture with a machine learning model?
Yes, given that one is creating art and the other is typing words into the plagiarism machine.
This is called begging the question: calling it "the plagiarism machine" assumes the very conclusion in dispute. Either you're not trying to make a persuasive argument or you're doing it very, very badly.