this post was submitted on 14 Jan 2024
837 points (99.2% liked)
Technology
Thank you for the clarification.
This is indeed a complicated subject, and thank you again for your insight. These are very good example cases, because Google's searchable book database is exactly the same as the training databases LLMs use to develop their transform nodes.
The difference between the Authors Guild cases and this one, as I see it, is that Google and HathiTrust are acting to preserve information and art for future generations - a benefit to society is front and centre in their goals. With LLMs, the goal is to develop a commercial product. Yes, people can use it for free (right now), but ultimately the developers expect to sell access and profit from it. Also, no one else gets access to their training database; it is kept as some sort of trade secret.
Yay!
I wouldn't want to restrict or gatekeep access to art for genuine fair uses. I agree with the Authors Guild rulings in those circumstances; I just disagree that LLMs are a similar enough circumstance to deserve the same exemption for how they're developed.
I agree. Certainly, not copyright law as it exists right now, and even then there are so many aspects of the use of AI that fall well outside the scope of copyright law.
Ultimately, my gripe is that a commercial business has used copyrighted work to develop a product without paying the rightsholders. Their product is their own unique creation, but the copyrighted work their product learned from was not. The training database they've used is not "research" because it is not scholarly; even if it were research, it is highly commercial in nature and as such does not warrant a fair use exemption.