this post was submitted on 24 Mar 2024
1229 points (97.5% liked)

Memes


Rules:

  1. Be civil and nice.
  2. Try not to repost excessively. As a rule of thumb, wait at least 2 months before reposting if you have to.

founded 6 years ago
[–] kernelle@0d.gs 58 points 1 year ago (23 children)

"Publicly available data" - I wonder if that includes Disney's catalogue? Or Nintendo's IP? I think they are veeery selective about their "publicly available data". It also implies that the only requirement for such training data is that it is publicly available, which is true of almost every piece of media ever made. How an AI model isn't public domain by default baffles me.

[–] redcalcium@lemmy.institute 4 points 1 year ago* (last edited 1 year ago) (1 children)

There is a rumor that OpenAI downloaded the entirety of LibGen to train their AI models. No definitive proof yet, but it seems very likely.

https://torrentfreak.com/authors-accuse-openai-of-using-pirate-sites-to-train-chatgpt-230630/

[–] 100_kg_90_de_belin@feddit.it 3 points 1 year ago

"It just like me fr fr" (cit.)
