this post was submitted on 23 Mar 2026
610 points (99.4% liked)

Technology

[–] errer@lemmy.world 2 points 1 day ago (5 children)

Wikipedia probably wants to sell LLM developers access to its content for training. That's only valuable if Wikipedia remains a high-quality, slop-free source.

I think even AI zealots agree there should be silos of fully human-generated content to train on. Training slop on slop makes the slop even worse.

[–] Grimy@lemmy.world 18 points 1 day ago (2 children)

Sell licenses to what? It's all already under a Creative Commons license, iirc.

[–] Zagorath@quokk.au 2 points 1 day ago (1 children)

The content is CC licensed, but they are trying to block AI scraping because it overloads their servers. They have a paid API that uses a lot less compute for both Wikipedia and the AI, as well as being a revenue source for Wikipedia.

[–] ricecake@sh.itjust.works 1 points 5 hours ago

Yes, but...

https://en.wikipedia.org/wiki/Wikipedia%3ADatabase_download

That's because viewing the page uses server resources, as does API access. If you want the data, you can download the database directly.
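
For anyone who wants to try the dump route, here is a rough Python sketch of reading one without scraping anything. It assumes you have already downloaded a bz2-compressed XML dump from the page linked above (the file name in the usage comment is an assumption, and `iter_page_titles` is just a name I made up). It streams the file, so the whole dump never has to fit in memory.

```python
import bz2
import xml.etree.ElementTree as ET


def iter_page_titles(dump_path):
    """Yield page titles from a bz2-compressed MediaWiki XML dump.

    Hypothetical helper for illustration; not part of any Wikimedia tooling.
    """
    with bz2.open(dump_path, "rb") as fh:
        # iterparse reads incrementally instead of loading the whole file.
        for _event, elem in ET.iterparse(fh, events=("end",)):
            # Tags carry an XML namespace prefix like "{...}title"; strip it.
            tag = elem.tag.rsplit("}", 1)[-1]
            if tag == "title":
                yield elem.text
            elif tag == "page":
                # Drop the finished page's contents to keep memory use low.
                elem.clear()


# Usage (assumed file name; adjust to whichever dump you downloaded):
# for title in iter_page_titles("enwiki-latest-pages-articles.xml.bz2"):
#     print(title)
```

All the processing happens on your own machine, so it costs Wikipedia one file download instead of millions of page views.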
