this post was submitted on 26 Dec 2023
22 points (80.6% liked)


Study shows AI image-generators being trained on explicit photos of children

Hidden inside the foundation of popular artificial intelligence image-generators are thousands of images of child sexual abuse, according to a new report that urges companies to take action to address a harmful flaw in the technology they built.

all 5 comments
[–] uriel238@lemmy.blahaj.zone 13 points 11 months ago

All of our "protect the children" legislation is typically about inhibiting technology that might be used to cause harm, not about ensuring children have access to places of safety, adequate food and comfort, time with and access to their parents, and the freedom to live and play.

Y'know, all those things that help make kids resilient to bullies and the challenges of growing up. Once again, we leave our kids cold and hungry in poverty while blaming the next new thing for their misery.

So I call shenanigans. Again.

[–] SSUPII@sopuli.xyz 2 points 11 months ago

That is bound to happen if the training data was scraped from the open web. What's the news?

[–] autotldr@lemmings.world 1 points 11 months ago

This is the best summary I could come up with:


Hidden inside the foundation of popular artificial intelligence image-generators are thousands of images of child sexual abuse, according to a new report that urges companies to take action to address a harmful flaw in the technology they built.

Those same images have made it easier for AI systems to produce realistic and explicit imagery of fake children as well as transform social media photos of fully clothed real teens into nudes, much to the alarm of schools and law enforcement around the world.

Until recently, anti-abuse researchers thought the only way that some unchecked AI tools produced abusive imagery of children was by essentially combining what they've learned from two separate buckets of online images — adult pornography and benign photos of kids.

It’s not an easy problem to fix, and traces back to many generative AI projects being “effectively rushed to market” and made widely accessible because the field is so competitive, said Stanford Internet Observatory's chief technologist David Thiel, who authored the report.

LAION was the brainchild of a German researcher and teacher, Christoph Schuhmann, who told the AP earlier this year that part of the reason to make such a huge visual database publicly accessible was to ensure that the future of AI development isn't controlled by a handful of powerful companies.

Google built its text-to-image Imagen model based on a LAION dataset but decided against making it public in 2022 after an audit of the database “uncovered a wide range of inappropriate content including pornographic imagery, racist slurs, and harmful social stereotypes.”


The original article contains 1,221 words, the summary contains 256 words. Saved 79%. I'm a bot and I'm open source!

[–] cyd@lemmy.world -1 points 11 months ago

3,200 images is roughly 0.00005% of the dataset in question, obviously sucked in by mistake. The problematic images ought to be removed from the dataset, but this does not "contaminate" models trained on the dataset in any plausible way.
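For scale, the proportion is easy to sanity-check. This quick sketch assumes the dataset is LAION-5B at its commonly cited size of about 5.85 billion image-text pairs, and that the report flagged roughly 3,200 suspected images; neither figure is stated exactly in this thread.

```python
# Rough sanity check of the proportion discussed above.
# Assumptions (from public reporting, not this thread):
#   - flagged images in the Stanford report: ~3,200
#   - LAION-5B size: ~5.85 billion image-text pairs
flagged = 3_200
dataset_size = 5_850_000_000

fraction = flagged / dataset_size
print(f"fraction:   {fraction:.10f}")   # ~0.0000005470
print(f"percentage: {fraction * 100:.7f}%")  # ~0.0000547%
```

Under these assumptions the flagged images are on the order of five in ten million, a few orders of magnitude below one thousandth of a percent.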