this post was submitted on 22 Feb 2024
488 points (96.2% liked)

Technology

Google apologizes for ‘missing the mark’ after Gemini generated racially diverse Nazis::Google says it’s aware of historically inaccurate results for its Gemini AI image generator, following criticism that it depicted historically white groups as people of color.

[–] BurningnnTree@lemmy.one 38 points 9 months ago* (last edited 9 months ago) (16 children)

No matter what Google does, people are going to come up with gotcha scenarios to complain about. People need to accept the fact that if you don't specify what race you want, then the output might not contain the race you want. This seems like such a silly thing to be mad about.

[–] UnderpantsWeevil@lemmy.world 2 points 9 months ago (11 children)

No matter what Google does, people are going to come up with gotcha scenarios to complain about.

American using Gemini: "Please produce images of the KKK, historically accurate Santa's Workshop Elves, and the board room of a 1950s auto company"

Also Americans: "AH!! AH!!!!! Minorities and Women!!!!!!! AAAAAHHH!!!!"

I mean, idk, man. Why do you need AI to generate an image of George Washington when you have thousands of images of him already at your disposal?

[–] FinishingDutch@lemmy.world 20 points 9 months ago (10 children)

Because sometimes you want an image of George Washington, riding a dinosaur, while eating a cheeseburger, in Paris.

Which you actually can’t do on Bing anyway, since its content warning stops you from generating anything with George Washington…

Ask it for a Founding Father though, it’ll even hand him a gat!

https://lemmy.world/pictrs/image/dab26e07-34c8-422e-944f-83d7f719ea2e.jpeg

[–] raptir@lemdro.id 10 points 9 months ago (2 children)

He's not even eating the cheeseburger, crap AI.

[–] chakan2@lemmy.world 3 points 9 months ago (1 children)

And it's not a Beyond Burger... it's promoting the genocide of cattle.

[–] FinishingDutch@lemmy.world 8 points 9 months ago (1 children)

Here’s one that was made, just for you, with specifically a VEGAN cheeseburger in the prompt :D

https://lemmy.world/pictrs/image/075a3e02-f76d-4541-83cb-d777f9befbc6.jpeg

[–] chakan2@lemmy.world 1 points 9 months ago
[–] FinishingDutch@lemmy.world 2 points 9 months ago (1 children)

Funnily enough, he’s not eating one in the other three images either. He’s holding an M16 in one, with the dinosaur partially as a hamburger (?). In the other two he’s merely holding the burger.

I assume if I change the word order around a bit, I could get him to enjoy that burger :D

[–] VoterFrog@lemmy.world 5 points 9 months ago (1 children)

This is the thing. There's an incredible number of inaccuracies in the picture, several of which flat out ignore the request in the prompt, and we laugh it off. But the AI makes his skin a little bit darker? Write the Washington Post! Historical accuracy! Outrage!

[–] FinishingDutch@lemmy.world 1 points 9 months ago (1 children)

Well, the tech is of course still young. And there's a distinct difference between:

A) User error: a prompt that isn't as good as it could be, for example because the user doesn't understand the 'order of operations' that the AI model likes to work in.

B) The tech flubbing things because it's new and constantly in development.

C) The owners behind the tech injecting their own modifiers into the AI model in order to get a more diverse result.

For example, in this case I understand the issue: the original prompt was 'image of an American Founding Father riding a dinosaur, while eating a cheeseburger, in Paris.' Doing it in one long sentence with several commas makes it harder for the AI to pin down the 'main theme', in my experience. Basically, it first thinks 'George on a dinosaur' with the burger and Paris as afterthoughts. But if you change the prompt around a bit to 'An American Founding Father is eating a cheeseburger. He is riding on a dinosaur. In the background of the image, we see Paris, France.', you end up with the correct result.

Basically the same input, but by simply swapping around the wording it got the correct result. Other 'inaccuracies' are of course to be expected, since I didn't really specify anything for the AI to go off. I didn't give it a timeframe for one, so it wouldn't 'know' not to have the Eiffel Tower and a modern handgun in it. Or that that flag would be completely wrong.

The problem is with C) where you simply have no say in the modifiers that they inject into any prompt you send. Especially when the companies state that they are doing it on purpose so the AI will offer a more diverse result in general. You can write the best, most descriptive prompt and there will still be an unexpected outcome if it injects their modifiers in the right place of your prompt. That's the issue.
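To make C) concrete, here is a minimal sketch of what server-side modifier injection could look like. Everything in it is an assumption for illustration: the function name `rewrite_prompt`, the placeholder modifier text, and the idea that modifiers get appended at the end of the prompt. It is not how Gemini or Bing actually do it, since neither publishes that.

```python
def rewrite_prompt(user_prompt: str, hidden_modifiers: list[str]) -> str:
    """Return the prompt the image model would actually receive.

    Hypothetical: real services do not document where or how
    they add their own modifiers.
    """
    if not hidden_modifiers:
        return user_prompt
    # The user's text is kept intact, but extra terms are appended
    # that the user never typed and cannot see or remove.
    return f"{user_prompt.rstrip()} ({', '.join(hidden_modifiers)})"


# The reworded prompt from the comment above.
user_prompt = (
    "An American Founding Father is eating a cheeseburger. "
    "He is riding on a dinosaur. "
    "In the background of the image, we see Paris, France."
)

# Placeholder modifier; the real wording is unknown.
final_prompt = rewrite_prompt(user_prompt, ["<service-chosen modifier>"])
print(final_prompt)
```

The point of the sketch is only that the string the model sees can differ from the string you wrote, no matter how carefully you word your part of it.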

[–] VoterFrog@lemmy.world 2 points 9 months ago

C is just a workaround for B and the fact that the technology has no way to identify and overcome harmful biases in its data set and model. This kind of behind-the-scenes prompt engineering isn't even unique to diversifying image output, either. It's a necessity for creating a product that is usable by the general consumer, at least until the technology evolves enough that it can incorporate those lessons directly into the model.

And so my point is, there's a boatload of problems that stem from the fact that this is early technology and the solutions to those problems haven't been fully developed yet. But while we are rightfully not upset that the system doesn't understand that lettuce doesn't go on the bottom of a burger, we're for some reason wildly upset that it tries to give our fantasy quasi-historical figures darker skin.
