Google to pause Gemini AI image generation after refusing to show White people. (www.foxbusiness.com)

submitted 2 years ago by L4s@lemmy.world to c/technology@lemmy.world


Google will pause the image generation feature of its artificial intelligence model, Gemini, after the model refused to show images of White people when prompted.

[–] j4k3@lemmy.world 37 points 2 years ago (3 children)

So what? It means they overtrained, deployed, and then had to choose between reverting to a model with known issues and training a new one. They probably tried a temporary fix with a LoRA and it failed, so they have to wait for the next big version to finish training, and that can take weeks even on massive data-center-class hardware.

People here don't seem to have any fundamental understanding of AI. It is all static tensor math. There is no persistence or learning inside the model. Any illusion of persistence comes from the loader code that turns your text into numeric tokens. That is just standard code.
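To make the "no persistence" point concrete, here is a minimal toy sketch, not how any real model or loader is written. `VOCAB`, `tokenize`, `frozen_model`, and `ChatLoader` are all made-up names for illustration. The "model" is a pure function of its tokens, and the only memory is the loader re-sending the conversation so far.

```python
# Toy illustration only: every name and number here is hypothetical.
VOCAB = {"hello": 1, "world": 2, "again": 3}  # fixed after "training"


def tokenize(text):
    """Loader-side step: turn text into integer tokens (unknown words -> 0)."""
    return [VOCAB.get(word, 0) for word in text.lower().split()]


def frozen_model(tokens):
    """Same tokens in, same number out. Nothing is stored between calls."""
    return sum(tokens)


class ChatLoader:
    """The only place 'memory' lives: the loader re-sends the whole history."""

    def __init__(self):
        self.history = []

    def send(self, text):
        self.history.append(text)
        return frozen_model(tokenize(" ".join(self.history)))
```

Calling `frozen_model` twice with the same tokens always gives the same answer. The apparent memory in `ChatLoader.send` exists only because the loader keeps the history and feeds it back in on every call.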

There is no fundamental difference between an offline AI and a proprietary one like Gemini. One loader is data mining while the other is not. Training has a sweet spot. If too much John Oliver is added, everything will generate as John Oliver, like absolutely everything.

[–] Virulent@reddthat.com 45 points 2 years ago (1 children)

No, the problem is that they filter prompts and inject new parameters into prompts specifically to avoid creating white subjects. It's so bad that, when asked to generate a chessboard, Gemini would only make one with black pieces.

[–] j4k3@lemmy.world 5 points 2 years ago

That would not have caused them to go offline. Modifying a hash table takes 0 minutes of downtime. Likewise, a LoRA layer takes no downtime. The only reason to go completely offline is that they need to filter the base dataset and retrain from scratch. It means the error is so intertwined across so many neural layers that a simple extra filter layer is unable to address it.

The neural network is like a giant cloud, except that it has many more than 3 dimensions. Everything in the cloud is a vector relationship. If there is some easily traversed path that neural connections are gravitating towards, a simple modification, like a slice across that cloud, can alter that path ever so slightly to make it less easily traversed. This is something like a LoRA that can be tacked onto the model's math.
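As a rough sketch of what "tacked onto the model's math" means: a LoRA leaves the base weight matrix W frozen and adds a low-rank product on top, W + scale * (B @ A). The matrices below are made-up numbers, and `matmul` and `apply_lora` are illustrative helpers, not any library's API. Real implementations do this per layer on much larger tensors.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(w, b, a, scale=1.0):
    """Return W + scale * (B @ A) without modifying the frozen base W."""
    delta = matmul(b, a)
    return [[wi + scale * di for wi, di in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Frozen 3x3 base weights (hypothetical values).
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

# Rank-1 adapter: B is 3x1 and A is 1x3, so B @ A is 3x3
# but is described by only 6 numbers instead of 9.
B = [[0.5], [0.0], [0.0]]
A = [[0.0, 1.0, 0.0]]

W_adapted = apply_lora(W, B, A)
```

Because `W` itself is never changed, dropping the adapter just means using `W` again, which is why this kind of patch can be swapped in and out cheaply.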

However, if the undesirable behavior is due to something like all roads leading to the center of a giant metropolis, no slice across that cloud can subtly alter all of the neural paths without impacting adjacent data. It is all approximated floating point math where every concept and generation parameter is interrelated. Things like bunny rabbit and Playboy playmate are stored in the same tables. If you try to make all bunny rabbits black, you are also altering all playmates. That is simply because there is a minor relationship between these concepts, so they share a vector space inside some tensor tables. There is a very big difference between how the initial table values are created across all layers and how a modified layer works. When things go really bad, the only option is to retrain the whole thing from scratch.
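A toy illustration of the shared-vector-space point, with made-up 4-d embeddings in which axis 0 is a feature that two of the concepts happen to share. None of these numbers come from a real model, and `nudge_shared_axis` is a hypothetical stand-in for any edit aimed at one concept's feature.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical embeddings: "rabbit" and "playmate" both use axis 0.
embeddings = {
    "rabbit":   [1.0, 1.0, 0.0, 0.0],
    "playmate": [1.0, 0.0, 1.0, 0.0],
    "teapot":   [0.0, 0.0, 0.0, 1.0],
}


def nudge_shared_axis(vec, axis=0, delta=0.5):
    """Shift a vector along one axis in proportion to how much it uses that axis."""
    out = list(vec)
    out[axis] += delta * vec[axis]
    return out


# An edit intended for "rabbit" is applied to the shared axis, so it hits
# every concept that uses that axis and leaves unrelated ones alone.
edited = {name: nudge_shared_axis(vec) for name, vec in embeddings.items()}
```

Here the edit moves "rabbit" and "playmate" by the same amount along axis 0 while "teapot" is untouched, which is the side effect described above in miniature.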
