this post was submitted on 21 May 2024
153 points (98.7% liked)

Not The Onion

12368 readers
316 users here now

Welcome

We're not The Onion! Not affiliated with them in any way! Not operated by them in any way! All the news here is real!

The Rules

Posts must be:

  1. Links to news stories from...
  2. ...credible sources, with...
  3. ...their original headlines, that...
  4. ...would make people who see the headline think, “That has got to be a story from The Onion, America’s Finest News Source.”

Comments must abide by the server rules for Lemmy.world and generally abstain from trollish, bigoted, or otherwise disruptive behavior that makes this community less fun for everyone.

And that’s basically it!

founded 1 year ago
MODERATORS
 

Of the 100 results, only three of them are common enough to be used in everyday conversations; everything else consisted of words and expressions used specifically in the contexts of either gambling or pornography. The longest token, lasting 10.5 Chinese characters, literally means “_free Japanese porn video to watch.” Oops. [Tokens are part of text ChatGPT combine to generate replies.]

Users have also found that these tokens can be used to break the LLM, either getting it to spew out completely unrelated answers or, in rare cases, to generate answers that are not allowed under OpenAI’s safety standards.

In his tests, which Geng chooses not to share with the public, he says he can see GPT-4o generating the answers line by line. But when it almost reaches the end, another safety mechanism kicks in, detects unsafe content, and blocks it from being shown to the user.

“The robustness of visual input is worse than text input in multimodal models,” says Geng, whose research focus is on visual models. Filtering a text data set is relatively easy, but filtering visual elements will be even harder. “The same issue with these Chinese spam tokens could become bigger with visual tokens,” he says.

you are viewing a single comment's thread
view the rest of the comments
[–] Luisp@lemmy.dbzer0.com 24 points 6 months ago

Horny gpt 2 strikes again