this post was submitted on 09 Jan 2024

AI Generated Images

mindbleach@sh.itjust.works 4 points 10 months ago

A network containing much of its training set is broken.

Deep networks do find heuristics. That's what all the layers are for. That's why it takes abundant training, instead of abundant storage. We already have computers that can give you the next word of a Stephen King novel... they're called e-books.

Tune AI just right and it'll know that Stephen King writes horror, in English - having distilled both concepts from raw data. Grammar is a demonstration of novel output. The fact these things can conjugate a verb (or count fingers on a hand) is deep magic. There are hints of them being able to do math, which you'd think is trivial for a supercomputer, except it'd have to be doing math roughly the same way you do math.

Anyway: a generative LLM should ideally contain about as much original data per subject as that subject's Wikipedia article. Key names, general premise, relevant dates, and then enough labels to cobble together some kind of bootleg.

The trouble comes from people making question-answering LLMs, which for obvious reasons are supposed to contain all the details necessary to pass a pop quiz. This is fundamentally at odds with making shit up. (They're also not very good at answering questions, so people should really focus on training a network that can evaluate text instead of training a network on that text.)

Image AI seems entirely focused on making shit up, which makes the blatant overfitting in Midjourney a head-scratcher. Knowing what Darth Vader looks like is a non-event. Everyone knows what Darth Vader looks like, and everyone knows he correlates strongly with laser-swords. Even being able to draw vaguely cinematic frames is whatever, because it turns out a lot of things look like a lot of other things. But some of those Dune examples are trying to pass a pop quiz. That's just incorrect behavior.

The draw-anything machine should absolutely be able to draw frames that look like they're from Denis Villeneuve's adaptation. Key words: look like. Floppy hair, muted colors, recognizable specific actors, sure. Probably even matching the framing of one shot or another, because again, movies look like movies. But if any specific frame is simply being reproduced, the process has gone wrong. That's simply not what it's for.
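
For anyone who wants the "look like" versus "reproduced" distinction in concrete terms, here's a rough sketch. It treats frames as tiny grayscale grids and calls an output "reproduced" if it's nearly pixel-identical to a known frame. The function names, the toy frames, and the threshold of 4.0 are all made up for illustration. Actual memorization checks would more likely compare perceptual hashes or embeddings, so this is only a toy.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average per-pixel absolute difference between two same-sized grayscale frames."""
    same_shape = len(frame_a) == len(frame_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(frame_a, frame_b)
    )
    if not same_shape:
        raise ValueError("frames must have the same dimensions")

    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    if count == 0:
        raise ValueError("frames must not be empty")
    return total / count


def looks_reproduced(generated, known_frames, threshold=4.0):
    """True if the generated frame is a near pixel-for-pixel copy of any known frame.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return any(mean_abs_diff(generated, frame) <= threshold for frame in known_frames)


# Hypothetical 2x2 grayscale frames (0-255), purely for demonstration.
TRAINING_FRAME = [[10, 200], [30, 40]]
NEAR_COPY = [[11, 199], [30, 42]]    # almost the same pixels: a reproduction
LOOKALIKE = [[60, 120], [90, 10]]    # similar idea, different pixels

print(looks_reproduced(NEAR_COPY, [TRAINING_FRAME]))   # True
print(looks_reproduced(LOOKALIKE, [TRAINING_FRAME]))   # False
```

The near-copy is the pop-quiz failure. The lookalike is what the draw-anything machine is supposed to produce.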