this post was submitted on 27 Feb 2024
26 points (96.4% liked)

AI Generated Images


Community for AI image generation. Any models are allowed. Creativity is valuable! It is recommended to post the model used for reference, but not a rule.

No explicit violence, gore, or nudity.

This is not a NSFW community although exceptions are sometimes made. Any NSFW posts must be marked as NSFW and may be removed at any moderator's discretion. Any suggestive imagery may be removed at any time.

Refer to https://lemmynsfw.com/ for any NSFW imagery.

No misconduct: Harassment, Abuse or assault, Bullying, Illegal activity, Discrimination, Racism, Trolling, Bigotry.

AI Generated Videos are allowed under the same rules. Photosensitivity warning required for any flashing videos.

To embed images type:

“![](put image url in here)”

Follow all sh.itjust.works rules.



In Bing Image Creator, DALL-E 3

Prompt: a Hawaiian shirt with subtle designs celebrating Hot Wheels 50th anniversary

Changing "subtle" to "hidden" didn't get much better results.

top 10 comments
[–] j4k3@lemmy.world 6 points 9 months ago (1 children)

I never play with proprietary AI like this, so I don't know this model, but I have many image diffusion models I run offline.

I don't know how experienced you are with prompting, but making a few assumptions...

Shift how you think about prompting for an image. Think of the prompt as if you were addressing an entity, like roleplaying with an LLM. If you really get to know an LLM through roleplaying, you'll learn that the model is trying to satisfy the fundamental needs of every character involved, including the one you play. It does all of this within the limits it has assumed (or been given) for each character.

Image diffusion works in much the same way. The prompt is talking to something akin to a roleplaying entity that can only respond by generating an image, but it is still a dynamic and emotional entity. When you say it "does not understand the word subtle", that is likely not the case. There is a configuration setting (that may or may not be available to you) that tells the model how strongly to follow the prompt. If you set this too high, you'll get terrible results. If you explore this in detail you may notice these responses are like a vindictive little child retaliating for being punished unfairly. You must allow the entity its own sense of creative collaboration for its own satisfaction.
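The comment above doesn't name the setting. In Stable Diffusion style tools, the one that fits the description is usually called guidance scale (or CFG scale), so treating it as that is an assumption here. Below is a rough sketch of the classifier-free guidance arithmetic behind that setting, with made-up toy numbers standing in for the model's two predictions. The function name `apply_guidance` is hypothetical and not from any library.

```python
def apply_guidance(uncond, cond, scale):
    """Classifier-free guidance: start from the prediction made with an
    empty prompt and move toward the prompt-conditioned prediction,
    multiplied by `scale`."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy values only; a real model predicts large noise tensors, not 3 numbers.
uncond = [0.0, 1.0, 2.0]  # prediction with an empty prompt
cond = [1.0, 1.0, 0.0]    # prediction with the actual prompt

for scale in (1.0, 7.5, 20.0):
    # scale 1.0 reproduces `cond`; larger scales overshoot past it.
    print(scale, apply_guidance(uncond, cond, scale))
```

At scale 1.0 the result is just the prompt-conditioned prediction. Larger values push further and further past it, which is one plausible reading of why cranking the setting up tends to produce worse images.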

If you really want subtlety, the key is to describe what you really want with more passion and flair. There is a major emotional element to this, and it requires users to explore their own emotions more deeply than they may have before, so they can communicate their ideas in more detail.

I only learned this because I connected a text roleplaying model to an image diffusion model, using software someone else wrote and I modified. I monitored how the images were generated and noticed the prompt was simply long text. I started observing the effect in detail, and that led me here.

You can write a few keywords into an image prompt and it will try and create an emotional story to fill in the gaps, but you need to describe how the image makes you feel and why if you really want specificity in detail. This is hard to do IMO and it takes a lot of practice along with a willingness to explore things like why you like a "subtle Hawaiian shirt" or what subtle really means in less subjective terms.

[–] Usernameblankface@lemmy.world 3 points 9 months ago* (last edited 9 months ago)

Hmm. Using Bing, I definitely do not have access to settings; I can only change my input to be longer and more descriptive of my idea.

In Bing Image Creator, DALL-E 3

Prompt: a Hawaiian shirt with a normal palm trees, bright colored flower blooms, and white background design. Artfully and playfully hidden in the white spaces and among the loud colors are many subtle, small, hidden hotwheels logos, designed to catch the eye on closer observation, but hidden at first glance.

It seems to have taken Hot Wheels as meaning the cars rather than the logos. But it's a lot better!

[–] IvanOverdrive@lemm.ee 4 points 9 months ago (1 children)

What exactly is a subtle Hawaiian shirt design? Muted colors? Lots of negative space? Be very specific in what you want. I don't know what "subtle" is supposed to mean. An LLM sure isn't going to.

[–] Usernameblankface@lemmy.world 2 points 9 months ago (1 children)

Ok, I meant a regular Hawaiian shirt with subtle Hot Wheels logos... Like hidden Mickeys or something

[–] IvanOverdrive@lemm.ee 1 points 9 months ago

Try this: A Hawaiian shirt with hidden Hot Wheels logos incorporated into the leaves.

[–] Huckledebuck@sh.itjust.works 3 points 9 months ago

I know a lot of grown ass people that also don't know what subtle means.

[–] fruitycoder@sh.itjust.works 2 points 9 months ago

Guy Fieri levels of subtle

[–] altima_neo@lemmy.zip 1 points 9 months ago* (last edited 9 months ago) (1 children)

I agree with the other guy. I've been using Stable Diffusion for about a year now, and been playing with Bing for a few months.

When it comes to describing stuff, words like "subtle" don't really mean much. You gotta be real deliberate with your description and almost treat it like it's dumb. Simple descriptions, but detailed with quantifiable words. You can sprinkle a few qualitative words here and there, but don't rely on them to be the main driver of the composition. They can help make a blah image into a much nicer one, but they don't usually make as huge of a difference as more descriptive words.
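One way to picture "quantifiable words" is to force every detail into a count, a size, a thing, and a place before joining them into a prompt. This is only a sketch: `build_prompt` is a hypothetical helper, and the example wording is an illustration, not a prompt known to work in Bing or anywhere else.

```python
def build_prompt(subject, details):
    """Join a subject with concrete details into one prompt string.

    Each detail is a (count, size, thing, place) tuple, so vague words
    like "subtle" have to be replaced with something measurable.
    """
    parts = [subject]
    for count, size, thing, place in details:
        parts.append(f"{count} {size} {thing} {place}")
    return ", ".join(parts)


prompt = build_prompt(
    "a Hawaiian shirt with green palm leaves on a white background",
    [("three", "thumbnail-sized", "Hot Wheels logos", "hidden inside the leaf outlines")],
)
print(prompt)
```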

Now, trying to see if I can get Bing to do anything with the Hot Wheels logo makes me think it may be a bit overtrained, because it sure isn't budging!

[–] Usernameblankface@lemmy.world 2 points 9 months ago (1 children)

Ah, of course, quantifiable. It's a computer, it has to have a quantifiable description to work with.

Yeah, Hot Wheels has a LOT of images online, and they're never subtle or hiding anything.

[–] altima_neo@lemmy.zip 2 points 8 months ago

Yeah. It's too bad there's not as much control as you would have with Stable Diffusion. There it's a lot easier to tweak things to try and hide the logo within the image, for example.