this post was submitted on 27 Feb 2024

In Bing Image Creator, DALL-E 3

Prompt: a Hawaiian shirt with subtle designs celebrating Hot Wheels 50th anniversary

Changing "subtle" to "hidden" didn't get much better results.

[–] j4k3@lemmy.world 6 points 9 months ago (1 children)

I never play with proprietary AI like this, so I don't know this model, but I have many image diffusion models I run offline.

I don't know how experienced you are with prompting, but making a few assumptions...

Shift how you think about prompting for an image. Think of the prompt like you are addressing an entity, the way you would when roleplaying with an LLM. If you really get to know an LLM through roleplaying, you'll learn that the model is trying to satisfy the fundamental needs of every character involved, including the one you play. It does all of this within the limits it has assumed (or that have been described) for each character.

Image diffusion works in much the same way. The prompt is talking to something akin to a roleplaying entity that can only respond by generating an image, but it is still a dynamic and emotional entity. When you say it "does not understand the word subtle", that is likely not the case. There is a configuration setting (that may or may not be available to you) that tells the model how strongly to follow the prompt. If you set it too high, you'll get terrible results. If you explore this in detail, you may notice these responses are like a vindictive little child retaliating for being punished unfairly. You must allow the entity its own sense of creative collaboration for its own satisfaction.
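For anyone curious what that setting does under the hood, here is a toy sketch. It assumes the setting is what Stable Diffusion style tools usually call "CFG scale" or "guidance scale" (classifier-free guidance), which is my assumption and not something confirmed for Bing or DALL-E 3. The function name `apply_guidance` and the numbers are made up for illustration; real models do this on large noise tensors at every denoising step.

```python
def apply_guidance(uncond, cond, scale):
    """Blend two noise predictions the way classifier-free guidance does.

    uncond: prediction made with an empty prompt
    cond:   prediction made with the user's prompt
    scale:  how hard to push toward the prompt (1.0 = no extra push)
    """
    # Start from the unprompted prediction and move along the
    # direction the prompt suggests, multiplied by the scale.
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy numbers standing in for a tiny slice of a noise prediction
# (hypothetical values, not taken from any real model).
uncond = [0.0, 0.5, -0.5]
cond = [0.5, 0.5, -1.0]

for scale in (1.0, 7.5, 30.0):
    print(scale, apply_guidance(uncond, cond, scale))
```

The difference between the prompted and unprompted predictions gets multiplied by the scale, so a very high value exaggerates everything the prompt touches. That is one plausible reason a too-strong setting gives worse images instead of more obedient ones.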

If you really want subtlety, the key is to describe what you really want with more passion and flair. There is a major emotional element to this, and it really requires the user to explore their own inner emotions at levels of thought they have never explored before, which is what it takes to communicate their ideas with more verbosity.

I only learned this because I connected a text roleplaying model to an image diffusion model in software someone else wrote and I modified. I monitored how the images were generated and noticed the prompt was simply long text. I started observing the effect in detail, and that led me here.

You can write a few keywords into an image prompt and it will try to create an emotional story to fill in the gaps, but you need to describe how the image makes you feel, and why, if you really want specificity in detail. This is hard to do IMO, and it takes a lot of practice along with a willingness to explore things like why you like a "subtle Hawaiian shirt" or what subtle really means in less subjective terms.

[–] Usernameblankface@lemmy.world 3 points 9 months ago* (last edited 9 months ago)

Hmm. Using Bing, I definitely do not have access to settings. I can only change my input to be longer and more descriptive of my idea.

In Bing Image Creator, DALL-E 3

Prompt: a Hawaiian shirt with a normal palm trees, bright colored flower blooms, and white background design. Artfully and playfully hidden in the white spaces and among the loud colors are many subtle, small, hidden hotwheels logos, designed to catch the eye on closer observation, but hidden at first glance.

It seems to have taken Hot Wheels as meaning the cars rather than the logos. But it's a lot better!