Best text to image generator (lemmy.world)

submitted 2 years ago by Billd111@lemmy.world to c/technology@lemmy.world

4 comments

I have used several different generators. What they all seem to have in common is that they don't always display what I am asking for. Example: if I am looking for a person in jeans and a t-shirt, I will get images of a person wearing totally different clothing, and it isn't consistent. Another example: if I want a full body picture, that command seems to be ignored, giving just waist up or just below the waist. Same goes if I ask for side views or back views. Sometimes they work. Sometimes they don't. More often they don't. I have also seen that none of the negative requests seem to actually work. If I ask for pictures of people and say I don't want cell phones or tattoos, like magic they have cell phones. Some have tattoos. I have noticed this in every single generator I have used. Am I asking for things the wrong way, or is the AI doing whatever it wants and not paying attention to my actual request?

Thanks

top 4 comments

[–] EdgeRunner@lemmy.dbzer0.com 1 points 2 years ago (1 children)

It's time to promote https://lemmy.dbzer0.com/c/stable_diffusion_art.

Very helpful and relaxing.

[–] CommunityLinkFixer@lemmings.world 5 points 2 years ago

Hi there! Looks like you linked to a Lemmy community using a URL instead of its name, which doesn't work well for people on different instances. Try fixing it like this: !stable_diffusion_art@lemmy.dbzer0.com

[–] silas@programming.dev 1 points 2 years ago* (last edited 2 years ago)

Talking to a text-to-image model is kinda like meeting someone from a different generation and culture that only half knows your language. You have to spend time with them to be able to communicate with them better and understand the “generational and cultural differences” so to speak.

Try checking out PromptHero or Civitai to see what prompts people are using to generate certain things.

Also, most text-to-image models are not made to be conversational and will work better if your prompts are similar to what you’d type in when searching for a photo on Google Images. For example, instead of a command like “Generate a photo for me of a…”, do “Disposable camera portrait photo, from the side, backlight…”
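To make that concrete, here's a rough sketch of how you could assemble a keyword-style prompt and keep the things you don't want in a separate list. Stable Diffusion front ends usually have a separate "negative prompt" field for that, and writing "no tattoos" in the main prompt tends to backfire because the model still sees the word "tattoos". The `build_prompts` helper and its parameter names are made up for illustration; they aren't part of any tool.

```python
def build_prompts(subject, details=(), avoid=()):
    """Return (prompt, negative_prompt) as comma-separated keyword strings.

    Hypothetical helper for illustration only, not part of any generator's API.
    """
    # Positive prompt: subject first, then short descriptive keywords,
    # written like an image search rather than a sentence or command.
    prompt = ", ".join([subject, *details])
    # Negative prompt: list only the unwanted things, without "no" or "don't".
    negative_prompt = ", ".join(avoid)
    return prompt, negative_prompt


prompt, negative_prompt = build_prompts(
    "full body photo of a person",
    details=["blue jeans", "white t-shirt", "from the side"],
    avoid=["cell phone", "tattoos"],
)
print(prompt)           # full body photo of a person, blue jeans, white t-shirt, from the side
print(negative_prompt)  # cell phone, tattoos
```

You'd paste the first string into the prompt box and the second into the negative prompt box, if your generator has one. It won't guarantee the result, but it usually works better than putting the "don't" requests in the main prompt.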

[–] altima_neo@lemmy.zip 0 points 2 years ago* (last edited 2 years ago)

DALL-E 3 seems to be the easiest to use and, in my experience, does pretty well with prompts like that.

The issue is that it's quick to throttle you after a while and it's heavily censored for seemingly innocuous words.

Stable Diffusion can be a bit dumb sometimes, occasionally giving you an image of a person wearing denim everything. But if you're willing to put in the time to learn Stable Diffusion, and you're able to run it on your PC, it gives you a lot of freedom and unlimited image output as fast as your GPU can handle. You could use the "regional prompter" extension to mark zones where you want jeans to be, a specific shirt, etc. Or use inpaint to regenerate a masked area. It's more work, but it's very flexible and controllable.
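If the "masked area" part sounds abstract: a mask is just a second image the same size as your picture that marks which pixels to redo. Here's a tiny sketch in plain Python that builds a rectangular one. It assumes the common convention where white (255) means "regenerate this" and black (0) means "keep this"; check your tool, since some flip it. The `make_rect_mask` function is hypothetical, and in practice you'd paint the mask in the UI rather than write code.

```python
def make_rect_mask(width, height, box):
    """Return a height x width grid of 0/255 values.

    Assumed convention: 255 = area to regenerate, 0 = area to keep.
    box is (left, top, right, bottom), with right and bottom exclusive.
    """
    left, top, right, bottom = box
    return [
        [255 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


# Mark the lower middle of a tiny 8x8 "image", e.g. where the jeans should go.
mask = make_rect_mask(8, 8, box=(2, 4, 6, 8))
for row in mask:
    print("".join("#" if value else "." for value in row))
```

Everything under the `#` cells gets redrawn from your prompt, and the rest of the picture is left alone. That's why inpainting is handy for fixing just the clothes without rerolling the whole image.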