AI Generated Images

Homie's gonna have a wild night (lemmy.world)

submitted 11 months ago by BackOnMyBS@lemmy.world to c/imageai@sh.itjust.works


Mechanite@lemmy.world 2 points 11 months ago

Is self-hosted TTS voice replication much better than it was about half a year ago? Last time I tried it, the output didn't sound like the person I had samples of, so I had to use ElevenLabs to get close to sounding right.

tal@lemmy.today 2 points 11 months ago* (last edited 11 months ago)

I mean, that's a subjective question. I think it's decent. Here are some samples:

https://nonint.com/static/tortoise_v2_examples.html

Last time I was running it, Tortoise TTS didn't have a way to directly annotate text with intonation or emotional cues. The best you can do is pull tricks like using a feature that lets you add words to a sentence that aren't actually spoken, in order to affect the emotional delivery of the words that are (e.g. sad words to make the spoken words come out in a sad voice).

Imagine the difference between someone saying gloatingly "none of you will survive" and someone saying it in an agonized voice.
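
To make that trick concrete, here's a minimal sketch. As far as I recall from the Tortoise README, text wrapped in square brackets conditions the generation but gets redacted from the spoken output; treat that as an assumption and check it against your version. The helper name `with_emotion_cue` and the cue phrases are made up for illustration, and the snippet only builds the prompt strings; it doesn't call Tortoise itself.

```python
def with_emotion_cue(cue: str, spoken: str) -> str:
    """Prefix `spoken` with a bracketed cue meant to steer tone, not be read aloud.

    Assumption: the TTS front end drops bracketed text from the audio.
    """
    cue = cue.strip()
    spoken = spoken.strip()
    if not cue:
        # No cue given: pass the sentence through unchanged.
        return spoken
    return f"[{cue}] {spoken}"


line = "None of you will survive."

# Same spoken words, two different (hypothetical) emotional cues.
gloating = with_emotion_cue("I am laughing with delight,", line)
agonized = with_emotion_cue("I am in terrible pain,", line)

print(gloating)
print(agonized)
```

You'd then hand each string to whatever generation call you normally use and compare the two takes by ear.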

I do wonder a bit whether it'd be possible to train it on a corpus that's been automatically annotated with the output of sentiment-analysis software, so that the model learns keywords one could then use to alter how sentences sound. I don't think this is so much a fundamental limitation of the software as it is a limitation of the training set.
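
A rough sketch of what that annotation pass might look like, purely as a thought experiment. The tiny `LEXICON` below is a hypothetical stand-in for a real sentiment model, and the `[label]` prefix format is my own assumption, not something Tortoise's training pipeline is known to use.

```python
import re

# Toy keyword lexicon: a hypothetical stand-in for real sentiment-analysis software.
LEXICON = {
    "sad": {"grief", "lost", "alone", "cry", "mourn"},
    "angry": {"hate", "furious", "betrayed", "rage"},
    "happy": {"joy", "delighted", "wonderful", "laugh"},
}


def label_sentiment(text: str) -> str:
    """Return the label whose keywords overlap the text most, or 'neutral'."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    scores = {label: len(words & vocab) for label, vocab in LEXICON.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"


def annotate(transcripts):
    """Prefix each transcript with its sentiment label as a bracketed keyword."""
    return [f"[{label_sentiment(t)}] {t}" for t in transcripts]


corpus = [
    "I lost everything and I am alone.",
    "What a wonderful day, I could laugh!",
    "The train leaves at nine.",
]
annotated = annotate(corpus)
for row in annotated:
    print(row)
```

The idea would be that, after training on transcripts tagged this way, the same bracketed keywords could be supplied at generation time to nudge the delivery. Whether that actually works is an open question.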