this post was submitted on 26 Sep 2024
https://gitlab.com/christosangel/sapo3

  • Sapo3 is a suite of scripts-tools that can help the user convert a text file to an audio file.

  • It uses the tts-edge API for text-to-speech conversion.

  • Big txt files can be easily converted to audio books, using a wide range of customization capabilities.

When the user runs Sapo3, they will be presented with a menu of options:

  • o option: Fix name pronunciation with Fix Names

  • c option: Split text to chapters with Chapterize

  • v option: Convert File to audio

  • f option: Check every sentence outcome with Fix Audio

  • m option: Merging Audio Files

  • p option: Configuring Preferences
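As a rough illustration of how a single-key menu like the one above can be wired up, here is a minimal Python sketch. The keys and labels are taken from the list in the post; the `MENU` table and the `choose` helper are hypothetical names made up for this example, and Sapo3 itself may be structured quite differently.

```python
# Hypothetical sketch: the keys and labels mirror the menu in the post,
# but MENU and choose() are illustrative names, not Sapo3's actual code.
MENU = {
    "o": "Fix Names",
    "c": "Chapterize",
    "v": "Convert File to audio",
    "f": "Fix Audio",
    "m": "Merging Audio Files",
    "p": "Configuring Preferences",
}


def choose(key: str) -> str:
    """Return the menu label for a key, ignoring case and surrounding spaces."""
    normalized = key.strip().lower()
    if normalized not in MENU:
        raise ValueError(f"unknown option: {key!r}")
    return MENU[normalized]


if __name__ == "__main__":
    # Print the menu the way a user would see it.
    for key, label in MENU.items():
        print(f"{key}: {label}")
```

Keeping the options in one table means a new tool only needs one new entry, and unknown keys fail loudly instead of being silently ignored.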

[–] christos@lemmy.world 6 points 1 month ago (2 children)

I totally understand what you are saying. The original project used local text-to-speech, but the result was less than perfect: slower and CPU-costly.

You can check it out here https://gitlab.com/christosangel/sapo

Once a FOSS solution gets better and more usable, swapping out the tts conversion will not be a big deal.

[–] lime@feddit.nu 4 points 1 month ago (1 children)

shouldn't there at least be an option to use speech-dispatcher?

[–] christos@lemmy.world 2 points 1 month ago (1 children)

Do you mean an option to choose between various tts methods?

[–] lime@feddit.nu 3 points 1 month ago* (last edited 1 month ago) (1 children)

i believe that's what speech-dispatcher is; a uniform interface for tts systems.

[–] christos@lemmy.world 0 points 1 month ago (1 children)

speech-dispatcher

If you are referring to locally generated speech synthesis, the resulting audio, as far as I am concerned, generally sounds poorer and is more difficult to manage. However, you can check out the original project https://gitlab.com/christosangel/sapo, where the audio files are generated locally.

[–] lime@feddit.nu 4 points 1 month ago (1 children)

well speech-dispatcher has no synthesis component, you can plug in any tts engine that follows the interface. it's nice to have a choice in engine just by implementing the support. personally i use piper which i feel gives a pretty good performance.
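A minimal sketch of the "plug in any engine that follows the interface" idea, in Python. This is not speech-dispatcher's or piper's actual API: `TTSEngine`, `DummyEngine`, `ENGINES` and `convert` are hypothetical names, and the dummy engine just returns the text as bytes so the example runs without any tts software installed.

```python
from typing import Callable, Dict, Protocol


class TTSEngine(Protocol):
    """Hypothetical interface: anything with synthesize() counts as an engine."""

    def synthesize(self, text: str) -> bytes:
        ...


class DummyEngine:
    """Stand-in engine that returns the UTF-8 text as fake 'audio' bytes."""

    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


# Registry of engine factories; a real tool would add one entry per backend.
ENGINES: Dict[str, Callable[[], TTSEngine]] = {"dummy": DummyEngine}


def convert(text: str, engine_name: str) -> bytes:
    """Look up the engine by name and run the text through it."""
    try:
        factory = ENGINES[engine_name]
    except KeyError:
        raise ValueError(f"unknown tts engine: {engine_name!r}") from None
    return factory().synthesize(text)


if __name__ == "__main__":
    print(convert("hello", "dummy"))
```

With this shape, supporting another engine means writing one small class and registering it; the rest of the conversion code does not need to know which engine is in use.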

[–] christos@lemmy.world 4 points 1 month ago

piper

Indeed, piper performs very well. Thank you for the input. I will most certainly consider adding an option to select the tts engine in the near future; piper sounds totally worth it.

I'm somewhat surprised that there aren't a lot of good alternatives but uh, yeah, there don't seem to be.

I would have expected there to be at least one or two good TTS engines but I guess that assumption is quite wrong.

As to your other post, it's less that I care in any specific sense that Microsoft knows what I'm reading and more of an (admittedly irrational) dislike of providing anything that an ad company could maybe later use to sell me shit.