this post was submitted on 22 Jun 2026
Selfhosted
Maybe do some deduplication first? https://github.com/qarmin/czkawka
That certainly would thin the pile and cut down on the manual labor. It uses a hashing method, iirc. I used it to thin out duplicate audio files, being careful not to delete files that share a filename but are actually different recordings, such as a live rendition and a studio/album rendition of the same song. Jimi Hendrix, in my experience, is notorious for this. One of the things I dig about him is that he never really performed the same song the same way. He just went with a stream of consciousness and pulled it off quite well.
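For anyone curious what hash-based deduplication looks like in general, here is a rough Python sketch. It is not czkawka's actual implementation, just the basic idea as I understand it: group files by size, then by a SHA-256 of their contents. The function names `file_digest` and `find_duplicates` are made up for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: str) -> list[list[Path]]:
    """Group files under root whose contents are byte-for-byte identical."""
    # Cheap first pass: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        # Only hash files that share a size with at least one other file.
        by_hash = defaultdict(list)
        for p in same_size:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Because this compares contents and ignores names, two files both called `song.flac` are only reported if their bytes match, so a live take and a studio take with the same filename would not be flagged. The flip side is that the same recording encoded twice (different bitrate or tags) would also not be flagged by an exact hash like this.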