Maybe do some deduplication first? https://github.com/qarmin/czkawka
czkawka
That certainly would thin the pile and cut down on the manual labor. It uses a hashing method, IIRC. I used it to thin out duplicate audio files, being careful not to delete files that share a filename but are different recordings, such as a live rendition and a studio/album rendition of the same song. Jimi Hendrix, in my experience, is notorious for this. One of the things I dig about him is that he never really performed the same song the same way. He just went with a stream of consciousness and pulled it off quite well.
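For anyone curious what hash-based deduplication looks like in principle, here is a minimal sketch. It is not czkawka's actual implementation, just the general idea under the assumption that "duplicate" means byte-for-byte identical content. The names `file_digest` and `find_duplicates` are made up for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under root by content hash; keep only groups of 2+ files."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_hash[file_digest(path)].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Because the comparison is on content rather than filename, a live and a studio recording that happen to share a name end up in different groups, while identical copies under different names are grouped together. It only reports the groups; deciding what to delete is left to you.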
The problem I see here is that the best images are those that have been edited, i.e. rotated, cropped, and adjusted for brightness/contrast/saturation/white balance. I can have a thousand snaps from my camera, but they'll only look good once I pick them out and edit them.
I suppose, if you have a consistent camera with photos in consistent conditions, you could apply edits in bulk. And I know Google has automated crop/rotate/etc features in the Google Photos app, so maybe you could find a self-hosted tool smart enough to do that across all your photos, then make a review pass to pick out the good ones.
I am unsure how you would go about that without AI. You could probably write a Python script that hooks into an AI model and ranks photos by focus, brightness, saturation, or other such criteria. However, I suspect there would still be a fair amount of manual labor involved. That's an interesting request. I'll watch the thread and see what the outcome is.
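As a rough starting point, the simpler criteria mentioned here (focus, brightness, saturation) can be approximated with plain pixel arithmetic, before any AI model gets involved. The sketch below is only that: it assumes each image is already decoded into a grid of `(r, g, b)` tuples in the 0-255 range (loading real photos would need an imaging library, not shown), and all function names are hypothetical. Focus is estimated as the variance of a Laplacian filter, a common sharpness heuristic; it says nothing about whether a photo is actually "good".

```python
import colorsys
from statistics import mean, pvariance


def luma(pixel):
    """Approximate perceived lightness of an (r, g, b) pixel, 0-255."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def brightness(img):
    """Mean luma of the image, scaled to 0-1."""
    return mean(luma(p) for row in img for p in row) / 255.0


def saturation(img):
    """Mean HSV saturation of the image, 0-1."""
    return mean(
        colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1]
        for row in img
        for (r, g, b) in row
    )


def focus(img):
    """Variance of a 4-neighbour Laplacian on luma; higher suggests sharper."""
    gray = [[luma(p) for p in row] for row in img]
    h, w = len(gray), len(gray[0])
    lap = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(lap) if lap else 0.0


def rank(images):
    """Given a dict of name -> pixel grid, return names sorted sharpest first."""
    return sorted(images, key=lambda name: focus(images[name]), reverse=True)
```

Ranking on focus alone is an arbitrary choice for the sketch; in practice you would probably combine the scores or just use them to filter out obviously blurry or badly exposed shots, then do the review pass by hand.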