FaceDeer

joined 2 years ago
[–] FaceDeer@fedia.io -5 points 3 months ago (17 children)

Lots of people in this thread are basically saying "I will voluntarily yield those job opportunities to people willing to use new technology."

Thanks, I guess?

[–] FaceDeer@fedia.io 45 points 3 months ago (2 children)

You may know IPv6 is ridiculously bigger, but you don't *know* it.

There are about 3.4×10^38 IPv6 addresses (2^128). That's enough to hand out over 10^17 of them to every square millimeter of Earth's surface, or about 4×10^28 to every living human being. On a more cosmic scale, you could issue a few times 10^15 addresses to every star in the observable universe.
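If you want to sanity-check those figures yourself, it's a quick back-of-the-envelope calculation; here's a rough sketch in Python (the surface area, population, and star count are all approximations):

```python
total = 2 ** 128                 # all IPv6 addresses, ~3.4e38

earth_mm2 = 510e6 * 1e6 * 1e6    # ~510 million km^2 of surface, converted to mm^2
humans = 8e9                     # ~8 billion people
stars = 1e23                     # very rough estimate of stars in the observable universe

print(f"per mm^2 of Earth: {total / earth_mm2:.1e}")  # ~6.7e17
print(f"per living human:  {total / humans:.1e}")     # ~4.3e28
print(f"per star:          {total / stars:.1e}")      # ~3.4e15
```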

We're not going to run out by giving them to lightbulbs.

[–] FaceDeer@fedia.io 4 points 3 months ago (1 children)

I only just recently discovered that my installation of Whisper was completely unaware that I had a GPU, and was running entirely on my CPU. So even if you can't get a good LLM running locally you might still be able to get everything turned into text transcripts for eventual future processing. :)
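For anyone wondering how to check: assuming the standard openai-whisper package and PyTorch, something like this will tell you whether it's actually using the GPU (and lets you pick the device explicitly instead of relying on the default):

```python
import torch
import whisper

# If this prints False, Whisper silently falls back to the CPU.
print("CUDA available:", torch.cuda.is_available())

device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("medium", device=device)  # model size is just an example
print("Loaded Whisper on:", device)
```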

[–] FaceDeer@fedia.io 12 points 3 months ago (4 children)

It's a bit technical; I haven't found any pre-packaged software that does what I'm doing yet.

First I installed https://github.com/openai/whisper , the speech-to-text model that OpenAI released back when they were less blinded by dollar signs. I wrote a Python script that uses it to go through all of the audio files in the directory tree where I'm storing this stuff and produce a transcript, which gets stored in a .json file alongside each audio file.
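Roughly speaking, the transcription pass looks something like this (the root path, model size, and file layout here are placeholders, not my exact script):

```python
import json
from pathlib import Path

import whisper

AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}
ROOT = Path("/path/to/archive")          # wherever the audio tree lives

model = whisper.load_model("medium")

for audio in ROOT.rglob("*"):
    if audio.suffix.lower() not in AUDIO_EXTS:
        continue
    out = audio.with_suffix(".json")
    if out.exists():                     # don't redo files already transcribed
        continue
    result = model.transcribe(str(audio))
    out.write_text(json.dumps({"file": audio.name, "text": result["text"]}, indent=2))
```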

For the LLM, I installed https://github.com/LostRuins/koboldcpp/releases/ and used the https://huggingface.co/unsloth/Qwen3-30B-A3B-128K-GGUF model, which is just barely small enough to run smoothly on my RTX 4090. I wrote another Python script that methodically goes through those .json files that Whisper produced, takes the raw text of the transcript, and feeds it to the LLM with a couple of prompts explaining what the transcript is and what I'd like the LLM to do with it (write a summary, or write a bullet-point list of subject tags). Those get saved in the .json file too.
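The second script talks to koboldcpp over its local HTTP API. In rough outline it's something like this (the endpoint, port, prompts, and field names are illustrative; check your koboldcpp version's API docs for the exact details):

```python
import json
from pathlib import Path

import requests

API_URL = "http://localhost:5001/api/v1/generate"   # koboldcpp's default local endpoint
ROOT = Path("/path/to/archive")

def ask(prompt: str) -> str:
    resp = requests.post(API_URL, json={"prompt": prompt, "max_length": 512}, timeout=600)
    resp.raise_for_status()
    return resp.json()["results"][0]["text"].strip()

for meta_file in ROOT.rglob("*.json"):
    meta = json.loads(meta_file.read_text())
    if "summary" in meta:                            # already processed
        continue
    transcript = meta["text"]
    meta["summary"] = ask(
        "The following is a transcript of a family audio recording.\n\n"
        f"{transcript}\n\nWrite a short summary of its contents:")
    meta["tags"] = ask(
        "The following is a transcript of a family audio recording.\n\n"
        f"{transcript}\n\nWrite a bullet-point list of subject tags for it:")
    meta_file.write_text(json.dumps(meta, indent=2))
```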

Most recently I've been experimenting with creating an index of the transcripts using those LLM results and the Whoosh library in Python, so that I can do local searches of the transcripts by topic. I'm building towards something where I can literally tell it "Tell me about Uncle Pete" and it'll first search for the relevant transcripts and then feed them into the LLM with a prompt to extract the relevant information.
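The indexing side is still a work in progress, but the general shape of it with Whoosh is something like this (field names and paths are just how I happen to have it laid out):

```python
import json
from pathlib import Path

from whoosh.index import create_in
from whoosh.fields import Schema, TEXT, ID
from whoosh.qparser import MultifieldParser

ROOT = Path("/path/to/archive")
INDEX_DIR = Path("indexdir")
INDEX_DIR.mkdir(exist_ok=True)

schema = Schema(path=ID(stored=True, unique=True),
                text=TEXT(stored=True),
                summary=TEXT(stored=True),
                tags=TEXT(stored=True))
ix = create_in(str(INDEX_DIR), schema)

writer = ix.writer()
for meta_file in ROOT.rglob("*.json"):
    meta = json.loads(meta_file.read_text())
    writer.add_document(path=str(meta_file),
                        text=meta.get("text", ""),
                        summary=meta.get("summary", ""),
                        tags=meta.get("tags", ""))
writer.commit()

# Later: find the transcripts relevant to a topic, then hand those to the LLM.
with ix.searcher() as searcher:
    query = MultifieldParser(["text", "summary", "tags"], ix.schema).parse("Uncle Pete")
    for hit in searcher.search(query, limit=5):
        print(hit["path"], "->", hit["summary"][:80])
```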

If you don't find the idea of writing scripts for that sort of thing genuinely fun (I do), then you may need to wait a bit for someone more capable and more focused than I am to create a user-friendly application that does all this. In the meantime, though, hoard that data. Storage is cheap.

[–] FaceDeer@fedia.io 45 points 3 months ago (10 children)

Bear in mind, though, that the technology for dealing with these things is rapidly advancing.

I have an enormous collection of digital archives, both my own and my now-deceased father's. For years I just kept them stashed away. But about a year ago I downloaded the Whisper speech-to-text model from OpenAI and transcribed everything with audio into text form. I now have a Qwen3 LLM churning through all of those transcripts, writing summaries of their contents and tagging them by subject matter. I expect pretty soon I'll have something with good enough image recognition that I can turn it loose on the piles of photographs to get those sorted by subject matter too. Eventually I'll be able to tell my computer "give me a brief biography of Uncle Pete" and get something pretty good out of all that.

Yeah, boo AI, hallucinations, and so forth. This project has given me first-hand experience with what they're currently capable of and it's quite a lot. I'd be able to do a ton more if I wasn't restricting myself to what can run on my local GPU. Give it a few more years.

[–] FaceDeer@fedia.io 9 points 3 months ago (1 children)

So regulate the uses of the technology. Don't ban it outright.

Those companies are currently doing their manipulation using the Internet and social media. Should the Internet and social media be banned outright? We're using social media to discuss this right now; should that discussion be suppressed?

[–] FaceDeer@fedia.io 14 points 3 months ago (3 children)

"Someone might abuse it" is a reasonable concern. "Therefore nobody should be allowed to use it" is not a reasonable answer to that concern, IMO. We'd never have anything with that approach.

[–] FaceDeer@fedia.io 4 points 3 months ago

I'm not a Rust programmer, but I've released a lot of code under MIT in the past, and I picked it because it's so simple and flexible when it comes to reusing the code alongside code under other licenses.

I recall once, years ago, a user coming to me quite angry about how I was releasing code under a license that permitted corporations to "steal" it. Just for him I dual-licensed that particular bit of code under MIT and GPL. He never responded so I guess that satisfied him? Whatever, I'm just happy he went away.

[–] FaceDeer@fedia.io 4 points 3 months ago

It's actually not ridiculous, the 1m is how tall those rooms are. Nice consistent ceiling height all through the house.

[–] FaceDeer@fedia.io 4 points 3 months ago

Plenty of sunshine and fresh air! Just keep them well-watered and rotate them every once in a while to make sure they grow straight.
