I only recently discovered that my installation of Whisper was completely unaware that I had a GPU and was running entirely on the CPU. So even if you can't get a good LLM running locally, you might still be able to get everything turned into text transcripts for eventual future processing. :)
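Whisper runs on PyTorch, so if you want to check whether yours is doing the same thing, something like this works (a minimal sketch; passing `device` explicitly is just a way to avoid trusting the default):

```python
# Quick check: does this Whisper install actually see the GPU?
import torch
import whisper

print(torch.cuda.is_available())  # False means Whisper will quietly run on the CPU

# Pass the device explicitly rather than trusting the default:
device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("base", device=device)
```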
It's a bit technical; I haven't found any pre-packaged software to do what I'm doing yet.
First I installed https://github.com/openai/whisper, the speech-to-text model that OpenAI released back when they were less blinded by dollar signs. I wrote a Python script that used it to go through all of the audio files in the directory tree where I'm storing this stuff and produce a transcript, which I stored in a .json file alongside each one.
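The transcription pass looked roughly like this (a sketch; the exact model size, file extensions, and JSON field names here are illustrative):

```python
# Walk a directory tree, transcribe every audio file with Whisper,
# and save the transcript to a .json file next to it.
import json
from pathlib import Path

import whisper

AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}

model = whisper.load_model("medium")

for audio in Path("archive").rglob("*"):
    if audio.suffix.lower() not in AUDIO_EXTS:
        continue
    out = audio.with_suffix(".json")
    if out.exists():  # already transcribed on an earlier run
        continue
    result = model.transcribe(str(audio))
    out.write_text(json.dumps({"transcript": result["text"]}, indent=2))
```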
For the LLM, I installed https://github.com/LostRuins/koboldcpp/releases/ and used the https://huggingface.co/unsloth/Qwen3-30B-A3B-128K-GGUF model, which is just barely small enough to run smoothly on my RTX 4090. I wrote another Python script that methodically goes through those .json files that Whisper produced, takes the raw text of the transcript, and feeds it to the LLM with a couple of prompts explaining what the transcript is and what I'd like the LLM to do with it (write a summary, or write a bullet-point list of subject tags). Those get saved in the .json file too.
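The second script is basically a loop over those .json files hitting koboldcpp's local KoboldAI-compatible API. A rough sketch (default port 5001; the prompts and field names here are illustrative, not my exact ones):

```python
# Feed each Whisper transcript to the local LLM via koboldcpp's HTTP API,
# asking for a summary and a list of subject tags, and save the results.
import json
from pathlib import Path

import requests

def ask_llm(prompt: str) -> str:
    resp = requests.post(
        "http://localhost:5001/api/v1/generate",
        json={"prompt": prompt, "max_length": 512, "temperature": 0.7},
    )
    resp.raise_for_status()
    return resp.json()["results"][0]["text"]

for path in Path("archive").rglob("*.json"):
    data = json.loads(path.read_text())
    if "summary" in data:  # already processed
        continue
    transcript = data["transcript"]
    data["summary"] = ask_llm(
        f"The following is a transcript of a recording from my archive:\n\n"
        f"{transcript}\n\nWrite a brief summary of its contents:"
    )
    data["tags"] = ask_llm(
        f"Transcript:\n\n{transcript}\n\n"
        "Write a bullet-point list of subject tags for this transcript:"
    )
    path.write_text(json.dumps(data, indent=2))
```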
Most recently I've been experimenting with creating an index of the transcripts, using those LLM results and the Whoosh library in Python, so that I can do local searches of the transcripts by topic. I'm building towards something where I can literally tell it "Tell me about Uncle Pete" and it'll first search for the relevant transcripts and then feed those into the LLM with a prompt to extract the relevant information from them.
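The indexing side is straightforward with Whoosh. A sketch, assuming the .json layout from the earlier scripts (schema and field names are illustrative):

```python
# Build a searchable Whoosh index over the transcripts, summaries, and tags.
import json
from pathlib import Path

from whoosh.fields import ID, TEXT, Schema
from whoosh.index import create_in
from whoosh.qparser import QueryParser

schema = Schema(
    path=ID(stored=True, unique=True),
    content=TEXT,                # raw transcript text
    summary=TEXT(stored=True),   # LLM-written summary
    tags=TEXT(stored=True),      # LLM-written subject tags
)

Path("indexdir").mkdir(exist_ok=True)
ix = create_in("indexdir", schema)

writer = ix.writer()
for p in Path("archive").rglob("*.json"):
    data = json.loads(p.read_text())
    writer.add_document(
        path=str(p),
        content=data.get("transcript", ""),
        summary=data.get("summary", ""),
        tags=data.get("tags", ""),
    )
writer.commit()

# Step one of "Tell me about Uncle Pete": find the relevant transcripts.
with ix.searcher() as searcher:
    query = QueryParser("content", ix.schema).parse("Uncle Pete")
    for hit in searcher.search(query, limit=5):
        print(hit["path"], "-", hit["summary"][:80])
```

The hits from that search are what would get fed back into the LLM with the extraction prompt.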
If you don't find writing scripts for this sort of thing literally fun (as I do), then you may need to wait a bit for someone more capable and more focused than I am to create a user-friendly application that does all this. In the meantime, though, hoard that data. Storage is cheap.
Bear in mind, though, that the technology for dealing with these things is rapidly advancing.
I have an enormous amount of digital archives I've collected both from myself and from my now-deceased father. For years I just kept them stashed away. But about a year ago I downloaded the Whisper speech-to-text model from OpenAI and transcribed everything with audio into text form. I now have a Qwen3 LLM in the process of churning through all of those transcripts writing summaries of their contents and tagging them based on subject matter. I expect pretty soon I'll have something with good enough image recognition that I can turn loose on the piles of photographs to get those sorted out by subject matter too. Eventually I'll be able to tell my computer "give me a brief biography of Uncle Pete" and get something pretty good out of all that.
Yeah, boo AI, hallucinations, and so forth. This project has given me first-hand experience with what they're currently capable of and it's quite a lot. I'd be able to do a ton more if I wasn't restricting myself to what can run on my local GPU. Give it a few more years.
So regulate the uses of the technology. Don't ban it outright.
Those companies currently do their manipulating via the Internet and social media. Should the Internet and social media be banned outright? We're using social media to discuss this right now; should that discussion be suppressed?
"Someone might abuse it" is a reasonable concern. "Therefore nobody should be allowed to use it" is not a reasonable answer to that concern, IMO. We'd never have anything with that approach.
I'm not a Rust programmer, but I've released a lot of code under MIT in the past, and I picked it because it's so simple and flexible when it comes to reuse alongside code under other licenses.
I recall once, years ago, a user coming to me quite angry about how I was releasing code under a license that permitted corporations to "steal" it. Just for him I dual-licensed that particular bit of code under MIT and GPL. He never responded, so I guess that satisfied him? Whatever, I'm just happy he went away.
It's actually not ridiculous; the 1m is how tall those rooms are. Nice consistent ceiling height all through the house.
Plenty of sunshine and fresh air! Just keep them well-watered and rotate them every once in a while to make sure they grow straight.
Elon Musk decided years ago that they absolutely would not use lidar, back when lidar was expensive enough that a decision like that made economic sense to at least try to make work. Nowadays lidar is a lot cheaper, but for whatever reason Musk has drawn a line in the sand and refuses to back down on it.
Unlike many people online these days, I don't believe that Musk is some kind of sheer-luck, bought-his-way-into-success grifter; he has been genuinely involved in many of the decisions that made his companies grow. But this is one of the downsides of that (Cybertruck is another). He's forced through ideas that turned out to be amazing, but he's also forced through ideas that sucked. He seems to be having increasing trouble distinguishing the two.
You're still setting a high standard here. What counts as a "well trained" human, and how many SO commenters meet it? "Easier to teach" is complicated, too. It takes decades for a human to become well trained; an LLM can be trained in weeks. And an individual computer that'll be running the LLM is "trained" in minutes, since it just needs to load the model into memory. Once you have an LLM you can run as many instances of it as you want to spend money on.
"There's no guarantee LLM will get reliably better at everything"
Never said they would. I said they're as bad as they're ever going to be, which allows for the possibility that they don't get any better.
Even if they don't, though, they're still good enough to have killed Stack Overflow.
"It still makes some mistakes today that it did when introduced and nobody knows how to fix that yet"
And humans also make mistakes. Do we know how to fix that yet?
If they aren't comfortable with their Discord messages being public, perhaps they shouldn't have posted those messages in a public forum that the public can access.
You may know IPv6 is ridiculously bigger, but you don't *know* it.
There are enough IPv6 addresses that you could give 10^17 of them to every square millimeter of Earth's surface, or around 4×10^28 to every living human being. On a more cosmic scale, you could issue 4×10^15 addresses to every star in the observable universe.
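If you want to check that arithmetic (assuming 2^128 addresses, ~5.1×10^8 km² of Earth surface, ~8 billion people, and a rough 10^23 stars):

```python
# Back-of-the-envelope check of the figures above.
total = 2 ** 128                   # ~3.4e38 IPv6 addresses

earth_mm2 = 5.1e8 * 1e12           # ~5.1e8 km^2 of surface, converted to mm^2
print(f"{total / earth_mm2:.1e}")  # ~6.7e17 addresses per square millimeter

print(f"{total / 8e9:.1e}")        # ~4.3e28 per person, at ~8 billion people

print(f"{total / 1e23:.1e}")       # ~3.4e15 per star, at ~1e23 stars
```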
We're not going to run out by giving them to lightbulbs.