I prefer Waterfox, OpenAI can keep its Chat chippy tea browser.

[–] brucethemoose@lemmy.world 2 points 4 days ago* (last edited 4 days ago) (2 children)

> I have access to GLM 4.6 through a service but that’s the ~350B parameter model and I’m pretty sure that’s not what you’re running at home.

It is. I'm running this model, with hybrid CPU+GPU inference, specifically: https://huggingface.co/Downtown-Case/GLM-4.6-128GB-RAM-IK-GGUF
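
For the curious, "hybrid" just means some layers live in VRAM and the rest stay in system RAM. A minimal sketch of the same idea with llama-cpp-python (I'm actually on ik_llama.cpp; the file name and layer split below are placeholders, not my exact setup):

```python
# Minimal sketch of hybrid CPU+GPU inference via llama-cpp-python.
# Assumes `pip install llama-cpp-python` built with CUDA support;
# the GGUF path and layer count are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="GLM-4.6-IQ4_KS.gguf",  # hypothetical local quant file
    n_gpu_layers=20,   # offload as many layers as fit in VRAM
    n_ctx=8192,        # context window
    n_threads=8,       # CPU threads for the layers left in RAM
)

out = llm("Explain MoE offloading in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```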

You can likely run GLM Air on your 3060 desktop if you have 48GB+ RAM, or a smaller MoE easily. Heck, I'll make a quant just for you if you want.

Depending on the use case, I'd recommend ERNIE 4.5 21B (or the 28B vision variant) on your MacBook, or a Qwen 30B variant. Look for DWQ MLX quants, specifically: https://huggingface.co/models?sort=modified&search=dwq
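
If you go the MLX route, a DWQ quant is only a few lines with mlx-lm. A minimal sketch (the repo id is illustrative, so pick a real one from that search; the mlx-lm API also shifts slightly between versions):

```python
# Minimal sketch of running a DWQ MLX quant on Apple silicon.
# Assumes `pip install mlx-lm`; the repo id is an example, not a recommendation.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen3-30B-A3B-4bit-DWQ")
text = generate(model, tokenizer, prompt="One-line summary of MoE models:",
                max_tokens=64)
print(text)
```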

[–] MagicShel@lemmy.zip 1 points 4 days ago (1 children)

I'm going to upgrade my RAM shortly; I found a bad stick, so I'm down to 16GB at the moment. I'll see if I can swing that order this weekend.

[–] brucethemoose@lemmy.world 2 points 4 days ago* (last edited 4 days ago) (1 children)

To what?

64GB would be good, as that’s enough to fit GLM Air. There are some good 2x64GB 6000 MHz kits if you want to go to 128GB as well.
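
Back-of-envelope on why 64GB is the floor, assuming GLM Air is ~106B total parameters and a Q4-class GGUF averages ~4.5 bits/weight (both numbers approximate):

```python
# Rough sizing check for GLM Air in system RAM (+ GPU VRAM).
# All figures are approximate, back-of-envelope values.
total_params = 106e9            # ~106B total parameters (GLM Air)
bits_per_weight = 4.5           # typical effective rate for a Q4_K-class quant

model_gb = total_params * bits_per_weight / 8 / 1e9
print(f"model file: ~{model_gb:.0f} GB")          # ~60 GB

ram_gb, vram_gb = 64, 12        # 64GB system RAM + a 3060's 12GB VRAM
print("fits:", model_gb < ram_gb + vram_gb)       # True, with headroom for KV cache
```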

[–] MagicShel@lemmy.zip 1 points 4 days ago (1 children)

I'll see about 128, then, but I'll probably do 64. Just depends on cost. Any recs?

[–] brucethemoose@lemmy.world 2 points 4 days ago* (last edited 4 days ago) (1 children)

For DDR5? Depends how much you care about latency:

https://pcpartpicker.com/products/memory/#ff=ddr5&Z=131072002&B=1000000000%2C1250000000&sort=price&page=1

The $342 Crucial kit is kind of a no-brainer for 128GB. Its timings aren’t great when overclocked, but it’s 5600 MHz out of the box, low voltage, and significantly cheaper per gigabyte than many 64GB/96GB kits. See for yourself:

https://www.igorslab.de/en/when-size-is-all-that-matters-crucial-2x-64-gb-ddr5-5600-kit-with-new-micron-32-gbit-ics-in-test-incl-gaming-and-overclocking/4/

The overclockability matters even less if you’re on a Ryzen 7000 series CPU.

I got the 1.25V Flare X5 kit because I wanted tighter timings for sim games, albeit at a MUCH lower price ($390) than it goes for now.

RAM prices seem to be rising (the price of my kit has spiked by $200), so now is not a bad time to buy.

[–] MagicShel@lemmy.zip 1 points 4 days ago (1 children)

I appreciate it. I'm not going to overclock. I used to do that, but these days I value stability over maximum performance. I'll go with your suggestion, thank you.

[–] brucethemoose@lemmy.world 2 points 4 days ago* (last edited 4 days ago)

You don’t have to overclock, but I'd at least look at your mobo's settings. Many mobos apply really, really bad non-stock settings by default, especially with XMP memory.

An example: they might default to 1.3V VSOC, which is absolutely a “default overclock”; it will make your idle power skyrocket and can make the CPU unstable, since the Infinity Fabric doesn’t like voltages that high. For reference, I wouldn’t go over 1.2V VSOC myself, and I'd shoot for around 1.1V.

I'd recommend Buildzoid's videos:

https://youtu.be/dlYxmRcdLVw

https://youtu.be/Xcn_nvWGj7U

And Igor's Lab for general text info.

Also, if you don’t at least turn on XMP (i.e., run the RAM at its rated speed), the sticks will fall back to a slow JEDEC default and hurt your inference speed rather significantly.
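
Rough math on why: token generation is mostly memory-bandwidth bound, so your ceiling scales directly with RAM speed. A crude estimate, assuming GLM Air's ~12B active parameters per token at ~4.5 bits/weight (all numbers approximate):

```python
# Crude upper bound for CPU-side token generation speed.
# Generation is roughly memory-bandwidth bound: each token reads
# the active weights once. Figures are approximate.
def peak_tokens_per_sec(mt_per_sec: float, channels: int = 2) -> float:
    bandwidth = mt_per_sec * 1e6 * 8 * channels   # bytes/s; 64-bit channels
    active_bytes = 12e9 * 4.5 / 8                 # ~6.75 GB of active weights per token
    return bandwidth / active_bytes

print(f"XMP 6000 MT/s:   ~{peak_tokens_per_sec(6000):.1f} tok/s")
print(f"JEDEC 4800 MT/s: ~{peak_tokens_per_sec(4800):.1f} tok/s")
```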

[–] MagicShel@lemmy.zip 0 points 4 days ago

I'll have to check. It's a Pro, not an Air, but I think it's only 40GB total. I'm really new to Macs, so the memory situation is unclear to me. I requested it at work specifically for its ability to run local AI.
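
(A quick way to check, assuming psutil is installed; macOS reports unified memory the same way:)

```python
# Print total (unified) memory; works on macOS, Linux, and Windows.
# Assumes `pip install psutil`.
import psutil

print(f"total memory: {psutil.virtual_memory().total / 1e9:.0f} GB")
```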