this post was submitted on 30 Aug 2025

1U mini PC for AI? (startrek.website)

submitted 2 months ago by nagaram@startrek.website to c/selfhosted@lemmy.world

My rack is finished for now (because I'm out of money).

Last time I posted I had some jank cables going through the rack and now we're using patch panels with color coordinated cables!

But as is tradition, I'm thinking about upgrades and I'm looking at that 1U filler panel. A mini PC with a 5060ti 16gb or maybe a 5070 12gb would be pretty sick to move my AI slop generating into my tiny rack.

I'm also thinking about the PI cluster at the top. Currently that's running a Kubernetes cluster that I'm trying to learn on. They're all PI4 4GB, so I was going to start replacing them with PI5 8/16GB. Would those be better price/performance for mostly coding tasks? Or maybe a discord bot for shitposting.

Thoughts? MiniPC recs? Wanna bully me for using AI? Please do!

[–] brucethemoose@lemmy.world 3 points 2 months ago* (last edited 2 months ago) (1 children)

If you can swing $2K, get one of the new mini PCs with an AMD 395 and 64GB+ RAM (ideally 128GB).

They're tiny, lower power, and the absolute best way to run the new MoEs like Qwen3 or GLM Air for coding. TBH they would blow a 5060 TI out of the water, as having a ~100GB VRAM pool is a total game changer.

I would kill for one on an ITX mobo with an x8 slot.

[–] princessnorah@lemmy.blahaj.zone 4 points 2 months ago (2 children)

I think the mainboard from the Framework Desktop meets your requirements: https://frame.work/au/en/products/framework-desktop-mainboard-amd-ryzen-ai-max-300-series?v=FRAFMK0002

[–] MalReynolds@piefed.social 3 points 2 months ago (1 children)

Pretty sure that's a x4 PCIe slot (admittedly PCIe 5x4, but not many video cards speak PCIe5), would totally trade a usb4 for a x8, but these laptop chips are pretty constrained lanes wise.

[–] brucethemoose@lemmy.world 4 points 2 months ago* (last edited 2 months ago) (1 children)

It's PCIe 4.0 :(

but these laptop chips are pretty constrained lanes wise

Indeed. I read Strix Halo only has 16 PCIe 4.0 lanes in addition to its USB4, which is reasonable given this isn't supposed to be paired with discrete graphics. But I'd happily trade an NVMe slot (still leaving one) for x8.

One of the links to a CCD could theoretically be wired to a GPU, right? Kinda like how EPYC can switch its IO between infinity fabric for 2P servers, and extra PCIe in 1P configurations. But I doubt we'll ever see such a product.
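For a sense of scale on the x4 vs x8 question, here is a rough back-of-the-envelope sketch. It assumes the published per-lane signalling rates (16 GT/s for PCIe 4.0, 32 GT/s for PCIe 5.0) and 128b/130b line encoding, and it ignores packet and protocol overhead, so real-world throughput will be somewhat lower. The function and table names are made up for illustration.

```python
# Raw per-lane signalling rate in GT/s (assumed from the published PCIe rates).
GT_PER_S = {"4.0": 16.0, "5.0": 32.0}

# PCIe 3.0 and later use 128b/130b encoding: 128 payload bits per 130 line bits.
ENCODING_EFFICIENCY = 128 / 130


def pcie_gb_per_s(gen: str, lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S[gen] * ENCODING_EFFICIENCY / 8 * lanes


if __name__ == "__main__":
    for gen, lanes in [("4.0", 4), ("4.0", 8), ("5.0", 4)]:
        print(f"PCIe {gen} x{lanes}: ~{pcie_gb_per_s(gen, lanes):.1f} GB/s")
```

Under those assumptions a 4.0 x4 slot tops out just under 8 GB/s each way, and a 4.0 x8 slot would land in the same place as a 5.0 x4 slot, at roughly double that.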

[–] MalReynolds@piefed.social 2 points 2 months ago (1 children)

It's PCIe 4.0 :(

Boo! Silly me thinking DDR5 implied PCIe5, what a shame.

Feels like they're testing the waters with Halo, hopefully a loud 'water's great, dive in' signal gets through and we get something a bit fitter for desktop use, maybe with more memory (and bandwidth) next gen. Still, gotta love the power usage, makes for one hell of a NAS / AI inference server (and inference isn't that fussy about PCIe bandwidth, hell eGPU works fine as long as the model / expert fits in VRAM).

[–] brucethemoose@lemmy.world 2 points 2 months ago* (last edited 2 months ago) (1 children)

Rumor is its successor is 384 bit, and after that their designs are even more modular:

https://www.techpowerup.com/340372/amds-next-gen-udna-four-die-sizes-one-potential-96-cu-flagship

Hybrid inference prompt processing actually is pretty sensitive to PCIe bandwidth, unfortunately, but again I don’t think many people intend on hanging an AMD GPU off these Strix Halo boards, lol.

[–] princessnorah@lemmy.blahaj.zone 1 points 2 months ago (1 children)

I don't know that that is necessarily true. Having a gaming machine that can play any game and dynamically switches between a high-power draw dGPU and a genuinely capable low-power draw iGPU actually sounds amazing. That's always been possible with every laptop that has a dGPU but their associated iGPU has often been bottom of the barrel bc "why would you use it" for intensive tasks. But a "desktop" build as a lounge room gaming PC, where you can throw whatever at it and it'll run as quietly as it can, while being able to play AAAs at 4K60, sounds amazing.

[–] brucethemoose@lemmy.world 2 points 2 months ago* (last edited 2 months ago) (1 children)

Eh, actually that’s not what I had in mind:

Discrete desktop graphics idle hot. I think my 3090 uses at least 40W doing literally nothing.
It’s always better to run big dies slower than small dies at high clockspeeds. In other words, if you underclocked a big desktop GPU to 1/2 its peak clockspeed, it would use less than a fourth of the energy and run basically inaudible… and still be faster than the iGPU. So why keep a big iGPU around?
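The "half the clock, less than a fourth of the power" intuition can be sketched with the textbook dynamic power approximation, P ∝ V² · f. This is an idealized model, not a measurement: it ignores static leakage and idle draw, and the voltage fractions used below are illustrative assumptions rather than figures for any real card.

```python
from typing import Optional


def relative_dynamic_power(clock_fraction: float,
                           voltage_fraction: Optional[float] = None) -> float:
    """Dynamic power relative to stock, using the P ~ V^2 * f approximation.

    If no voltage fraction is given, assume voltage can be lowered in
    proportion to clock (an optimistic idealization).
    """
    if voltage_fraction is None:
        voltage_fraction = clock_fraction
    return voltage_fraction ** 2 * clock_fraction


# Ideal case: voltage drops in step with clock, so power goes as f^3.
half_clock_ideal = relative_dynamic_power(0.5)

# More conservative case: voltage only drops to 70% (hypothetical voltage floor).
half_clock_floor = relative_dynamic_power(0.5, voltage_fraction=0.7)

if __name__ == "__main__":
    print(f"half clock, ideal voltage scaling: {half_clock_ideal:.3f}x power")
    print(f"half clock, 0.7x voltage:          {half_clock_floor:.3f}x power")
```

In this model, halving the clock without touching voltage only halves dynamic power. Getting below a quarter depends on how far the voltage can come down with it.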

My use case was multitasking and compute stuff. EG game/use the discrete GPU while your IGP churns away running something. Or combine them in some workloads.

Even the 395 by itself doesn’t make a ton of sense for an HTPC because AMD slaps so much CPU on it. It’s way too expensive and makes it power thirsty. A single CCD (8 cores instead of 16) + the full integrated GPU would be perfect and lower power, but AMD inexplicably does not offer that.

Also, I’ll add that my 3090 is basically inaudible next to a TV… key is to cap its clocks, and the fans barely even spin up.

[–] princessnorah@lemmy.blahaj.zone 1 points 2 months ago (1 children)

That's all valid for your usecase, but you were saying that you didn't think many people would use it that way at all and that's what I was saying I didn't agree with. As well, a HTPC is kind of a different use case altogether to a lounge room gaming computer. There's some overlap for sure, but if you want zero compromise gaming then you're going to want all that CPU.

[–] brucethemoose@lemmy.world 1 points 2 months ago* (last edited 2 months ago) (1 children)

Eh, but you’d be way better off with an X3D CPU in that scenario, which is significantly faster in games, about as fast outside them (unless you’re DRAM bandwidth limited) and more power efficient (because they clock relatively low).

You’re right about the 395 being a fine HTPC machine by itself.

But I’m also saying even an older 7900, 4090 or whatever would be way lower power at the same performance as the 395's IGP, and whisper quiet in comparison. Even if cost is no object. And if that’s the case, why keep a big IGP at all? It just doesn’t make sense to pair them without some weirdly specific use case that can use both at once, or that a discrete GPU literally can’t do because it doesn’t have enough VRAM like the 395 does.

[–] princessnorah@lemmy.blahaj.zone 1 points 2 months ago* (last edited 2 months ago) (1 children)

Correct me if I'm wrong here, but is the 395 not leagues ahead of something like a 4090 when it comes to performance per watt? Here's a comparison graph of a 4090 against the Radeon 8060S, which is the 395's iGPU:

Source.

Now that's apparently running at the 395's default TDP of 55W so that includes the CPU power. It's also clear that a 4090 can trounce it on sheer performance when needed. But if we take a look at this next graph:

Source.

This shows that a 4090 has a third of the performance while still running at 130W, more than twice the TDP of the entire 395 APU.

Edit: This was buried in the comments under that second graph but here's the points scored per Watt on that benchmark: 130W = 66 / 180W = 85 / 220W = 92 / 270W = 84 / 330W = 74 / 420W = 59 / 460W = 55 and this clearly shows the sweet spot for a 4090 is 220W.
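A quick sketch to sanity-check those figures, using only the points-per-watt values quoted above. It assumes that multiplying points per watt by the power limit gives back the total benchmark score, and that the 460W entry is the full-power reference. Both are assumptions on my part, not something stated under the graph.

```python
# Points per watt at each 4090 power limit, as quoted in the edit above.
points_per_watt = {130: 66, 180: 85, 220: 92, 270: 84, 330: 74, 420: 59, 460: 55}

# Power limit with the best efficiency.
sweet_spot = max(points_per_watt, key=points_per_watt.get)

# Implied total score at each limit (assumption: score = points/W * W).
total_points = {watts: watts * ppw for watts, ppw in points_per_watt.items()}

# How much of the implied 460W score is left at the 130W limit.
ratio_130_vs_460 = total_points[130] / total_points[460]

if __name__ == "__main__":
    print(f"most efficient power limit: {sweet_spot}W")
    for watts, score in total_points.items():
        print(f"{watts:>3}W -> ~{score} points")
    print(f"130W keeps ~{ratio_130_vs_460:.0%} of the 460W score")
```

Under those assumptions, 220W is the efficiency peak and the 130W limit keeps roughly a third of the implied 460W score, which lines up with the "third of the performance" reading of the graph.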

[–] brucethemoose@lemmy.world 2 points 2 months ago (1 children)

Oh wow, that's awesome! I didn't know folks ran TDP tests like this, just that my old 3090 seems to have a minimum sweet spot around that same ~200W based on my own testing, but I figured the 4000 or 5000 series might go lower. Apparently not, at least for the big die.

I also figured the 395 would draw more than 55W! That's also awesome! I suspect newer, smaller GPUs like the 9000 or 5000 series still make the value proposition questionable, but still you make an excellent point.

And for reference, I just checked, and my dGPU hovers around 30W idle with no display connected.

[–] princessnorah@lemmy.blahaj.zone 2 points 2 months ago (1 children)

You can boost the 395 up to 120W, which might be where Framework is pushing it too, but those benchmarks are labelled 55W and that's what AMD says is the default TDP without adjustment. I'd love to see how the benchmarks compare at that higher boost but I'd imagine it's diminishing returns similar to most GPUs. I think the benefit to using it in a lounge gaming PC would be the super low power draw, but you would need to figure out a display MUX switch and I don't think that's simple with desktop cards. Maybe something with a 5090 mobile would be the go at that point, but I have no idea how that compares to the 395 and whether it's worth it.

[–] brucethemoose@lemmy.world 1 points 2 months ago* (last edited 2 months ago)

Mobile 5090 would be an underclocked, binned desktop 5080, AFAIK:

https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_50_series

In KCD2 (a fantastic CryEngine game, a great benchmark IMO), at QHD, the APU is a hair less than half as fast. For instance, 39 FPS at QHD vs 84 FPS for the mobile 5090:

https://www.notebookcheck.net/Nvidia-GeForce-RTX-5090-Laptop-Benchmarks-and-Specs.934947.0.html

https://www.notebookcheck.net/AMD-Radeon-8060S-Benchmarks-and-Specs.942049.0.html

Synthetic benchmarks between the two

But these are both presumably running at high TDP (150W for the 5090). Also, the mobile 5090 is catastrophically overpriced and inevitably tied to a weaker CPU, whereas the APU is a monster of a CPU. So make of that what you will.

[–] brucethemoose@lemmy.world 2 points 2 months ago* (last edited 2 months ago)

Nah, unfortunately it is only PCIe 4.0 4x. That's a bit slim for a dGPU, especially in the future :(