this post was submitted on 25 Apr 2026
24 points (80.0% liked)

Selfhosted

58781 readers
314 users here now

A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.

Rules:

  1. Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.

  2. No spam posting.

  3. Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.

  4. Don't duplicate the full text of your blog or github here. Just post the link for folks to click.

  5. Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).

  6. No trolling.

  7. No low-effort posts. This is subjective and will largely be determined by the community member reports.

Resources:

Any issues on the community? Report it using the report flag.

Questions? DM the mods!

founded 2 years ago
MODERATORS
 

I'm looking to build a low-end ollama LLM server to improve home assistant voice control, Immich image recognition and a few other services. With the current cost of hardware components like memory, I'm looking to build something small, but somewhat expandable.

I have an old micro-atx form factor computer that I'm thinking will be a good option to upgrade. I'd love recommendations on motherboards, processors, and video card combos that would likely be compatible and sufficient to run a decent server while keeping costs lower, basically, the best bang for the buck. I have a couple of M.2 SSDs I can re-purpose. Would prefer the motherboard has 2.5Gbit Ethernet, but otherwise I'm open.

Also recommendations on sites to purchase good quality memory at reasonable prices that ship to the US. I'd be willing to look at lightly used components, too.

Any advice on any of these topics would be greatly appreciated. The advice I've found has all been out of date especially with crypto fading so video cards are not as expensive, but LLM data centers eating up and reserving memory before it's even manufactured.

you are viewing a single comment's thread
view the rest of the comments
[–] vegetaaaaaaa@lemmy.world 1 points 3 hours ago* (last edited 3 hours ago)

I suggest using llama.cpp instead of ollama, you can easily squeeze +10% in inference speed and other memory optimizations from llama.cpp. With hardware prices nowadays I think every % saved on resources matters. Here is a simple ansible role to setup llama.cpp, it should give you a good idea of how to deploy it.

A dedicated inference rig is not gonna be cheap. What I did, since I need a gaming rig; is getting 32GB DDR5 (this was before the current RAMpocalypse, if I had known I would have bought 64) and an AMD 9070 (16GB VRAM - again if I had known how crazy prices would get I'd probably ahve bought a 24GB VRAM card). The home server runs the usual/non-AI stuff, and llamacpp runs on the gaming desktop (the home server just has a proxy to it). Yeah the gaming desktop has to be powered up when I want to run inference, this is my main desktop so it's powered on most of the time, no big deal