
Any of you have a self-hosted AI "hub"? (e.g. for LLM, stable-diffusion, ...) (lemmy.ml)

submitted 2 years ago by robber@lemmy.ml to c/selfhosted@lemmy.world


I've been looking into self-hosting LLMs or Stable Diffusion models using something like LocalAI and/or Ollama and LibreChat.

Some questions to get a nice discussion going:

Any of you have experience with this?
What are your motivations?
What are you using in terms of hardware?
Considerations regarding energy efficiency and associated costs?
What about renting a GPU? Privacy implications?


[–] Greg@lemmy.ca 7 points 2 years ago (1 children)

I've installed Ollama on my gaming rig (RTX 4090 with 128GB RAM), M3 MacBook Pro, and M2 MacBook Air. I'm running Open WebUI on my server, and it can connect to multiple Ollama instances. Open WebUI has its own Ollama-compatible API, which I use for projects. I only boot up my gaming rig when I need larger models; otherwise the M3 MacBook Pro can handle most tasks.
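For anyone who wants to script against a setup like this, here is a minimal sketch of routing a prompt to one of several Ollama instances. It talks to Ollama's own `/api/generate` endpoint on the default port 11434 rather than going through Open WebUI. The host names, the `pick_host` helper, the 14B threshold, and the model name are all assumptions made up for illustration, not part of the setup described above.

```python
import json
import urllib.request

# Hypothetical host names (assumption); 11434 is Ollama's default port.
OLLAMA_HOSTS = {
    "macbook": "http://macbook.local:11434",
    "gaming-rig": "http://gaming-rig.local:11434",
}


def pick_host(model_params_b: float, threshold_b: float = 14.0) -> str:
    """Send larger models to the gaming rig, everything else to the laptop.

    The 14B threshold is an arbitrary illustrative cut-off (assumption).
    """
    name = "gaming-rig" if model_params_b > threshold_b else "macbook"
    return OLLAMA_HOSTS[name]


def build_generate_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text (needs a reachable host)."""
    req = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the instances reachable, something like `generate(pick_host(7), "llama3", "Hello")` would go to the laptop, while a 70B model would be routed to the gaming rig. The model has to be pulled on the chosen host first.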

[–] JackGreenEarth@lemm.ee 3 points 2 years ago (1 children)

Is that 128GB of VRAM? Because normal RAM doesn't matter unless you want to run the model on the CPU, which is much slower.

[–] Greg@lemmy.ca 2 points 2 years ago (1 children)

That's 128GB of system RAM; the GPU has 24GB of VRAM. Ollama has gotten pretty smart with resource allocation. Smaller models can fit solely in my VRAM, but I can still run larger models using RAM.
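To give a feel for why some models fit in 24GB of VRAM and others spill into RAM, here is a back-of-the-envelope sketch. It only estimates the size of the weights from parameter count and quantization width. The 2GB reserve for context and other overhead is a guess, and none of this is how Ollama actually makes its decision. Treat it as rough arithmetic under those assumptions.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def split_across_memory(weights: float, vram_gb: float, reserve_gb: float = 2.0):
    """Return (GB placed in VRAM, GB left over for system RAM).

    reserve_gb is an assumed allowance for context and other overhead,
    not a measured figure.
    """
    usable = max(vram_gb - reserve_gb, 0.0)
    in_vram = min(weights, usable)
    return in_vram, weights - in_vram


# A 7B model at 4 bits per weight is about 3.5 GB and fits in 24 GB of VRAM.
print(split_across_memory(weights_gb(7, 4), vram_gb=24))   # (3.5, 0.0)

# A 70B model at 4 bits per weight is about 35 GB, so part of it lands in RAM.
print(split_across_memory(weights_gb(70, 4), vram_gb=24))  # (22.0, 13.0)
```

Whatever ends up in system RAM runs on the CPU, which is the slower path mentioned earlier in the thread.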

[–] JackGreenEarth@lemm.ee 1 points 2 years ago

Any tips on how to get Stable Diffusion to do that? I'm running it through Krita's AI Image Generation plugin with 6GB VRAM and 16GB RAM. The VRAM is quite limiting if I want to inpaint larger images, and I keep getting 'out of VRAM' errors. How do I make it switch to RAM when VRAM is full? Or with Jan, for that matter: how can I get it to use RAM and VRAM together so I can run models larger than 7B?