[–] zox@lemmy.world 3 points 3 days ago

That's the solution I take. I use Proxmox for a Windows VM which runs Ollama. That VM can then be used for gaming on the off chance an LLM isn't loaded. It usually is. I use only one 3090 because of the power load of my two servers on top of my [many] HDDs. The extra load of a second card isn't something I want to worry about.

I point to that machine through LiteLLM*, which is then accessed through nginx, which allows only local IPs. Those two are in a different VM that hosts most of my Docker containers.
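
For reference, the nginx part is roughly this kind of local-only restriction. This is a minimal sketch, assuming a typical 192.168.1.0/24 LAN and LiteLLM on its default port 4000; the hostname is made up, not my actual setup:

```nginx
server {
    listen 80;
    server_name llm.lan;                  # hypothetical local hostname

    # Only answer clients on the LAN; drop everything else
    allow 192.168.1.0/24;
    deny  all;

    location / {
        proxy_pass http://127.0.0.1:4000; # LiteLLM proxy (default port)
        proxy_set_header Host $host;
    }
}
```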

*I found that using Ollama and Open WebUI together causes the model to get unloaded, since they send slightly different calls. LiteLLM reduces that variance.
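
To give an idea of the LiteLLM side, a minimal proxy config pointing at the Ollama VM could look like this; the IP, port, and model name are placeholders, not my exact setup:

```yaml
# config.yaml for the LiteLLM proxy (run with: litellm --config config.yaml)
model_list:
  - model_name: llama3                      # name clients request
    litellm_params:
      model: ollama/llama3                  # route to the Ollama backend
      api_base: http://192.168.1.50:11434   # the Windows VM running Ollama
```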

[–] zox@lemmy.world 3 points 1 week ago (1 children)

What alternatives do you use? I ask less for self-hosting and more for my software teams.

We currently use Jira, Confluence, and GitLab (VCS and CI) for mostly AWS development. Any non-Jira/Confluence tools you'd recommend?