Frustratingly bad at self hosting. Can someone help me access LLMs on my rig from my phone (lemmy.zip)

submitted 9 months ago by BlackSnack@lemmy.zip to c/selfhosted@lemmy.world

TL;DR

Can someone give me step-by-step instructions (ELI5) on how to access the LLMs on my rig from my phone?

Jan seems the easiest, but I've also tried Ollama, LibreChat, etc.

.....

I've taken steps to secure my data and now I'm going the self-hosting route. I don't care to become a savant with the technical aspects of this stuff, but even the basics are hard to grasp! I've been able to install an LLM provider on my rig (Ollama, LibreChat, Jan, all of 'em) and I can successfully get models running on them. BUT what I would LOVE to do is access the LLMs on my rig from my phone while I'm in proximity. I've read that I can do that via Wi-Fi or LAN or something like that, but I have had absolutely no luck. Jan seems the easiest because all you have to do is something with an API key, but I can't even figure that out.

Any help?

[–] tal@lemmy.today 1 points 9 months ago* (last edited 9 months ago) (1 children)

Ollama does have some features that make it easier to use for a first-time user, including:

Automatically calculating how many layers fit in VRAM, loading that many onto the GPU, and leaving the rest in main memory for the CPU. llama.cpp can't do that automatically yet.
Automatically unloading the model from VRAM after a period of inactivity.

I had an easier time setting up ollama than other stuff, and OP does apparently already have it set up.
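
For what it's worth, once Ollama is running, reaching it from another device on the same network mostly comes down to an HTTP request to the rig. Here is a minimal sketch, not a definitive setup. It assumes Ollama has been told to listen on the LAN (by default it only binds to localhost; setting the `OLLAMA_HOST=0.0.0.0` environment variable before starting it changes that) and that the firewall allows port 11434. The IP address and model name are placeholders I made up, not values from this thread. The `keep_alive` field is the knob behind the "unload after inactivity" behaviour mentioned above.

```python
import json
import urllib.request

# Assumptions (hypothetical, not from the thread): the rig's LAN address and
# a model that has already been pulled. 11434 is Ollama's default port.
RIG_ADDRESS = "192.168.1.50"  # placeholder; use your rig's actual LAN IP
OLLAMA_PORT = 11434


def build_generate_request(prompt, model="llama3.2", keep_alive="5m",
                           host=RIG_ADDRESS, port=OLLAMA_PORT):
    """Build the URL and JSON payload for Ollama's /api/generate endpoint."""
    url = f"http://{host}:{port}/api/generate"
    payload = {
        "model": model,            # placeholder model name
        "prompt": prompt,
        "stream": False,           # ask for one JSON reply, not a token stream
        "keep_alive": keep_alive,  # how long the model stays loaded afterwards
    }
    return url, payload


def ask(prompt, **kwargs):
    """Send the prompt to the rig and return the generated text."""
    url, payload = build_generate_request(prompt, **kwargs)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),  # a body makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

Calling `ask("Why is the sky blue?")` from any machine on the same network should return the model's answer as a string, provided the assumptions above hold. A phone chat app that supports Ollama does essentially the same thing: you give it `http://<rig's LAN IP>:11434` as the server address.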

[–] brucethemoose@lemmy.world 1 points 9 months ago* (last edited 9 months ago)

Yeah. But it also messes stuff up from the llama.cpp baseline, hides or doesn't support some features/optimizations, and definitely doesn't support the more efficient iq_k quants of ik_llama.cpp and its specialized MoE offloading.

And that's not even getting into the various controversies around ollama (like broken GGUFs or indications they're going closed source in some form).

...It just depends on how much performance you want to squeeze out, and how much time you want to spend on the endeavor. Small LLMs are kinda marginal though, so IMO that extra performance is important if you really want to try; otherwise one is probably better off spending a few bucks on an API that doesn't log requests.