Linux

48328 readers

502 users here now

From Wikipedia, the free encyclopedia

Linux is a family of open source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991 by Linus Torvalds. Linux is typically packaged in a Linux distribution (or distro for short).

Distributions include the Linux kernel and supporting system software and libraries, many of which are provided by the GNU Project. Many Linux distributions use the word "Linux" in their name, but the Free Software Foundation uses the name GNU/Linux to emphasize the importance of GNU software, causing some controversy.

Rules

Posts must be relevant to operating systems running the Linux kernel. GNU/Linux or otherwise.
No misinformation
No NSFW content
No hate speech, bigotry, etc

Related Communities

Community icon by Alpár-Etele Méder, licensed under CC BY 3.0

founded 5 years ago

MODERATORS

AgreeableLandscape@lemmy.ml

nooter692@lemmy.ml

MarcellusDrum@lemmy.ml

cypherpunks@lemmy.ml

cyclohexane@lemmy.ml

d3Xt3r@lemmy.nz

Alpaca Simplifies Running Advanced AI Language Models on Linux (freedomproject.pages.dev)

submitted 3 months ago by polka_dot_5@lemmy.today to c/linux@lemmy.ml

23 comments fedilink hide all child comments

you are viewing a single comment's thread
view the rest of the comments

[–] thingsiplay@beehaw.org 2 points 3 months ago (1 children)

Wow it got worse for me. Maybe through last update? Is this probably related to he application? Now I get 12 t/s on my CPU and switching to GPU it's only 1.5 t/s. Something is fishy. With Nous hermes 2 Mistral 7B DPO with q4 I get 33 t/s (I believe it was up to 44 before).

Now I'm curious if this will happen with a different application too, but I have nothing else than GPT4All installed.

[–] Onihikage@beehaw.org 2 points 3 months ago (1 children)

Unfortunately I can't even test Llama 3.1 in Alpaca because it refuses to download, showing some error message with the important bits cut off.

That said, the Alpaca download interface seems much more robust, allowing me to select a model and then select any version of it for download, not just apparently picking whatever version it thinks I should use. That's an improvement for sure. On GPT4All I basically have to download the model manually if I want one that's not the default, and when I do that there's a decent chance it doesn't run on GPU.

However, GPT4All allows me to plainly see how I can edit the system prompt and many other parameters the model is run with, and even configure multiple sets of parameters for the same model. That allows me to effectively pre-configure a model in much more creative ways, such as programming it to be a specific character with a specific background and mindset. I can get the Mistral model from earlier to act like anything from a very curt and emotionally neutral virtual intelligence named Jarvis to a grumpy fantasy monster whose behavior is transcribed by a narrator. GPT4All can even present an API endpoint to localhost for other programs to use.

Alpaca seems to have some degree of model customization, but I can't tell how well it compares, probably because I'm not familiar with using ollama and I don't feel like tinkering with it since it doesn't want to use my GPU. The one thing I can see that's better in it is the use of multiple models at the same time; right now GPT4All will unload one model before it loads another.

[–] thingsiplay@beehaw.org 2 points 3 months ago* (last edited 3 months ago)

That's quite unfortunate. ~~Alpaca needs to support those explicitly to work with the new 3.1 128k models; GPT4All was not compatible with it before update either. There was a bug in some library they was using and needed a patch. So maybe that's why you can't use the new Llama 3.1 in Alpaca.~~ (Edit: Never mind. On the webpage they advertise and talk about 3.1 being working, so a wrong guess by me probably.)

Actually that sounds very useful and I missed that option, to be able to select from a set of related models. One thing that GPT4All can also do is, analyzing text files and then using the data to ask questions about it. It will also output the exact lines of the file in relation to the answer. I only experimented a little bit with this, but sounds useful too. The team also experiments and works on a web search using, but no idea how that would work with a local model if ever.