this post was submitted on 23 May 2024
Technology
Most 7B-8B models run just fine with 4-bit quantization and won't use more than 4 or 5 GB of VRAM.
The only metric that really matters is the amount of VRAM, since the model must be loaded into VRAM for fast inference.
You could use the CPU and system RAM instead, but it is painfully slow.
If you've got an Apple Silicon Mac, it could be even simpler.
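For anyone wondering where the 4 or 5 GB figure comes from, here is a rough back-of-the-envelope sketch. It only counts the weights (parameters times bits per weight), and the function name is made up for illustration. Real 4-bit formats usually store a little more than 4 bits per weight, and the context cache and runtime add overhead on top, so treat the result as a lower bound rather than an exact number.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Hypothetical helper: ignores KV cache, activations and runtime overhead.
    """
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


for size in (7, 8):
    print(f"{size}B @ 4-bit: ~{weight_memory_gb(size, 4):.1f} GB for weights")
# 7B @ 4-bit: ~3.5 GB for weights
# 8B @ 4-bit: ~4.0 GB for weights
```

Add some headroom for context and overhead and you land in the 4 to 5 GB range mentioned above.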
I have an Intel Celeron Mobile laptop with an iGPU and, I think, 256 MB of VRAM. How many Bs does that get me for the LLM?
~~Only half-joking. That's my still-functional old daily driver, now serving as a homelab.~~
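Taking the question at face value, the same arithmetic can be run backwards: given a VRAM budget, how many parameters would fit at a given quantization? This is only a sketch with a made-up helper name, and it assumes every byte of VRAM is available for weights, which is never true in practice.

```python
def max_params_billions(vram_mb: float, bits_per_weight: float) -> float:
    """Upper bound on parameter count (in billions) that fits in vram_mb.

    Hypothetical helper: assumes all VRAM holds weights, with no overhead.
    """
    bytes_available = vram_mb * 1024 * 1024
    return bytes_available * 8 / bits_per_weight / 1e9


print(f"256 MB @ 4-bit: ~{max_params_billions(256, 4):.2f}B parameters at best")
# 256 MB @ 4-bit: ~0.54B parameters at best
```

So roughly half a "B" in the best case, and less once anything else needs that memory.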
Well, I've got good news and bad news.
The bad news is you won't do shit with that, my dear friend.
The good news is that you won't need it because the duck is back.