this post was submitted on 09 Sep 2024
Technology
At the moment I just don't. I got koboldcpp to run through distrobox / boxbuddy but I can't get it to compile with rocm, so I can only use CPU generation, which is abysmally slow. Might go back to NovelAI when they release their new model if I can't find a solution.
What card do you use? I have a 6700XT and getting anything with ROCM running for me requires that I pass the HSA_OVERRIDE_GFX_VERSION=10.3.0 environment variable to the related process, otherwise it just refuses to run properly. I wonder if it might be something similar for you too?
5500 here. I can't use any recent rocm version because the GFX override I use is for a card that apparently has a couple more instructions and the newer kernels instantly crash with an illegal operation exception.
I found a build someone made buried in a docker image and it indeed does work, without override, for the 5500 but it's using all generic code for the kernels and is like 4x slower than the ancient version.
What's ultimately the worst thing about this isn't that AMD isn't supporting all cards for rocm -- it's that the support is all or nothing. There's no "we won't be spending time on this but it passes automated tests so ship it" kind of thing. Instead it's "oh, the new kernels broke that old card? Tough luck, you don't get new kernels".
So in the meantime I'm living with the occasional (every couple of days?) freeze when using rocm because I can't reasonably upgrade. It's not just that the driver crashes and the kernel tries to restart it: the whole card needs a reset before it can do anything but display a VGA console.
Yeah, I definitely am not a fan of how AMD handles rocm - there are so many weird cases of "Well this card should work with rocm, but... [insert some weird quirk that you have to do, like the one I mentioned, or what you've run into]".
Userspace/consumer side I enjoy AMD, but I fully understand why a lot of devs don't make use of rocm and why Nvidia has such a tight hold on things in the GPU compute world with CUDA.
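For anyone landing here wanting to try the HSA_OVERRIDE_GFX_VERSION quirk mentioned earlier in the thread, here is a rough Python sketch of scoping the variable to a single child process instead of exporting it for the whole session. The helper names (env_with_gfx_override, run_with_override) are made up for illustration, the default 10.3.0 is simply the value reported above for the 6700XT, and the command being launched is a stand-in; other cards may need a different value or may not work with an override at all.

```python
import os
import subprocess
import sys


def env_with_gfx_override(version="10.3.0", base=None):
    """Return a copy of the environment with HSA_OVERRIDE_GFX_VERSION set.

    The default value is the one quoted in the thread for a 6700XT;
    it is an assumption for any other card.
    """
    env = dict(os.environ if base is None else base)
    env["HSA_OVERRIDE_GFX_VERSION"] = version
    return env


def run_with_override(cmd, version="10.3.0"):
    """Run cmd with the override visible only to that child process."""
    return subprocess.run(
        cmd,
        env=env_with_gfx_override(version),
        check=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Stand-in child process that just echoes the variable back.
    # Replace this list with the actual command you want to launch.
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ['HSA_OVERRIDE_GFX_VERSION'])",
    ]
    print(run_with_override(child).stdout.strip())
```

Copying os.environ rather than mutating it keeps the override out of the parent process, so other GPU programs started from the same script are unaffected.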