this post was submitted on 27 Jul 2024
338 points (99.7% liked)
Linux
48328 readers
598 users here now
From Wikipedia, the free encyclopedia
Linux is a family of open source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991 by Linus Torvalds. Linux is typically packaged in a Linux distribution (or distro for short).
Distributions include the Linux kernel and supporting system software and libraries, many of which are provided by the GNU Project. Many Linux distributions use the word "Linux" in their name, but the Free Software Foundation uses the name GNU/Linux to emphasize the importance of GNU software, causing some controversy.
Rules
- Posts must be relevant to operating systems running the Linux kernel. GNU/Linux or otherwise.
- No misinformation
- No NSFW content
- No hate speech, bigotry, etc
Related Communities
Community icon by Alpár-Etele Méder, licensed under CC BY 3.0
founded 5 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
CUDA hell remains. :(
So is CUDA good or bad?
I keep reading it's hell, but the best. Apparently it's the single one reason why Nvidia is so big with AI, but it sucks.
What is it?
Both.
The good: CUDA is required for maximum performance and compatibility with machine learning (ML) frameworks and applications. It is a legitimate reason to choose Nvidia, and if you have an Nvidia card you will want to make sure you have CUDA acceleration working for any compatible ML workloads.
The bad: Getting CUDA to actually install and run correctly is a giant pain in the ass for anything but the absolute most basic use case. You will likely need to maintain multiple framework versions, because new ones are not backwards-compatible. You'll need to source custom versions of Python modules compiled against specific versions of CUDA, which opens a whole new circle of Dependency Hell. And you know how everyone and their dog publishes shit with Docker now? Yeah, have fun with that.
That said, AMD's equivalent (ROCm) is just as bad, and AMD is lagging about a full generation behind Nvidia in terms of ML performance.
The easy way is to just use OpenCL. But that's not going to give you the best performance, and it's not going to be compatible with everything out there.
almost sounds like god doesn't want us doing machine learning