There's a difference between 'language' and 'intelligence', and conflating the two is why so many people think LLMs are intelligent when they aren't.
The thing is, you can't train an LLM on math textbooks and expect it to understand math, because it isn't reading or comprehending anything. An LLM doesn't know that 2+2=4 because it's doing math in the background. It has learned that when presented with the string 2+2=, statistically, the next token should be 4. It can construct a paragraph similar to a math textbook around that equation that does a decent job of explaining the concept, but only through a statistical analysis of sentence structure and vocabulary choice.
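To make the "statistically, the next one should be 4" point concrete, here's a toy sketch. It is far cruder than a real LLM (a plain lookup table of counts, over characters rather than tokens, with a made-up corpus and hypothetical function names), so treat it as an illustration of the idea and not as how any actual model works:

```python
from collections import Counter, defaultdict

def train(corpus: str, context_len: int = 4) -> dict:
    """Count which character follows each context of `context_len` characters."""
    counts = defaultdict(Counter)
    for i in range(len(corpus) - context_len):
        context = corpus[i:i + context_len]
        counts[context][corpus[i + context_len]] += 1
    return counts

def predict(counts: dict, prompt: str, context_len: int = 4):
    """Return the most frequent follower of the prompt's last characters, or None."""
    followers = counts.get(prompt[-context_len:])
    if not followers:
        return None  # context never seen, so nothing to go on
    return followers.most_common(1)[0][0]

# Made-up training text, purely for illustration.
corpus = "2+2=4. 1+1=2. 2+2=4. 3+3=6. 2+2=4. 2+3=5."
model = train(corpus)

print(predict(model, "2+2="))  # 4
print(predict(model, "7+8="))  # None
```

It answers 2+2= correctly without any arithmetic happening anywhere, and it has nothing to say about 7+8= because that string never showed up in the text it counted.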
It's why LLMs are so downright awful at legal work.
If 'AI' were actually intelligent, you should be able to feed it a few sets of textbooks and all the case law since the US was founded, and it should be able to talk about legal precedent. But LLMs constantly hallucinate when trying to cite cases, because the LLM doesn't actually understand the information it's trained on. It just builds a statistical model of what legal writing looks like and tries to mimic it. Same for code.
People think LLMs are 'intelligent' because they seem like they're talking to us, and we've equated 'ability to talk' with 'ability to understand'. And until now, that's been a safe thing to assume.
You'd get even more savings using something like bees, because it does block-level deduplication.
What bees does is build a hash table of every block on your SSD and compare them. If it finds any matches, it deletes one and places a pointer to the other where the deleted one was, the pointer being much smaller than the duplicate data block.
Functionally, any installed games with shared assets get space savings. It's particularly helpful with Steam games because of all the Proton prefixes. Lots of opportunities for finding duplicate data blocks.
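If it helps to picture the hash table part, here's a rough in-memory sketch of block-level deduplication. This is not bees' code (bees works at the filesystem level, and a real tool would confirm matching blocks byte for byte before sharing them). The 4096-byte block size, the function names, and the two fake "games" are all assumptions made up for the example:

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only

def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut a byte string into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def dedupe(blocks: list):
    """Keep one copy of each distinct block; every position stores just a hash reference."""
    store = {}  # hash -> the single stored copy of that block
    refs = []   # one small reference per original block
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            store[digest] = block
        refs.append(digest)
    return store, refs

# Two fake "games" that share three asset blocks and differ in one.
shared = b"".join(bytes([i]) * BLOCK_SIZE for i in range(3))
game_one = shared + b"\xaa" * BLOCK_SIZE
game_two = shared + b"\xbb" * BLOCK_SIZE

blocks = split_blocks(game_one) + split_blocks(game_two)
store, refs = dedupe(blocks)

print(len(blocks), len(store))  # 8 5
```

Eight blocks go in and only five need to be stored, since the three shared ones are kept once and referenced twice. The original data is still fully recoverable by following the references in order.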
If you use snapshots, it can save even more.