Scientific workloads often involve very large datasets. It might be high-resolution data captured from various sensors, or it might be more “normal” data but in huge quantities. Assuming the data itself is high quality, larger datasets generally support more accurate conclusions.
While I’ve never given Reddit a penny, it was totally different back then. In those days, the site was much smaller, and buying gold got you r/lounge access and supported the site. It felt more community-oriented, and they weren’t aggressively monetizing the service. Nowadays it’s like paying for Facebook or Twitter: absolutely not.
The fediverse is an excellent place to find training data for AIs. I would just set up a bot that follows a bunch of people and let them send their data to me; then I wouldn’t even need to bother with scraping.
That’s true for all commercial development. No company wants to invest more than they have to. Upstreaming does save time in the long run, but not in the short term.
I don’t think this is as much of a problem; proprietary hardware is a thing on x86 too. The two big problems are a lack of boot standardization and vendors not upstreaming their device drivers. The lack of standardization makes it difficult or impossible to use a single image to boot across different devices, and the lack of upstream drivers means that even if you solved the boot process, you wouldn’t be able to interface with peripherals without a very custom kernel.
I can understand why a project might want to do this until the law is fully implemented and tested in court, but I can tell most of the people in this thread haven’t actually figured out how to use LLMs productively. They’re not about to replace software engineers, but as a software engineer, tools like GitHub Copilot and ChatGPT are excellent at speeding up a workflow. ChatGPT, for example, is an excellent search engine that can give you a quick understanding of a topic, and it’ll generate small amounts of code more quickly than I could write it by hand. Of course I’m still going to review that code to ensure it’s up to the same quality as hand-written code, but overall it’s still a much faster process.
The luddites who hate on LLMs would have complained about the first compilers too, because they could write marginally faster assembly by hand.
Llama 2 70B can run on a specced-out current-gen MacBook Pro. Not cheap hardware by any means, but it isn’t a large data center cluster.
In this case I was referring to bandwidth and latency, which on-package memory helps with. It does make a difference in memory-intensive applications, but most people would never notice. Also, Apple will absolutely give you a ton of memory; you just have to pay for it. They offer 128GB on the MacBook Pro, and it’s unified, so the GPU has full access to it, which makes it surprisingly good for running LLMs locally, for example.
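As a rough illustration of what that unified memory buys you, here’s a minimal sketch using the llama-cpp-python bindings. The model path and parameter values are placeholders, not a recommendation; `n_gpu_layers=-1` asks the Metal backend to offload every layer to the GPU, which only works comfortably because the GPU shares the same pool of RAM as the CPU.

```python
# Minimal sketch: running a quantized model locally on Apple silicon
# via llama-cpp-python. The model path is a placeholder; a quantized
# 70B GGUF still needs on the order of 40+ GB of (unified) memory.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-2-70b.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload all layers to the GPU (Metal backend)
    n_ctx=4096,       # context window size
)

out = llm("Explain unified memory in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```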
You misunderstand. This is protectionism, plain and simple. US car companies are horribly inefficient. Worse yet, the US car cartel eliminated most of their budget models to push trucks and SUVs that are more expensive. It doesn’t take much to undercut them, so the US government is banning the competition.
To their credit, Safari’s extension support on iOS is reasonably good. Not Firefox-level good, but compared to Chrome it’s excellent!
Which parts of Firefox are proprietary?
It’s hard to say. I think it was clear they planned to use ads and gold to break even, but it took them many years to start monetizing aggressively. Once new Reddit and the app came around, and they started making noise about an IPO, it became obvious.