frezik

joined 1 year ago
[–] frezik@midwest.social 1 points 6 months ago* (last edited 6 months ago)

Improving the models doesn't seem to work: https://arxiv.org/abs/2404.04125

We comprehensively investigate this question across 34 models and five standard pretraining datasets (CC-3M, CC-12M, YFCC-15M, LAION-400M, LAION-Aesthetics), generating over 300GB of data artifacts. We consistently find that, far from exhibiting "zero-shot" generalization, multimodal models require exponentially more data to achieve linear improvements in downstream "zero-shot" performance, following a sample inefficient log-linear scaling trend.

It's taking exponentially more data to get better results, and therefore, exponentially more energy. Even if something like analog training chips reduced energy usage tenfold, the exponential curve would just catch up again, and quickly, with only marginally improved results. Not only that, but you have to gather that much more data, and while the Internet is a vast datastore, the AI models have already absorbed much of it.
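To illustrate the paper's point, here is a toy sketch of a log-linear scaling law: downstream accuracy grows roughly linearly in the log of dataset size, so each fixed accuracy increment costs a constant multiplicative factor more data. The intercept and slope below are made-up numbers for illustration, not fit to the paper's results.

```python
import math

# Hypothetical coefficients for a log-linear trend: acc = a + b * log10(n).
a, b = 0.10, 0.05

def accuracy(n_samples):
    """Toy log-linear scaling: accuracy as a function of dataset size."""
    return a + b * math.log10(n_samples)

# Each 10x jump in data buys only the same fixed accuracy increment,
# i.e. linear gains require exponential data.
for n in (10**6, 10**7, 10**8):
    print(f"{n:>12,d} samples -> accuracy {accuracy(n):.2f}")
```

Under this model, going from 0.40 to 0.45 takes 10x the data, and 0.45 to 0.50 takes 10x again — which is why a one-time tenfold efficiency gain in hardware only delays the curve rather than escaping it.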

The implication is that the models are about as good as they will be without more fundamental breakthroughs. The thing about breakthroughs like that is that they could happen tomorrow, they could happen in 10 years, they could happen in 1000 years, or they could happen never.

Fermat's Last Theorem remained an open problem for 358 years. Squaring the Circle remained open for over 2000 years. The Riemann Hypothesis has remained unsolved after more than 150 years. These things sometimes sit there for a long, long time, and not for lack of smart people trying to solve them.

[–] frezik@midwest.social 2 points 6 months ago (1 children)

Do you know if the model is running locally or some cloud shit? If locally, the actual energy usage may be modest.

Energy spent training the model initially may have been prohibitive, though.

[–] frezik@midwest.social 2 points 6 months ago

I didn't specify either way, and I'm sorry for assuming.

[–] frezik@midwest.social 2 points 6 months ago (2 children)

Now try it in freezing weather, and account for 70 miles in both directions. And you don't want to actually use the entire range, but rather sit in the 10-80% marks. No, none of them could.

I'll criticize Tesla for actual reasons.

[–] frezik@midwest.social 3 points 6 months ago

Probably, yes.

My wife goes to work and back in a Mini EV, which has around 110 mi of range. Basically a BMW i3 drivetrain dropped into the chassis of a Cooper S. It's not suitable for road trips in the US. If L3 charging were a little more reliable, you could almost do it, but it would still suck and I wouldn't choose to do it except in a pinch.

[–] frezik@midwest.social 3 points 6 months ago* (last edited 6 months ago) (4 children)

My wife needs to make a ~70 mile trip about once a week to help their mom, often in freezing temperatures. An EV reasonably capable of that didn't really exist outside of Tesla until the last few years.

Please stuff the "kill all the children in the crosswalk" nonsense. It doesn't help anything. Until the Cybertruck, Tesla didn't even offer anything like that.

[–] frezik@midwest.social 6 points 6 months ago* (last edited 6 months ago) (2 children)

None of those had close to the range of the Model X in 2015. Having less than 200 mi of range makes things difficult. Doubly so because the charging infrastructure wasn't there (and barely is now). The infrastructure that did exist was put there by Tesla.

Though with proper charging infrastructure, having more than 400 mi of range isn't really necessary, and is almost silly.

[–] frezik@midwest.social 2 points 6 months ago (2 children)

It's not a solution by itself, but a library economy can form part of it: https://www.youtube.com/watch?v=NOYa3YzVtyk

[–] frezik@midwest.social 5 points 6 months ago (4 children)

OK, let's just get rid of cars altogether, then.

[–] frezik@midwest.social 20 points 6 months ago* (last edited 6 months ago) (1 children)

Perhaps they'd like to roll back all the times we've bailed out the auto industry. We don't want the government to be choosing winners and losers, after all.

[–] frezik@midwest.social 6 points 6 months ago

Let me guess, hardened steel? Because that's how you keep your bike from getting stolen in New York. Kryptonite calls it the "Fahgettaboudit" lock for a reason.

[–] frezik@midwest.social 2 points 6 months ago

If you want to get started in machine learning on the cheap and want something faster than CPU training, a 1080 Ti goes for $120 or so on eBay.
