What has Wall Street in a panic is not just that #DeepSeek was built for a fraction of the cost, but that it can run locally.
Because the money was never in the AI models themselves, but in the computational power and infrastructure required to run those models.
If anyone can run a ChatGPT-comparable model on an M3 MacBook Pro, then where's the business model?
We're back to one-and-done software purchases, and that's horrible news for the corps spending hundreds of billions on infrastructure.
It's never been about the AI models, because that's not where the money is.
The money is in selling companies expensive computational power through bloated bundling services with hard-to-understand egress fees. The money is in shoehorning Internet-tethered chatbots into all of our productivity apps, then increasing our subscription fees by 20%.
The money is in the cloud. Without it, the Made in the USA business model for selling you artificial intelligence simply doesn't work.
I remain convinced it's not even about the money, at the end of the day. Those data centers and all this language crunching are about surveillance and signal jamming; the only way something this useless and unprofitable ever got invested in so heavily is if it's a weapon.
@fromjason A very good explanation of the "don't sell a product, sell a service" business path.
@fromjason At this point one-and-done software applications sound like heaven. Most of us are weary and done with the whole rent-seeking miasma pushed by modern big tech. This business model needs a stake shoved straight through its ugly black heart. Corporate culture has become vampiric.
@fromjason But it makes it much more attractive to play with
@fromjason@mastodon.social Just to be completely fair, while R1 is smaller than ChatGPT (just 35ish% of the size!), it's not MacBook small. I was actually just looking at this myself, which is why I have the numbers off the top of my head.
R1 is ~600B parameters, which translates to about 630GB of VRAM, plus a few more GB to store context. That's in the neighborhood of 20 RTX 5090s. If you're in the datacenter, it's 10 or 11 H100s (or fewer if you get the extra-high-VRAM models).
If you quantize down to 8-bit (90–95% of the performance, about half the size), you're looking at ~11 5090s. If you do a 4-bit quant, that's 80ish percent of the performance at approaching a quarter of the size, or 5–7 5090s. Not quite MacBook, but dang, that's getting somewhere! It makes me cautiously excited for Nvidia's DIGITS, and I'm not used to being excited about Nvidia products.
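In case it helps anyone sanity-check those numbers, here's a minimal sketch that just mechanizes the arithmetic above: take the ~630GB baseline, scale it by the rough quantization factors (8-bit ≈ half the size, 4-bit ≈ a quarter), add a small allowance for context, and divide by per-card VRAM. The card specs (32GB for an RTX 5090, 80GB for a standard H100) are the stock figures, but the 10GB context allowance is my own guess, and ceil() gives a floor on the card count, not a deployment plan:

```python
import math

# Back-of-envelope card counts for running R1 locally, following the
# arithmetic in the post above. Card VRAM figures are the stock specs;
# the context allowance is a rough guess.
BASELINE_WEIGHTS_GB = 630   # ~600B params at native precision, per the post
CONTEXT_GB = 10             # "a few more to store context" -- rough guess

CARDS = {"RTX 5090": 32, "H100 80GB": 80}

# Size multipliers as described above: 8-bit ~ half, 4-bit ~ a quarter.
QUANTS = {"native": 1.0, "8-bit": 0.5, "4-bit": 0.25}

for quant, factor in QUANTS.items():
    total_gb = BASELINE_WEIGHTS_GB * factor + CONTEXT_GB
    needed = {name: math.ceil(total_gb / vram) for name, vram in CARDS.items()}
    per_card = ", ".join(f"{n}x {name}" for name, n in needed.items())
    print(f"{quant:>6}: ~{total_gb:.0f} GB -> {per_card}")
```

That spits out roughly 20 5090s at native precision, ~11 at 8-bit, and ~6 at 4-bit, which lines up with the figures above; real deployments need per-card headroom, so actual counts run a bit higher.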
corpo AI sucks, but local models have always been really cool; lots of fun stuff to play around with!