Nvidia loses $500 bn in value as Chinese AI firm jolts tech shares

xiao · 1 month ago

Womble@lemmy.world · edit-2 1 month ago

https://www.analyticsvidhya.com/blog/2024/12/deepseek-v3/

Huh, I guess 6 million USD is not millions, eh? The innovation is that it’s comparatively cheap to train, compared to the billions OpenAI et al. are spending (and that is with the acquisition of thousands of H800s not included in the cost).

Edit: just realised that was for the wrong model! But R1 was trained on the same budget: https://x.com/GavinSBaker/status/1883891311473782995?mx=2

UnderpantsWeevil@lemmy.world · edit-2 1 month ago

The innovation is it’s comparatively cheap to train, compared to the billions

Smaller builds with less comprehensive datasets take less time and money. Again, this doesn’t have to be encyclopedic. You can train your model entirely on a small sample of material detailing historical events in and around Beijing in 1989 if you are exclusively fixated on getting results back about Tiananmen Square.

Womble@lemmy.world · edit-2 1 month ago

Oh, by the way, as to your theory of “maybe it just doesn’t know about Tiananmen, it’s not an encyclopedia”…

Dhs92@programming.dev · 1 month ago

I don’t think I’ve seen that internal dialog before with LLMs. Do you get that with most models when running using ollama?

Womble@lemmy.world · edit-2 1 month ago

No, it’s not a feature of ollama. That’s the innovation of the “chain of thought” models like OpenAI’s o1 and now this DeepSeek model: it narrates an internal dialogue first to try to produce more consistent answers. It isn’t perfect, but it helps with things like logical reasoning, at the cost of taking a lot longer to get to the answer.
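For anyone who wants to separate that internal dialogue from the final answer, here is a minimal sketch. It assumes the raw response wraps the reasoning in `<think>...</think>` tags, which is how R1-style output typically shows up when run locally; the tag name, the helper `split_reasoning`, and the sample string are all illustrative assumptions, not part of ollama’s API.

```python
import re

# Assumption: the reasoning is wrapped in <think>...</think> ahead of the answer.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> tuple[str, str]:
    """Split a raw model response into (reasoning, answer).

    Hypothetical helper: if no <think> block is found, the whole
    response is treated as the answer and the reasoning is empty.
    """
    match = THINK_RE.search(raw)
    if match is None:
        return "", raw.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first <think> block is treated as the answer.
    answer = (raw[:match.start()] + raw[match.end():]).strip()
    return reasoning, answer


# Made-up sample response, only to show the shape of the output.
sample = "<think>\nThe user asks 2+2. That is 4.\n</think>\n\nThe answer is 4."
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

This only does string handling, so it works on any saved response text; a model that formats its reasoning differently would need a different pattern.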

Womble@lemmy.world · 1 month ago

OK, sure. As I said before, I am grateful that they have done this and open-sourced it. But it is still deliberately politically censored, and no, “Just train your own, bro” is not a reasonable reply to that.

Rai@lemmy.dbzer0.com · 1 month ago

They know less than I do about LLMs if that’s something they think you can just DO… and that’s saying a lot.