It started on consumer hardware, but the models that are supposedly good enough to replace an employee are the ones that took billions to develop.
Can you get something useful on consumer hardware? Probably. Is it world-changing enough to cause developers to lose their jobs? Maybe, but it seems unlikely to me.
OpenAI spent billions. DeepSeek spent six million. A lot of whitepapers are about making that number shrink.
Model size is not a measure of power, either. Current desktop stuff beats last year’s big boys. Small models train faster.
Whether that’s ever enough to let any dipshit program professionally remains to be seen. But I didn’t think we’d get this far.
DeepSeek claims they spent six million. Are they actually accounting for everything they spent, or are they trying to make it seem like they spent less than they really did?
Oh no, what if they only saved two orders of magnitude instead of three? Billions versus six million is roughly a factor of a thousand; even if the real bill were ten times what they claim, it's still a factor of a hundred.
Google knew smaller was better a decade ago. They built AlphaGo on all the human game data they could find, and a supercomputer famously beat Lee Sedol. A year later they did AlphaGo Zero, one-tenth the size, trained only by playing itself, and it reliably beat AlphaGo. A year later they did AlphaZero, one-tenth that size, and it played chess, shogi, and Go at a higher level than the leading engines. A year later, MuZero played Atari 2600 games just by looking at the screen, without even being told the rules.
OpenAI would not be shitting their pants and trying to make DeepSeek R1 illegal if they still thought bigger meant better.