The new $3000 NVIDIA Digits reportedly has 128 GB of fast RAM in an Apple-M4-like unified-memory configuration. NVIDIA claims it is twice as fast as an Apple stack, at least at inference. Four of these stacked can run a 405B model, again according to NVIDIA.
In my case I want the graphics power of a GPU and VRAM for other purposes as well, so I’d rather buy a graphics card. But regarding a 90B model, I do wonder if it is possible with two A6000s at 64 GB and a 3-bit quant.
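A rough back-of-the-envelope check for that question, as a sketch only. It counts weight memory as parameters times bits per weight, then adds a flat 20% for KV cache and activations. That overhead figure is my assumption, not a measured number, and the 4-bit quant for the 405B case is also an assumption (the claim above does not state one). The function names are made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal GB."""
    return n_params * bits_per_weight / 8 / 1e9


def fits(n_params: float, bits_per_weight: float, total_memory_gb: float,
         overhead_fraction: float = 0.2) -> bool:
    """True if weights plus an ASSUMED flat overhead fit in total_memory_gb."""
    needed = weight_memory_gb(n_params, bits_per_weight) * (1 + overhead_fraction)
    return needed <= total_memory_gb


# 90B model, 3-bit quant, 64 GB total (the figures from the post above)
w90 = weight_memory_gb(90e9, 3)
fits_90b = fits(90e9, 3, 64)

# 405B model on four 128 GB boxes; the 4-bit quant is an assumption
w405 = weight_memory_gb(405e9, 4)
fits_405b = fits(405e9, 4, 4 * 128)

print(f"90B @ 3-bit: {w90:.2f} GB of weights, fits in 64 GB: {fits_90b}")
print(f"405B @ 4-bit: {w405:.2f} GB of weights, fits in 512 GB: {fits_405b}")
```

By this estimate the weights of a 90B model at 3 bits come to roughly 34 GB, so capacity alone should not be the blocker. Long contexts grow the KV cache well past a flat 20%, so treat this as a lower bound.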
Huh, so basically sidestepping the GPU issue entirely and essentially just using some other special piece of silicon with fast (but conventional) RAM. I still don't understand why you can't distribute a large LLM over many different processors, each holding a section of the parameters in memory.
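For what it's worth, holding a section of the parameters on each processor is a real technique, usually called pipeline (layer-wise) parallelism. The usual catch is the speed of the link between devices, not whether the split is possible. Below is a minimal sketch of just the placement step: consecutive layers are assigned to devices in order until each one is full. The function name, the layer sizes, and the device capacities are all hypothetical, and real frameworks also account for activations and KV cache.

```python
def assign_layers(layer_sizes_gb, device_capacity_gb):
    """Place consecutive layers on devices in order.

    Moves to the next device when the current one cannot hold the next
    layer. Returns one list of layer indices per device.
    """
    placement = [[] for _ in device_capacity_gb]
    device = 0
    used = 0.0
    for idx, size in enumerate(layer_sizes_gb):
        # Skip forward until a device with enough free room is found.
        while (device < len(device_capacity_gb)
               and used + size > device_capacity_gb[device]):
            device += 1
            used = 0.0
        if device == len(device_capacity_gb):
            raise ValueError("model does not fit on the given devices")
        placement[device].append(idx)
        used += size
    return placement


# Hypothetical example: 8 layers of 10 GB each, three devices with 32 GB each
plan = assign_layers([10] * 8, [32, 32, 32])
for dev, layers in enumerate(plan):
    print(f"device {dev}: layers {layers}")
```

At inference time each device runs its own layers and passes the activations on to the next one, so only a small tensor crosses the link per token rather than the weights themselves.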
Not exactly. Digits still uses a Blackwell GPU, only it uses unified RAM as virtual VRAM instead of actual VRAM. The GPU is probably a downclocked Blackwell. Speculation I’ve seen is that these are defective, repurposed Blackwells, which is good for us. By defective I mean they can’t run at full speed, are projected to have the cracking-die problem, and so on.