no way, cache and memory is faster than storage???
Imagine 8GB 3D V-cache. That would be glorious.
Latency enters the chat
L4 cache is still much faster than RAM.
It’s not that simple.
Having L4 at all increases latency to actual RAM.
While true, at least for gaming Intel already proved it increases performance.
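To put rough numbers on the L4 trade-off argued above: an extra cache level adds a lookup that every access missing L3 has to pay, so it only wins if it hits often enough. Here is a minimal sketch using the textbook average-memory-access-time model. All latencies and hit rates are made-up illustrative values, not measurements of any real CPU, and `amat` is just a helper name chosen for this example.

```python
def amat(levels, dram_ns):
    """Average memory access time in nanoseconds.

    levels: list of (lookup_latency_ns, hit_rate), closest cache first.
    dram_ns: DRAM latency, paid only by accesses that miss every level.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for latency_ns, hit_rate in levels:
        total += reach * latency_ns   # everyone who reaches a level pays its lookup
        reach *= 1.0 - hit_rate       # only the misses continue outward
    return total + reach * dram_ns


# Hypothetical L1/L2/L3 hierarchy (assumed values for illustration only).
base = [(1, 0.90), (4, 0.70), (12, 0.50)]
DRAM_NS = 80   # assumed DRAM latency
L4_NS = 30     # assumed extra lookup cost of an L4

without_l4 = amat(base, DRAM_NS)
good_l4 = amat(base + [(L4_NS, 0.60)], DRAM_NS)   # L4 hits often
bad_l4 = amat(base + [(L4_NS, 0.20)], DRAM_NS)    # L4 rarely hits

print(f"no L4:            {without_l4:.2f} ns")
print(f"L4, 60% hit rate: {good_l4:.2f} ns")
print(f"L4, 20% hit rate: {bad_l4:.2f} ns")
```

Under this model the L4 breaks even when its hit rate equals its lookup latency divided by DRAM latency (30/80 here), which is one way to read both comments: the added latency is real, and it can still be a net win for workloads that hit the L4 often.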
What use cases would this be good for?
Game Dev in 2047: “why don’t I just decompress all these textures I won’t use for a while here”
What use cases would this be good for?
Yes
PCIe ReBAR currently relies on system RAM to VRAM communication. 3D V-Cache to VRAM via DMA could be made possible without even accessing RAM, which would completely eliminate 50+ ns of RAM access latency (of course, the necessary data needs to already be in 3D V-Cache from system RAM before any of this fancy stuff happens).
Your entire game could be in the next floor up from the CPU cores, instead of in a metaphorical different city.
I seriously doubt that main memory latency and bandwidth are the performance bottleneck for many games. Even for loading it wouldn’t be particularly useful compared to storing the game in RAM, because now you’d be limited by PCIe bandwidth. Maybe with horribly optimized games that do a lot of random reads during load it would help, but that’s pushing it. Now the GPU side, on the other hand, could be interesting.
What. No.
Reminds me of RAM drives, but people mostly moved on from that since SSDs have gotten so incredibly fast and cheap in the past couple of years.
Looking at Star Citizen install size now I just need a 14-socket motherboard
Time to play some old Playstation 1 RPGs with horrendous loading times all entirely stored on the L3 cache.
Imagine Factorio stored entirely in CPU
The Xeon Max CPUs contain 64GB of HBM2e, which can be configured to act as a cache. You could run a lot of games entirely on the HBM!
Xeon isn’t AMD 3D VCache
The 7995WX already has 384 MB L3 cache.
I wouldn’t be surprised if the next-gen Threadripper has 1GB of L3.
Until you notice that the insane loading and save times are built into the engine and no SSD can ever change that.
I’m looking at you, Digimon World 2003.
When I’m playing old games I sometimes wonder how we ever had the patience for it. Couldn’t play them today if it wasn’t for save states.
That is a crime worthy of a chair with a power current flowing.
https://en.wikipedia.org/wiki/Tiny_Core_Linux
There are <100MB Linux distributions. Is it theoretically possible to run an entire operating system without RAM, purely in CPU cache?
That’s exactly what is done during bring up of new SoCs. Memory controllers are either non-functional in early prototypes or a miniature design is put into a bunch of FPGAs with only a single core and caches. The cache lines and TLB entries are primed and pinned with all relevant code and data pages before booting up a kernel.
On coreboot this boot method is called CAR, Cache as RAM, pretty interesting usage of cache to be honest, no need to add separate SRAM if you already have some
The 7995WX has 384 MB L3 cache.
Imagine what you could do with that!
I think this is also what happens at boot on most systems before RAM is initialized, so maybe boot times could be faster if they took advantage of caches getting larger?
Not sure if you meant to point out something else but initramfs or ramdisks are loaded on to RAM itself which is already up and running at that point. RAM initialization is usually initiated by early boot firmware and information about the physical address map is eventually passed on to the OS kernel which later sets up paging (virtual memory).
Xeon Max will boot without any RAM installed at all. Though I’m not sure it counts, considering it has 64GB built into the CPU.
Not only is it possible, people have done it. Dunno if they’ve done it on AMD, but it has been done on some SoCs.
182GB/s, for up to 32MB of data. It’s an interesting study in misusing the tech, but it’s ultimately a bit meaningless.
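For a sense of scale, here is the back-of-the-envelope arithmetic behind those two numbers: how long one full pass over 32MB takes at 182GB/s. The 7GB/s figure for a fast NVMe SSD is an assumed round number for comparison, not something from the article, and `transfer_time_us` is just a helper name for this sketch.

```python
def transfer_time_us(size_bytes, bandwidth_bytes_per_s):
    """Time in microseconds to move size_bytes at a sustained bandwidth."""
    return size_bytes / bandwidth_bytes_per_s * 1e6


SIZE = 32 * 1024**2      # 32 MiB, the test size mentioned above
CACHE_BW = 182e9         # 182 GB/s, the figure quoted above
NVME_BW = 7e9            # assumed sequential read speed of a fast NVMe SSD

cache_us = transfer_time_us(SIZE, CACHE_BW)
nvme_us = transfer_time_us(SIZE, NVME_BW)

print(f"32 MiB at 182 GB/s: {cache_us:.0f} us")
print(f"32 MiB at 7 GB/s:   {nvme_us:.0f} us")
print(f"ratio: {nvme_us / cache_us:.0f}x")
```

So the whole data set streams through in well under a millisecond either way, which is a fair illustration of why the result is fun but not very meaningful at this size.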
What we really need is for someone to modify the ramdisk driver to appear as usb storage and make it so it runs under Vista, so we can use it for ReadyBoost.
Buy a copy of PrimoCache. It’s a great piece of software that adds multi-level cache to Windows.
What we really need is for someone to modify the ramdisk driver to appear as usb storage and make it so it runs under Vista, so we can use it for ReadyBoost.
Use the RAM used as a ramdisk mimicking a disk drive as USB storage for Readyboost which uses a USB drive as…quasi-RAM?
This sounds like a circular way to do what RAM caching is already supposed to do, haha. All modern operating systems do this already. It used to be called Superfetch, but now it’s just commonplace and assumed, as is not dumping things you close out of RAM immediately in case some parts get reused.
Why would it only be 32MB? This is the V-cache, not the L3.
32MB is what they tested in the article.
To clarify a little on what’s happening here: they’re not using the V-Cache as a memory space and making the volume there, as you might create a partition on a conventional disk drive. Rather, they’re accessing the ramdisk in such a way as to trick the system into keeping it in cache. It’s almost completely impractical in real terms, but it’s a fun way to exploit the cache algorithm to get some silly numbers out of it.
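The same effect can be sketched without a ramdisk: hammer a small buffer over and over and it tends to stay resident in cache, while a large buffer keeps getting evicted and has to come from RAM. This is only an analogy to what the article did, not their method. The sizes below are assumptions, the function name is made up for this example, and the actual numbers depend entirely on the machine.

```python
import time


def read_throughput_gbs(working_set_bytes, total_bytes):
    """Copy a buffer of working_set_bytes repeatedly until roughly
    total_bytes have been read, and return the throughput in GB/s."""
    src = bytearray(working_set_bytes)
    dst = bytearray(working_set_bytes)
    passes = max(1, total_bytes // working_set_bytes)
    start = time.perf_counter()
    for _ in range(passes):
        dst[:] = src  # bulk copy that touches every byte of src
    elapsed = max(time.perf_counter() - start, 1e-12)
    return passes * working_set_bytes / elapsed / 1e9


# Assumed sizes: 4 MiB should fit in a large L3, 128 MiB should not.
TOTAL = 512 * 1024**2
small = read_throughput_gbs(4 * 1024**2, TOTAL)
large = read_throughput_gbs(128 * 1024**2, TOTAL)

print(f"4 MiB working set:   {small:.1f} GB/s")
print(f"128 MiB working set: {large:.1f} GB/s")
```

Note that the copy touches both the source and the destination, so the real cache footprint is about twice the working-set size. On a chip with a big L3 the small case would be expected to come out noticeably faster, but nothing guarantees it on any particular system.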
It would be awesome if someone could invent SRAM-speed disks that weren’t volatile. It would be a huge step forward for PCs for many things, and we would stare at CPU/GPU makers as the bottlenecks.
I’m waiting for us to get GB amounts of cache on consumer chips, like what is starting to show up in enterprise/server chips.
That will be useful.
That needs a very big die. Very big for current mobo platforms. But I can see it being tried: making big dies and putting cache at the outer edges of the CPU. And no, V-Cache is vertical cache. It’s stacked on top of the chip itself, under the die cover. Not around it.
Cache surrounding the die is probably better than stacking it on top of the die anyway. Would solve a good few of the current limitations of the X3D CPUs.
That defeats the purpose of a cache. You want a cache to reduce service times, and larger caches take longer to look up, so it’s diminishing returns after a point. They also take A LOT MORE area on the die.
I remember when I tried ramdisking my modded Skyrim. It was the only way to remove cell transition stutter, even though I had a 5950X, 64GB RAM, a 980 Pro and a 3090.
200gb+ v cache when AMD?
I would love to see the results of a 3D chip with a powerful iGPU. Not sure if it would work, but if it is possible, why is AMD not doing it? Would it cannibalise 100-200 EUR GPUs (they are already nonexistent anyway)?
AMD is technically planning that with the MI300, but that’s for enterprise and will cost tens of thousands.
We’re still very far from that. Even mobile phones don’t stack the GPU, they only stack RAM and NAND. RAM and cache are far simpler to stack since they are simple structures by nature, while a GPU is unbelievably complicated compared to those two. Maybe Intel’s tile / AMD’s chiplet system is closer to what we want, but it’s still not as good as stacking.
There is very little demand for a powerful iGPU desktop chip, so the ones that exist are derivatives of laptop chips and thus monolithic. So far there has not been a stacked cache monolithic die chip.
There is very little demand for a powerful iGPU desktop chip
There was little demand. Things change.
It’s easy to say there’s no demand for something that doesn’t exist. Sales are zero.
The laptop-based desktop chips exist. They are literally a thing and have been for a while. Neither AMD nor Intel has seen high demand for them. Also, even if that wasn’t the case, your argument is not really an argument at all, since it can be used to justify literally anything that hasn’t been tried.
I do think this will change quickly if Qualcomm’s ARM chips are as fast as the M2 Max like they claim. And there’s reason to believe it, as they’ve bought/hired Apple’s head of processor development.
Considering the M2 Max GPU is roughly equivalent to a 3080 mobile or a desktop 3060 Ti at significantly better efficiency, I think the demand for monolithic could explode practically overnight.
Assuming some x86 to ARM translation gets most things running.
Maybe Qualcomm would do so in the future, but as things stand now, it’s not the case.
The iGPU in the Snapdragon X Elite is in the same ballpark as the regular M2. Not the Pro or Max variant.
In 3DMark Wild Life Extreme, the X Elite GPU is 50% faster than the Radeon 780M.
https://youtu.be/03eY7BSMc_c?si=HbhQPDt-AN_PP_TS
Still, that’s nowhere near 3080 tier.
Qualcomm still needs to work on their Windows GPU drivers. Currently the only API the X Elite supports is DirectX12.
Some speculate that Qualcomm will eventually create a Windows Vulkan driver for Adreno, and then use DXVK to support older DirectX versions and Zink to support OpenGL.
They already have a Vulkan driver. The 3DMark runs were on Vulkan.
Are you talking about the Snapdragon X Elite? Sure, their mobile chips do have Vulkan drivers.
If you go to the Snapdragon X Elite Product Brief, you can see the only supported API is DX12.
lol, they aren’t powerful, in this universe or any other
An iGPU is not an APU.
Different amounts of stacked cache will be the next SKU differentiator and price gouger? Seems perfect for it.