to echo others, it’s not what, but how.
cpus do execution reordering and speculation to run one thread really fast. gpus have mostly avoided that and execute threads in large groups called “warps” (analogous to lanes of a SIMD unit).
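to illustrate the lockstep point, here's a toy Python sketch of SIMT-style execution. it's purely illustrative (real NVIDIA warps are 32 threads wide and the masking happens in hardware): every lane runs both sides of a branch, with a mask deciding which result each lane keeps.

```python
# Toy illustration of SIMT execution: a "warp" of lanes runs the same
# instruction in lockstep, and a mask disables lanes on divergent branches.
# (Illustrative only; real warps are 32 threads on NVIDIA hardware.)

def run_warp(xs):
    """Apply 'if x is odd: x * 3, else: x + 1' across all lanes in lockstep."""
    mask = [x % 2 == 1 for x in xs]           # branch condition, per lane
    # The taken path executes with non-taken lanes masked off...
    taken = [x * 3 if m else x for x, m in zip(xs, mask)]
    # ...then the other path runs with the mask inverted.
    return [x if m else x + 1 for x, m in zip(taken, mask)]

print(run_warp([0, 1, 2, 3]))  # both paths execute; the mask picks results
```

the cost of this scheme is divergence: when lanes disagree on a branch, the warp pays for both paths, which is exactly the kind of per-thread cleverness CPUs spend their reorder/speculation hardware avoiding.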
this has been my take, it’s an obvious case of the 80-20 rule. During the times of breakthrough/flux, NVIDIA benefits from having both the research community onboard as well as a full set of functionality and great tooling etc. when things slow back down you’ll see google come out with a new TPU and amazon will have a new graviton etc.
it’s not that hard in principle to staple an accelerator to an ARM core, actually that’s kind of a major marketing point for ARM. And nowadays you’d want an interconnect too. There are a decently large number of companies who can sustain such a thing at reasonably market-competitive prices. So once the market settles, the margins will decline.
On the other hand, if you are building large, training-focused accelerators etc… it is also going to be a case of convergent evolution. In the abstract, we are talking about massively parallel accelerator units with some large memory subsystem to keep them fed, and some type of local command processor to handle the low-level scheduling and latency-hiding. Which, gosh, sounds like a GPGPU.
If you are giving it any degree of general programmability then it just starts to look very much like a GPU. If you aren’t, then you risk falling off the innovation curve the next time someone has a clever idea, just like previous generations of “ASICs”. And you are doing your tooling and infrastructure and debugging all from scratch too, with much less support and resources. GPGPU is turnkey at this stage, do you want your engineers building CUDA or do you want them building your product?
that’s what I said, the memory bandwidth is already baked into the numbers you see. the cache increases mean that you don’t need as much actual memory bandwidth - it’s the same thing AMD did with RDNA2.
AMD reduced the memory bus by 25% on the 6700XT relative to its predecessor and 33% on the 6600XT relative to its predecessor, so, if you think that will cause those cards to age more poorly…
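the mechanism here is simple back-of-envelope math: only traffic that misses the big cache has to be served by DRAM. the numbers below are made up for illustration, not measured figures for RDNA2 or any real card.

```python
# Back-of-envelope: how a large last-level cache cuts the *external*
# memory bandwidth a GPU needs. Hit rates and bandwidth figures below
# are hypothetical, purely for illustration.

def external_bw_needed(total_bw_gbs, cache_hit_rate):
    """Only traffic that misses the cache must be served by DRAM."""
    return total_bw_gbs * (1.0 - cache_hit_rate)

# A workload demanding 1000 GB/s with a 50% cache hit rate only needs
# ~500 GB/s of actual DRAM bandwidth; at 75% hits, ~250 GB/s:
print(external_bw_needed(1000, 0.50))  # 500.0
print(external_bw_needed(1000, 0.75))  # 250.0
```

which is why a narrower memory bus doesn't automatically mean a slower card, as long as the hit rate holds up at the resolutions the card targets.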
4060 Ti is objectively a bad product.
3060 Ti is literally better.
it literally is not, 4060 Ti is 11% faster at 1080p, 9% faster at 1440p, and 6% faster at 2160p.
the reduced memory bandwidth is already baked into these performance figures, and apart from some edge-cases like emulated PS3/wii at 16K resolution the 4060 Ti is still generally a faster card. not that much faster, but, it’s not slower either.
Also no idea why Steve said the 6800XT is faster than the 7800XT? It's clearly not.
i’d prefer to look at the meta-reviews rather than any one reviewer or any one set of games, but, yea, you’re right, 7800XT is ~5% faster than 6800XT, it is factually incorrect to say it’s slower.
I don’t know why everyone seems to have collectively decided that it’s slower, same for the 4060 and 4060 Ti which are both faster than the 3060 and 3060 Ti (respectively) at relevant resolutions. Maybe not as much faster as people would like to see, but they literally perform faster in spite of the memory bandwidth reductions etc.
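for what it's worth, the usual way meta-reviews aggregate these numbers is a geometric mean over the per-resolution (or per-game) ratios. here's a sketch using the 11%/9%/6% figures quoted above for the 4060 Ti vs 3060 Ti:

```python
# Sketch: combining per-resolution uplifts with a geometric mean, the way
# meta-reviews typically aggregate. Input ratios are the 11%/9%/6%
# figures quoted above (1080p / 1440p / 2160p).
import math

def geomean(ratios):
    """Geometric mean of a list of performance ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

uplifts = [1.11, 1.09, 1.06]
print(f"{(geomean(uplifts) - 1) * 100:.1f}% average uplift")  # ~8.6%
```

geomean is preferred over a plain average here because it doesn't let one outlier resolution (or one outlier game) dominate the result.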
ironically COD did the “cold war gone hot” thing not too long ago, lol
I actually think smaller-scale conflicts would be a good fit for battlefield gameplay. the series has eternally struggled to balance aircraft, having jet fighters boom-and-zoom and go repair in the endzone where they’re untouchable is no good, they just don’t couple to the battlefield very well. even tanks/helicopters have risk, but, planes just fly away and go repair. and if you make them weaker then they’re not any good.
(it’s very similar to sniper rifles in the sense that sniper rifles either 1-hit you and then they’re not fun for anyone else, or they require multiple hits and then that’s just not good compared to DMR/etc which allow you to spam shots and achieve generally lower TTKs on average if you assume one or two misses.)
but if you do smaller-scale conflicts, then airplanes can be older slower stuff like harriers or a-6 intruders, or propeller aircraft, and helicopters, etc. if planes can’t just disappear over the battlefield in 5 seconds flat, then that’s more of a chance for people on the ground to actually coordinate against them and gets you away from the “sniper-rifle problem”.
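the sniper-rifle tradeoff above is really just time-to-kill arithmetic. here's a toy model with entirely hypothetical weapon stats, just to show why a spammable DMR can eat a miss or two and still land a competitive TTK:

```python
# Toy time-to-kill model for the sniper-vs-DMR point above.
# All weapon stats are hypothetical, chosen only to show the tradeoff.

def ttk(shots_to_kill, rpm, misses=0):
    """Seconds from the first trigger pull to the killing shot."""
    shots_fired = shots_to_kill + misses
    return (shots_fired - 1) * 60.0 / rpm

# A 2-shot sniper at 50 RPM vs a 4-shot DMR at 300 RPM:
print(ttk(2, 50))             # sniper, no misses: 1.2 s
print(ttk(4, 300, misses=2))  # DMR with two misses: 1.0 s
```

so either the sniper one-shots (no fun for the victim) or the fast-firing gun simply wins on average, which is the balance trap described above.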
we’ve seen some people modding cards to clamshell mode, apparently there is nothing burned into the core itself that determines whether it’s a quadro or a 4090, or whether a 4090 should have 24GB or 48GB, just a resistor array on the PCB itself. So if you resolder it onto a new PCB with twice the RAM chips, you can make it a “4090 48GB”, or even make it into a quadro.
this has been around for a while, I remember people doing this to turn 780s into titans/780 tis into titan black, but normally they weren’t adding more RAM capacity, just trying to get the ECC working and stuff, they’d just mod a couple resistors and boom it reports as quadro.
In some senses you end up with convergent design, it’s not a GPU, it’s just a control system that commands a bunch of accelerator units with a high-bandwidth memory subsystem. But that could be ARM and an accelerator unit etc. Probably need fast networking.
But it’s overall a crazy proposition to me. Like first off goog and amazon are gonna beat you to market on anything that looks good, and you have no real moat other than “I’m sam altman”, and really there’s no market penetration of the thing (or support in execution let alone actual research) etc. Training is a really hard problem to solve because right now it’s absolutely firmly rooted in the CUDA ecosystem. Supposedly there may be a GPU Ocelot thing once again at some point but like, everyone just works with nvidia because they’re the gpgpu ecosystem that matters.
Like, if you wanted to do this, you'd do what Tesla did and have Jim Keller design you a big fancy architecture for training fast at scale (Dojo). I guess they walked away from it or something and just didn't care anymore? Oops.
But, that’s the problem, it’s expensive to stay at the cutting edge. It’s expensive to get the first chip, and you’ll be going against competitors who have the scale to make their own in-house anyway. it’s a crazy business decision to be throwing yourself on the silicon treadmill against intense competition just to give nvidia the finger. wack, hemad.
I was waiting for this to resurface in my recommendations
It’s not just about what you need today, it’s also about what you need in a couple years.
I think this is a real tough argument even in the high-end monitor market. isn’t your $700 or $1200 or $2500 or $3500 going to get you more in 2 years if you wait?
why not wait to see what the monitor market has to offer when nvidia has cards to drive them?
it literally is the ironic mirror image of AMD's tech holding back the consoles. just a funny reversal of fate.
What is different about AMD’s W7000 such that they can offer higher DP standards support than the RX 7000 consumer cards?
there’s barely even any monitors anyway.
it’s like nvidia and the consoles: AMD can do whatever they want but the market penetration isn’t there until nvidia is onboard. Monitors are a low-margin high-volume business and you can’t support an advanced product that tops out at 10% addressable market.
Let alone when that brand’s customers are notoriously “thrifty”…
all of these processors were utterly wiped out by the "spend $100 more on an 8700K and overclock" option.
There is such a thing as false economy, sometimes spending more money results in a thing that lasts long and gives better results throughout that whole timespan… classic “boots theory” stuff.
Having your $2000 build last 2-3 years less than it otherwise could have, because you didn't spend $100 more on the 8700K when you had the option to, is stupid, and not good value. Reviewers over-fixated on the minmaxing; the order of the day was "cut out everything else and shunt it all into your GPU", and some reviewers took it as far as saying you should cut down to a 500W PSU or even less. And today that $100 extra you spent on your GPU is utterly wasted, while going from an 8600K to an 8700K, or buying a PSU that doesn't cut out on transient loads, got you a much better system in the long term, even if it didn't win the benchmarks on day 1.
(and yes, transients were already problematic back then, and in fact have remained pretty constant at around 2x average power draw…)
gcc and blender and openfoam aren’t real world, but cinebench r15 definitely is
it absolutely never was, the real bottom for the 3600 was $160-ish, and then the pandemic and 5000-series hit and prices went up like crazy.
he is thinking of 1600AF for sure
The incremental cost of the 7xxx series is a lot larger.
the incremental cost of not buying RAM now while it’s cheap is going to be a lot larger too.
how long do you really want to be using that shitty 2018-era 16GB kit? are you comfortable riding it for another 2-3 years until RAM prices finally come down again?
people are about to learn a hard lesson about buying things when they are cheap. RAM prices in 2017 were triple what they were in 2016 (and not shitty kits either, by that point it was 3000C15/3200C16 kinda stuff).
it would be crazy to take the foot off the gas in CPU/SOC side while intel is imploding. AMD inherits the x86 market by default. NVIDIA is on top of their game in a product segment that AMD has neglected their investments in for a long time. Yeah, do whatever the consoles want, ship the MVP of that as a dGPU and stick it in your iGPUs.
GA102 to AD102 increased by about 80%, but the jump from AD102 to GB202 is only slightly above 30%.
Maybe GB202 is not the top chip, and the top chip is named GB200.
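compounding the two quoted jumps puts the generational picture in perspective. this just multiplies the rumored percentages above, it doesn't assume anything about actual shader counts:

```python
# Compounding the two generational jumps quoted above (~+80%, then ~+30%).
# These are the rumored figures from the thread, not confirmed specs.
ga102_to_ad102 = 1.80
ad102_to_gb202 = 1.30

total = ga102_to_ad102 * ad102_to_gb202
print(f"GB202 ~= {total:.2f}x GA102")  # roughly 2.3x over two generations
```

so even the "disappointing" 30% jump still lands the new die at well over double the two-generations-back part.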
I mean, you’d expect this die to be called GB102 based on the recent numbering scheme, right? Why jump to 202 right out of the gate? They haven’t done that in the past, AD100 is the compute die and AD102, 103, 104… are the gaming dies. In fact this has been extremely consistent all the way back to Pascal, even when there is a compute uarch variant that is different (and, GP100 is quite different from GP102 etc) it’s still called the 100.
But if there is another die above it, you’d call it GB100 (like Maxwell GM200, or Fermi GF100). Which is obviously already taken, GB100 is the compute die. So you bump the whole numbering series to 200, meaning the top gaming die is GB200.
There is also precedent for calling the biggest gaming die the x110, like GK110 or the Fermi GF110 (in the 500 series). But they haven’t done that in a long time, since Kepler. Probably because it ruins the “bigger number = smaller die” rule of thumb.
Of course it’s possible the 512b rumor was bullshit, or this one is bullshit. But it’s certainly an odd flavor of bullshit - if you were making something up, wouldn’t you make up something that made sense? Odd details like that potentially lend it credibility, because you’d call it GB102 if you were making it up. It will also be easy to corroborate across future rumors, if nobody ever mentions GB200-series chips again, then this was probably just bullshit, and vice versa. Just like Angstronomics and the RDNA3 leak, once he’d nailed the first product the N32/N33 information was highly credible.
Talk about burying the lede in the last segment. Asus isn’t using the official connector and every other vendor thinks their connector is risky and probably defective. That’s not on nvidia, other than allowing it (and this is the reason why they ride partners’ asses sometimes on approval/etc).
The rest of the stuff is Igor still grinding the same old axe (pretty sure Astron knows how to make a connector, if the connector were really so delicate it would have been broken by GN's physical testing, etc) but if asus isn't using the official connector and they're disproportionately making up a huge number of the failures, that's really an asus problem.