It needs to be 10% slower in FP8 and FP16 theoretical performance.
Basically, reducing the clocks from 2.5 GHz to 2.25 GHz should do the trick.
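The arithmetic checks out: theoretical TFLOPS scale linearly with clock at a fixed shader count, so cutting 2.5 GHz to 2.25 GHz is exactly a 10% reduction. A quick sketch:

```python
# FLOPS scale linearly with clock when shader count is fixed,
# so the clock cut maps one-to-one onto the performance cut.
base_ghz, cut_ghz = 2.5, 2.25
reduction = 1 - cut_ghz / base_ghz
print(f"{reduction:.0%} slower")  # 10% slower
```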
Well, for starters, the memory controllers are smaller on the Series S.
It's also 8 cores vs 4 cores.
The Series S also packs 24 CUs, with 20 enabled for gaming. The Steam Deck has 8 CUs.
For reference, the Series X has 52 of 56 enabled; the PS5 has 36 of 40 enabled.
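Those enabled/total counts come from yield binning: disabling a few CUs lets partially defective dies still ship. A small sketch comparing the ratios, using only the numbers above:

```python
# Enabled vs physical CU counts cited above; binning disables a few
# CUs so partially defective dies can still be sold.
chips = {"Series S": (20, 24), "Series X": (52, 56), "PS5": (36, 40)}
for name, (enabled, total) in chips.items():
    print(f"{name}: {enabled}/{total} CUs enabled ({enabled/total:.0%})")
```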
The battery life is quite horrendous. Bizarre not to offer USB-C charging.
Just web browsing lasted 85 minutes.
The same model with an i9-13950HX and a 4090 did 260 minutes.
As MLID mentioned, RDNA 4 will come out around Q3 2024.
Stopped reading right there
Nobody knows the actual FLOPS of the MI300.
The MI250X had 95.7 TFLOPS of FP32 due to the matrix cores:
https://www.amd.com/en/products/server-accelerators/instinct-mi250x
That's more than the H100, even.
AMD is better at FP32 and FP64.
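Putting the spec-sheet peaks side by side (the MI250X matrix figures are the 95.7 from the AMD page linked above; the H100 numbers are my assumption, taken from memory of Nvidia's SXM datasheet):

```python
# Peak TFLOPS from vendor spec sheets. MI250X matrix numbers are the
# 95.7 cited above; H100 figures assume the SXM variant's datasheet.
peaks_tflops = {
    "MI250X fp32 matrix": 95.7,
    "MI250X fp64 matrix": 95.7,
    "H100 fp32":          67.0,
    "H100 fp64 tensor":   67.0,
}
print(peaks_tflops["MI250X fp32 matrix"] > peaks_tflops["H100 fp32"])  # True
```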
Around 2017, Nvidia and AMD focused on different bets with data centre cards.
AMD went all in on compute with FP32 and FP64.
Nvidia went all in on AI with Tensor cores and FP16 performance.
AMD got faster than Nvidia in some tasks, but Nvidia's bet on AI is the clear winner.
Can you believe some people actually defend it?
You already have situations where the 7520U (Zen 2) is slower than the 5600U (Zen 3).
Soon you will get an R7 9730 slower than an R7 7740,
or a 9520U being slower than a 7540U.
It still does. The 5000 series is still better than the 3000 series.
That's just facts. Also, it's a big difference that in the 5000 series, both the Zen 2 and Zen 3 parts are made on the same 7nm node.
You cannot say the same for the new naming.
7020 and 7030 are on 7nm-class processes (6nm and 7nm), the exact same as Ryzen 5000 and 6000.
7040 is on 5nm-class processes (4nm and 5nm).
They don't belong in the same category at all. The nodes make it quite obvious: they are selling you old, significantly less efficient nodes.
The name is a way to sell old Zen 2/Zen 3 CPUs as brand new.
Why do you think this is only for laptops, and desktops haven't gotten it?
Because the average consumer mostly buys a laptop over a desktop and will have no idea.
Yes.
R7 5800 (8 core) > R7 5700 (8 core)
R5 5600 (6 core) > R5 5500 (6 core)
R3 5400 (4 core) > R3 5300 (4 core)
You quite literally gave the best answer for why the older names were better.
You think the general public will know?
It has always been the first digit, with a higher number representing the latest generation.
Not the third digit.
A 9840H would be slower than a 9750H.
I had to look up that chart to name them. That's how bad it is.
The 8 stands for 2024, btw.
The 4 means Zen 4.
Thank you, AMD, for a totally-not-confusing laptop naming scheme.
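For anyone keeping track, the digits can be decoded mechanically. A hypothetical helper (the function name is made up; the mappings, first digit = model year with 7 meaning 2023, third digit = Zen generation, are the ones described in this thread):

```python
# Hypothetical decoder for AMD's 4-digit mobile model numbers.
# First digit: model year (7 -> 2023, 8 -> 2024); third digit: Zen gen.
def decode_amd_mobile(model: str) -> dict:
    digits = "".join(ch for ch in model if ch.isdigit())
    assert len(digits) == 4, "expects a 4-digit model like 7520U"
    return {
        "model_year": 2016 + int(digits[0]),
        "zen_generation": int(digits[2]),
    }

print(decode_amd_mobile("7520U"))  # Zen 2 sold under a 2023 model year
print(decode_amd_mobile("7540U"))  # Zen 4, same 2023 model year
```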
You do know they have already started it, right?
Adobe, for example, uses Nvidia's AI Foundry.
They have been building these foundries for years now, before AI even got popular and Microsoft jumped on OpenAI.
People also have the misconception that CUDA is the only software advantage.
Their AI Foundry and AI Enterprise are their biggest AI software and support offerings.
Jensen told Satya at Microsoft Ignite that they want to be the TSMC of AI.
Just like CPU/GPU makers use TSMC's foundries to make chips,
companies will use Nvidia's foundries like NeMo, BioNeMo, Picasso, etc. to make AI models.
In addition there is their Omniverse and DGX Cloud.
DGX Cloud even lets them straight up bypass any restrictions and let Chinese customers use Hopper chips remotely.
It has the instruction sets in the compute units.
They are called AI Accelerators for that reason,
not AI cores.
The actual Matrix Cores, i.e. dedicated silicon, are on the Instinct series.
The gap would be even larger if, or to be precise when, FP8 and/or sparsity are used on the Ada Lovelace cards.
You can't compare using two different implementations. You compare only on A1111 or only on SHARK.
SHARK doesn't even seem to be taking any advantage of the 4090, which is why it's significantly slower than the 7900 XTX there.
The recent A1111 Olive branch brought its performance almost equal to the SHARK model. A1111 also fully uses the 4090.
The new results on the same A1111 implementation are here -
You can divide the 4090's perf by half if you want no TensorRT, which gives 35. That's still significantly higher than the 7900 XTX's 23.
The 4070 Ti at 294 mm² (full AD104) has 160 TFLOPS of FP16.
The 7900 XTX's GCD is 300 mm² (full Navi 31, GCD only) with 122 TFLOPS of FP16.
Doubt it's that.
Where there might be a reason is that RDNA doesn't have AI cores; the tasks are accelerated on the shader cores, hence the term AI Accelerators. Now assume the Nvidia cards ignore their Tensor cores:
the 4090 can do only 82.6 TFLOPS of FP16 (non-Tensor),
while the 7900 XTX would still retain its 122 TFLOPS of FP16, making it faster in FP16 performance.
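Both peak numbers fall out of shaders × clock × ops per clock. A sketch with assumed boost clocks (the 2 ops/clock for FP16 on the 4090's CUDA cores without Tensor cores, and 8 ops/clock on the 7900 XTX's dual-issue RDNA 3 shaders, are my assumptions):

```python
# Theoretical peak in TFLOPS: shader count * boost clock (GHz) * ops
# per clock / 1000. Boost clocks here are approximate assumptions.
def peak_tflops(shaders: int, boost_ghz: float, ops_per_clock: int) -> float:
    return shaders * boost_ghz * ops_per_clock / 1000

rtx_4090   = peak_tflops(16384, 2.52, 2)  # FP16 on CUDA cores, no Tensor
rx_7900xtx = peak_tflops(6144, 2.50, 8)   # FP16 dual-issue on RDNA 3
print(f"4090: {rtx_4090:.1f} TFLOPS, 7900 XTX: {rx_7900xtx:.1f} TFLOPS")
```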
Got a source for that "keeping up"?
64-bit is quite literally 2^32 times larger than 32-bit.
There isn't a need to go to 128-bit yet.
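That 2^32 factor is easy to verify, since each extra bit doubles the representable range:

```python
# 64-bit space / 32-bit space = 2^(64-32) = 2^32 times larger.
ratio = 2**64 // 2**32
print(ratio == 2**32)  # True
print(ratio)           # 4294967296
```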