• From-UoM@alien.topB
    1 year ago

    Surely it can’t be due to AI performance?

    The 7900 XT is 103 TFLOPS of FP16, the 7900 XTX is 122.

    The 4070 is at 117 TFLOPS of FP16 (234 using sparsity) on a smaller chip, and that’s not banned.

    • meshreplacer@alien.topB
      1 year ago

      Apparently AMD significantly outperforms Nvidia in specific calculations used by nuclear weapons simulation software.

      • GomaEspumaRegional@alien.topB
        1 year ago

        AMD has traditionally had very competitive FLOPs with their shaders. The issue is that their software stack is, for lack of a better word, shit.

        Specific customers, like national labs or research institutions, can afford to pay a bunch of poor bastards to develop some of the compute kernels using the shitty tools. At the end of the day, most of their expenses are electricity and hardware, and salaries are not the critical cost for some of these projects. I.e. grad students are cheap!

        However, when it comes to industry, things are a bit different. First off, nobody is going to take a risk with a platform that has little momentum behind it. They also need access to a talent pool that can develop the applications and get them up and running as soon as possible. Under those scenarios, salaries (i.e. the people developing the tools) tend to be almost as important a consideration as the HW. So you go with the vendor that gives you the biggest bang for your buck in terms of performance and time to market. And that is where CUDA wins hands down.

        At this point AMD is just too far behind, at least to get significant traction in industry.

      • From-UoM@alien.topB
        1 year ago

        AMD is better at FP32 and FP64.

        Around 2017, Nvidia and AMD focused on different areas with their data centre cards.

        AMD went in on compute with FP32 and FP64.

        Nvidia went all in on AI with Tensor cores and FP16 performance.

        AMD got faster than Nvidia in some tasks, but Nvidia’s bet on AI is the clear winner.

    • RedTuesdayMusic@alien.topB
      1 year ago

      The only thing I’m aware of where the 7900 XTX / W7900 beats the 4090 is RAW video debayering in DaVinci Resolve.

    • upbeatchief@alien.topB
      1 year ago

      I know that the XTX kept up with the 4090 in Stable Diffusion before the TensorRT update, so there might be some places where the XTX can be a replacement if you build software from the ground up and are willing to lose performance in exchange for fewer eyes and less hassle on AMD products.

    • virtualmnemonic@alien.topB
      1 year ago

      RDNA3 is lacking in AI performance today, but there’s no real reason to believe it cannot compete if given billions for software development. The specs are there, but the software (in comparison to NVIDIA) is in a laughable state. For now.

      • From-UoM@alien.topB
        1 year ago

        People also have the misconception that CUDA is the only software advantage.

        Their AI foundries and AI Enterprise are their biggest AI software and support offerings.

        Jensen at Microsoft Ignite told Satya that they want to be the TSMC of AI.

        Just like CPU/GPU makers use TSMC foundries to make chips,

        companies will use Nvidia foundries like NeMo, BioNeMo, Picasso, etc. to make AI models.

        In addition there is their Omniverse and DGX Cloud.

        DGX Cloud even allows them to straight up bypass any restrictions and let Chinese customers use Hopper chips remotely.

        • dotjzzz@alien.topB
          1 year ago

          “told Satya that they want to be the TSMC of AI.”

          That’s just a pipedream. They are peacocking and it’s obviously failing since Microsoft shat directly in their face with Maia.

          Nvidia’s ecosystem advantages will only diminish over the years since Microsoft, Google and Amazon etc will develop their own.

          This is Glide vs Direct3D all over again. You know which one won.

          • From-UoM@alien.topB
            1 year ago

            You do know they have already started it, right?

            Adobe for example uses Nvidia Foundry for their AI Foundry.

            They have been building these foundries for years now, even before AI got popular and Microsoft jumped on OpenAI.

    • dine-and-dasha@alien.topB
      1 year ago

      The law only restricted raw FLOPs, so it has to be that. But the law has a chiplet subclause, so there might be some interaction there that pushes the AMD GPUs over the edge.

      • From-UoM@alien.topB
        1 year ago

        The 4070 Ti is 294 mm^2 (full AD104) with 160 TFLOPS of FP16.

        The 7900 XTX GCD is 300 mm^2 (full Navi 31, GCD only) with 122 TFLOPS of FP16.

        Doubt it’s that.

        Where there might be a reason is that RDNA doesn’t have AI cores. The tasks are accelerated on the shader cores, hence the term AI Accelerators. Now assume the tensor cores on Nvidia cards are ignored.

        The 4090 can do only 82.6 TFLOPS of FP16 (non-Tensor).

        The 7900 XTX would still retain its 122 TFLOPS of FP16, making it faster in FP16 performance.
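
The comparison in this comment can be sketched as a few lines of arithmetic. This is only an illustration using the figures quoted in the comment itself (they are not independently verified here), and the names `chips`, `tflops_per_mm2`, and `non_tensor_ratio` are hypothetical.

```python
# Figures as quoted in the comment: dense FP16 TFLOPS and die area in mm^2.
# The 7900 XTX area is the GCD only, as the comment states.
chips = {
    "RTX 4070 Ti (AD104)": {"fp16_tflops": 160.0, "area_mm2": 294.0},
    "RX 7900 XTX (Navi 31 GCD)": {"fp16_tflops": 122.0, "area_mm2": 300.0},
}


def tflops_per_mm2(fp16_tflops: float, area_mm2: float) -> float:
    """Quoted FP16 throughput divided by quoted die area."""
    return fp16_tflops / area_mm2


density = {name: tflops_per_mm2(**spec) for name, spec in chips.items()}
for name, value in density.items():
    print(f"{name}: {value:.3f} FP16 TFLOPS per mm^2")

# Second half of the argument: ignore the tensor cores on the 4090 and
# compare shader-only FP16 throughput, again using the quoted figures.
rtx_4090_non_tensor_tflops = 82.6
rx_7900_xtx_tflops = 122.0
non_tensor_ratio = rx_7900_xtx_tflops / rtx_4090_non_tensor_tflops
print(f"7900 XTX vs 4090 (non-tensor FP16): {non_tensor_ratio:.2f}x")
```

On these numbers the 4070 Ti is the denser chip, which is why die-area density looks like an unlikely explanation, while the 7900 XTX comes out ahead only once tensor cores are left out of the comparison.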

        • TwanToni@alien.topB
          1 year ago

          Doesn’t RDNA3 have WMMA, or Wave Matrix Multiply Accumulate, which is their AI cores?

          • From-UoM@alien.topB
            1 year ago

            It has the instruction set support in the compute units.

            They are called AI Accelerators for that reason.

            Not AI cores.

            The actual Matrix “Cores”, i.e. dedicated silicon, are on the Instinct series.

          • dotjzzz@alien.topB
            1 year ago

            No. Tensor cores have separate specialised matrix ALUs; AMD’s WMMA are instructions on the existing shader ALUs.

            Tensor cores can process AI tasks in parallel to CUDA cores, RDNA3 can’t do both on the same CU.

        • Qesa@alien.topB
          1 year ago

          The actual rule has hard numbers, no need to speculate. And it’s no more than 300 TFLOPS of fp16 (or 150 fp32, 600 fp8, etc) so it ain’t TFLOPS that are the culprit. As for performance density, it’s equivalent to those figures at an 830mm^2 die, so again not that.
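
The equivalence quoted above (300 TFLOPS at fp16, 150 at fp32, 600 at fp8) works out to the same value when throughput is multiplied by bit width. A minimal sketch of that check follows, assuming the bit-weighted reading implied by those three figures and using only the per-card TFLOPS numbers quoted elsewhere in this thread. The names `bit_weighted`, `LIMIT`, and `cards` are hypothetical, and this is not a restatement of the rule text.

```python
def bit_weighted(tflops: float, bits: int) -> float:
    """Throughput multiplied by operand width, so precisions can be compared."""
    return tflops * bits


# The three equivalent ceilings quoted in the comment collapse to one number.
LIMIT = bit_weighted(300, 16)
assert bit_weighted(150, 32) == LIMIT
assert bit_weighted(600, 8) == LIMIT

# Dense FP16 TFLOPS as quoted by other commenters in this thread.
cards = {
    "RX 7900 XT": 103.0,
    "RX 7900 XTX": 122.0,
    "RTX 4070": 117.0,
    "RTX 4070 Ti": 160.0,
}

# Fraction of the ceiling each card reaches at FP16.
share_of_limit = {
    name: bit_weighted(tflops, 16) / LIMIT for name, tflops in cards.items()
}
over_limit = [name for name, share in share_of_limit.items() if share >= 1.0]

for name, share in share_of_limit.items():
    print(f"{name}: {share:.0%} of the quoted ceiling")
print("Cards at or above the ceiling:", over_limit)
```

With the quoted figures, none of the listed cards gets close to the ceiling, which matches the point that raw TFLOPS is not the culprit.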

          • dine-and-dasha@alien.topB
            1 year ago

            OK, I didn’t know the actual numbers, that’s helpful. Maybe they’re just holding off to apply for an export license? I heard the 4090 is in a “gray area”.

            • f3n2x@alien.topB
              1 year ago

              No gray area, at base clocks the 4090 exceeds the limit by 10% already.