• brucethemoose@lemmy.world · 4 days ago

    In AI land, programmability has proven to be king so far.

    Nvidia GPUs are so dominant because everything is prototyped, then deployed, in PyTorch. I think Google TPUs are a good counterexample: a big entity throws tons of money at the problem, and even releases some models optimized for its hardware, yet Flax and the TPUs themselves gain very little traction and remain incompatible with the new architectures that come out every other day, because no one has bothered to port them.
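
    To make the programmability point concrete, here is a minimal sketch (the module and its gating tweak are made up for illustration, not a real architecture): an idea written in PyTorch runs on an Nvidia GPU as-is, while the same idea on TPUs or FPGAs would sit around waiting for someone to port it.

    ```python
    # Toy "new architecture" block prototyped in PyTorch, purely illustrative.
    # The gating tweak is invented for this sketch; the point is that it runs
    # unchanged on CUDA GPUs with no porting or hardware-specific work.
    import torch
    import torch.nn as nn

    class GatedAttentionBlock(nn.Module):
        def __init__(self, dim: int, heads: int = 8):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.gate = nn.Linear(dim, dim)  # the "novel" tweak in this sketch

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            attn_out, _ = self.attn(x, x, x)
            return x + torch.sigmoid(self.gate(x)) * attn_out

    device = "cuda" if torch.cuda.is_available() else "cpu"
    block = GatedAttentionBlock(64).to(device)
    out = block(torch.randn(2, 16, 64, device=device))  # no porting step needed
    print(out.shape)  # torch.Size([2, 16, 64])
    ```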

    FPGAs take this problem and make it an order of magnitude worse. Very few people know how to port to and optimize for them… so they don’t.

    • j4k3@lemmy.world · 4 days ago

      I really wish I had saved the reference to the guy from Altera explaining why FPGAs simply will not work for these models. I don’t have a ton of interest in FPGAs in general… It may have been on Lex Fridman’s podcast.

      An FPGA can ultimately be configured as anything, and they obviously work for smaller stuff, but there is some specific reason why they do not scale to current models. It might have been the way rotations are done in transformers, or something like that. The person was an author on papers I have seen and skimmed on arXiv. Names and details outside of my curiosity don’t stick in my memory; my thinking is abstract and functional.
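
      If the detail really was about rotations, it was plausibly the rotary position embeddings (RoPE) used by most current transformers; that is a guess on my part, not something from the interview. A reference sketch of the rotation itself, just to show what the operation looks like:

      ```python
      # Sketch of rotary position embeddings (RoPE), the "rotation" most
      # current transformers apply to queries/keys. Reference math only;
      # this says nothing about how well FPGAs handle it.
      import torch

      def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
          # x: (batch, seq_len, dim) with dim even; rotate paired halves
          # by a position-dependent angle.
          _, seq_len, dim = x.shape
          half = dim // 2
          freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
          angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
          cos, sin = angles.cos(), angles.sin()          # (seq_len, half)
          x1, x2 = x[..., :half], x[..., half:]
          return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

      q = torch.randn(1, 8, 64)
      print(rope(q).shape)  # torch.Size([1, 8, 64])
      ```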