• barsoap@lemm.ee
    6 months ago

    Also, you need a supported card. I have a potato going by the name RX 5500, not on the supported list. I have the choice between three rocm versions:

    1. An age-old prebuilt, generally works, occasionally crashes the graphics driver, unrecoverably so… Linux tries to re-initialise everything but that fails, it needs a proper reset. I do need to tell it to pretend I have a different card.
    2. A custom-built one, which I fished out of a docker image I found on the net because I can’t be arsed to build that behemoth. It’s dog-slow, due to using all generic code and no specialised kernels.
    3. A newer prebuilt, any. Works fine for some, or should I say, very few workloads (mostly just BLAS stuff), otherwise it simply hangs. Presumably because they updated the kernels and now they’re using instructions that my card doesn’t have.

    #1 is what I’m actually using. I can deal with a random crash every other day to every other week or so.
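For anyone wondering what "pretend I have a different card" looks like in practice: the usual way is the `HSA_OVERRIDE_GFX_VERSION` environment variable, which the ROCm runtime reads at startup. A minimal sketch, assuming that mechanism; the helper name `rocm_env_with_override` and the value `"10.3.0"` are illustrative only, and the right value depends on the card and the ROCm build.

```python
import os


def rocm_env_with_override(gfx_version="10.3.0", base_env=None):
    """Return a copy of the environment with the GFX override set.

    Assumption: "10.3.0" is only a commonly seen example value, not
    something stated above; pick whatever target your ROCm build ships
    kernels for.
    """
    # Copy so the caller's (or the process's) environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    # The ROCm runtime reads this variable when it initialises, so it has
    # to be set before the GPU-using process starts.
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_version
    return env
```

Pass the result as `env=` to `subprocess.run(...)` when launching the actual workload, or just export the variable in the shell before starting Python. Setting it after the runtime has already initialised is too late.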

    It really would not take much work for them to have a fourth version: one that’s not “supported-supported” but “we’re making sure this thing runs”: current rocm code, using the kernels written for other cards where they happen to work and generic code otherwise.

    Seriously, rocm is making me consider Intel cards. Price/performance is decent, plenty of VRAM (at least for its class), and apparently their API support is actually great. I don’t need cuda or rocm, after all; what I need is pytorch.
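To make the "what I need is pytorch" point concrete, here is a rough sketch of backend-agnostic device selection. It assumes a PyTorch build where ROCm devices show up through `torch.cuda` (which is how the ROCm builds expose them) and where Intel GPUs, if supported, show up as `torch.xpu`; the function name `pick_device` is made up for illustration.

```python
def pick_device(torch):
    """Return a device string for whichever backend this torch build offers.

    `torch` is the imported torch module, passed in as an argument so the
    function itself has no hard dependency on it.
    """
    # ROCm builds of PyTorch report AMD GPUs through the torch.cuda API,
    # so this branch covers both NVIDIA and AMD cards.
    if torch.cuda.is_available():
        return "cuda"
    # Assumption: only newer PyTorch builds have torch.xpu (Intel GPUs),
    # hence the getattr guard instead of a direct attribute access.
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"
    return "cpu"
```

Usage would be something like `import torch; device = pick_device(torch); model.to(device)`. The rest of the code then doesn't care which vendor's stack is underneath.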