• GrandDemand@alien.topB
    1 year ago

    This^

    SPR is cheaper than Genoa and Bergamo, and those EPYC chips have not been as readily available as SPR.

    There are advantages to SPR over Zen 4 EPYC in ML/AI workloads. While MI300X will be doing the bulk of the training and inference, some model weights/parameters could be offloaded to the CPU with minimal performance loss in the event the VRAM buffer overflows to system memory. CPU-only inference could also be tested to gauge model performance on weaker hardware, or be used if all MI300X are busy and there are unused CPU cycles (which is likely for these workloads). SPR generally outperforms Genoa in inference, so there is some merit to selecting it over the latter.

    Regardless, though, this decision by Microsoft just boils down to cost and availability.