This is another big win for the red team, at least in my view. AMD developed a “fully open” 3B-parameter model family trained from scratch on AMD Instinct™ MI300X GPUs.

AMD is excited to announce Instella, a family of fully open state-of-the-art 3-billion-parameter language models (LMs) […]. Instella models outperform existing fully open models of similar sizes and achieve competitive performance compared to state-of-the-art open-weight models such as Llama-3.2-3B, Gemma-2-2B, and Qwen-2.5-3B […].

As shown in this image (https://rocm.blogs.amd.com/_images/scaling_perf_instruct.png), the model outperforms the other current “fully open” models and comes close to open-weight-only models.

A step forward. Thank you, AMD.

PS: I'm not doing AMD propaganda, but I do thank them for helping and contributing to the open-source world.

  • Zarxrax@lemmy.world · 13 hours ago

    And we are still waiting on the day when these models can actually be run on AMD GPUs without jumping through hoops.

    • foremanguy@lemmy.ml (OP) · 5 hours ago

      That is an improvement: if the model is properly trained with ROCm, it should be easier to run on AMD GPUs.

    • grue@lemmy.world · 10 hours ago

      In other words, waiting for the day when antitrust law is properly applied against Nvidia’s monopolization of CUDA.