- cross-posted to:
- [email protected]
This is another big win for the red team, at least for me. They developed a “fully open” 3B-parameter model family trained from scratch on AMD Instinct™ MI300X GPUs.
AMD is excited to announce Instella, a family of fully open state-of-the-art 3-billion-parameter language models (LMs) […]. Instella models outperform existing fully open models of similar sizes and achieve competitive performance compared to state-of-the-art open-weight models such as Llama-3.2-3B, Gemma-2-2B, and Qwen-2.5-3B […].
As shown in this image (https://rocm.blogs.amd.com/_images/scaling_perf_instruct.png), this model outperforms the other current “fully open” models and comes close to the open-weight-only models.
One more step forward. Thank you, AMD.
PS: I'm not doing AMD propaganda, just thanking them for helping and contributing to the open source world.
CUDA has such a monopoly that - as you point out - AMD’s direct alternative is not well-supported even by AMD. OpenCL should matter, but doesn’t.
How we got here is the culmination of efforts to prevent direct competition through a series of dishonest gimmicks buoyed by pick-a-lesser-word-for-bribery of game devs. The pattern of behavior is more important than any specific act.
How Nvidia’s using this status quo is to seize a trillion-dollar bubble. Are their products several orders of magnitude better? Plainly not. So something’s fucky.
It’s so fucky that AMD tiptoes around any connection to running CUDA on their cards, because of the lawsuit Nvidia would obviously attempt. A lawsuit so complex and expensive it might ruin AMD even if they’re completely right about everything. A lawsuit you see no problem with.
Despite a precedent that recreating APIs is fine, and a precedent that withholding APIs can be monopolistic.