General consensus seems to be that Llama 4 was a flop. The head of Meta's AI research division was let go.
Do you think it was a bad fp32 conversion, or just underwhelming models all around?
2T parameters was a big increase without much gain. If throwing compute and parameters at the problem isn't enough to stay competitive anymore, how do you think the next big performance gains will be made? Better CoT reasoning patterns? Omnimodal models? Something entirely new?
Here's a link to the Gemma 3 QAT model on Ollama: https://ollama.com/eramax/gemma-3-27b-it-qat:q4_0
There are GGUFs around if you want to try it on another backend.
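In case it saves anyone a step: once it's pulled (`ollama pull eramax/gemma-3-27b-it-qat:q4_0`), here's a minimal Python sketch using the `ollama` client (`pip install ollama`), assuming the Ollama server is running locally with its defaults. The prompt is just a placeholder:

```python
import ollama

# Model tag from the link above; assumes it has already been pulled.
MODEL = "eramax/gemma-3-27b-it-qat:q4_0"

# One-shot chat request against the local Ollama server.
response = ollama.chat(
    model=MODEL,
    messages=[
        {"role": "user", "content": "What does QAT buy you over post-training quantization?"}
    ],
)
print(response["message"]["content"])
```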
Thanks. I’ll try it out!