Promising increase in context length. Obviously we’ve seen other methods like YaRN and RoPE scaling, but it’s nice to see Meta validating some of these methods, and hopefully they’ll release the models themselves!
Lol. In 4.1 they mention “the Reddit r/LocalLLaMa community”…
But I guess the achievements regarding context scaling will be more influential than some of the other news of recent days.
Also interesting: their proposed Long Llama notably outperforms the usual one, even on short tasks.