A promising increase in context length. We've obviously seen other methods like YaRN and RoPE scaling, but it's nice to see Meta validating some of them and, hopefully, releasing the models themselves!

  • @[email protected]
    9 months ago

    Lol. In 4.1 they mention “the Reddit r/LocalLLaMa community”…

    But I guess the achievements regarding context scaling will be more influential than some of the other news from the past few days.

    Also interesting: their proposed Long Llama notably outperforms the regular one, even on short tasks.