cross-posted from: https://lemmy.intai.tech/post/72919

Parameters count:

GPT-4 is more than 10x the size of GPT-3. We believe it has a total of ~1.8 trillion parameters across 120 layers. Mixture Of Experts - Confirmed.

OpenAI was able to keep costs reasonable by utilizing a mixture of experts (MoE) model. They utilizes 16 experts within their model, each is about ~111B parameters for MLP. 2 of these experts are routed to per forward pass.

Related Article: https://lemmy.intai.tech/post/72922

  • damnYouSun
    link
    fedilink
    English
    arrow-up
    2
    ·
    1 year ago

    Wasn’t there some newspapers that said that it would take one million years before humans could fly about two weeks before the Flyer?

    Hell, we have gone from hunter gatherers to a technologically advanced society in less time than that. The moral of the story being journalists are idiots and should be ignored.