• Soyweiser@awful.systems
    English
    5 months ago

    Yes, we know (there are papers about it) that for LLMs, every increase in capability requires exponentially more training data. But don’t worry, we’ve only consumed half the world’s data to train LLMs, so there are still a lot of places to go ;).

    • FaceDeer@fedia.io
      edit-2
      5 months ago

      That doesn’t appear to actually be the case, though. LLMs have been improving greatly through the use of a smaller amount of higher-quality data, some of it synthetic data that’s been generated in part by other LLMs. Turns out simply dumping giant piles of random nonsense from the Internet on a neural net doesn’t produce the best results. Do you have references to any of those papers you mention?

      Necro-edit: NVIDIA just released an LLM that’s specifically designed to generate training data for other LLMs, as a concrete example of what I’m talking about.