There might actually be nothing bad about the Torment Nexus, and the classic sci-fi novel “Don’t Create The Torment Nexus” might have been nonsense. We shouldn’t be making policy decisions based on it.
Yes, we know (there are papers about it) that for LLMs, every increase in capability requires exponentially more training data. But don’t worry, we’ve only consumed half the world’s data to train LLMs, so there’s still a long way to go ;).
That doesn’t appear to actually be the case, though. LLMs have been improving greatly through the use of a smaller amount of higher-quality data, some of it synthetic data that’s been generated in part by other LLMs. Turns out simply dumping giant piles of random nonsense from the Internet on a neural net doesn’t produce the best results. Do you have references to any of those papers you mention?
You sound very confident of that. Have you tried it?
wild
You’re quoting an out-of-context fragment of a comment I made in a completely different thread in a different community, and adding nothing to it. I have no idea what you’re trying to say here.
Necro-edit: NVIDIA just released an LLM that’s specifically designed to generate training data for other LLMs, as a concrete example of what I’m talking about.