If you use natural text to train model A, and then use model A’s output, a, to train model B, then model B’s output will be less good than model A’s output. The quality degenerates with each generation, but the it happens over generations of models. So, random data is worse than AI slop, because random data is already of the lowest possible quality for AI training.
If you use natural text to train model A, and then use model A’s output, a, to train model B, then model B’s output will be less good than model A’s output. The quality degenerates with each generation, but the it happens over generations of models. So, random data is worse than AI slop, because random data is already of the lowest possible quality for AI training.
Yes, but random data might be easier to detect in the first place, and could then be filtered.