• @mindbleach
    32 months ago

    On the other hand, I do hope that between now and then, some laws will have been put in place to only train on ethically sourced datasets - which will slow down progress, but is more fair to the creators.

    I don’t care what published works a neural network gets trained on. How else are we supposed to make one? We tried all the clever ways and they don’t work.

    Nothing as miserable as copyright should prevent the obviously transformative act of grinding the entire internet into a couple gigabytes of linear algebra. The more stuff goes in, the less any single piece matters. If the robot can reproduce more than a vague resemblance to particular inputs, then that’s a failure called overfitting. A network that can spit out Man Of Steel frame-by-frame won’t be good at much else. We want it to know who Superman is and how capes work. You can’t get that by scanning the same DVD over and over.