Nvidia, Apple, and others allegedly trained AI using 173,000 YouTube videos — professional creators frustrated by latest AI training scandal: Report

lemme in@lemm.ee · 4 months ago

mindbleach · 4 months ago

I truly do not understand why anyone gives a shit.

Someone downloaded subtitles from YouTube. Good, frankly. Fuck API TOS. People will save data that’s sent. You can’t serve files to any rando with a browser and pretend they’re a secret. I have used youtube-dl exclusively in lieu of the actual website.

They compiled it for anyone to train models on. “Anyone” included the few giants who already have oodles of data… like Google, the owners of YouTube. And that’s a problem somehow? “However, this idyllic dream of supporting the little guy with The Pile has become another fuel source for major corporations to train AI, rather than DIYers.” You mean in addition to DIYers. It’s still a big open thing for anyone to use.

Am I supposed to be mad because of copyright? I don’t even respect copyright for works of art that cost a billion dollars. I’m not getting excited over audio transcripts of some guy reviewing gizmos.

Models will scan every book in the library, every movie that’s streaming, and every JPG on the internet. No kidding they might scan YouTube videos. Or in this case, the possibly-automated subtitles of YouTube videos.