CEO Steve Huffman says tech giants should not be able to trawl Reddit’s huge store of data for free. But that information came from users, not the company

That “corpus of data” is the content posted by millions of Reddit users over the decades. It is a fascinating and valuable record of what they were thinking and obsessing about. Not the tiniest fraction of it was created by Huffman, his fellow executives or shareholders. It can only be seen as belonging to them because of whatever skewed “consent” agreement its credulous users felt obliged to click on before they could use the service.

Ouch

  • Sens@feddit.uk
    2 years ago

    This was my thought as well. I actually don’t mind OpenAI trawling my content to train their models; I’m benefiting from their end product in so many ways already. The internet was always public, and no one asked for stupid CEOs to step in and stop that. How is it OK for Google’s web crawlers, but not OpenAI’s? Also, it’s not like I can monetise my posts and comments on my own anyway.

    The whole excuse of locking down the API because of AI model scraping was poor; it should be a decision for the Reddit community.

    Starting to wonder if Reddit are going to train their own AI models or have already started.

    Also, that journalist from The Guardian, if you go to the linked website, looks like an older John Oliver or John Oliver’s dad 😂