Assuming said data scraping is a real concern for both Twitter and Reddit, are Fediverse servers at similar risk from scrapers and various automated API hits? I don’t really know enough about networks to answer.
I think the data scraping problem is more about opportunity cost (they think AIs should pay them more to use their content) than about the traffic the scrapers account for. If traffic, and not profit, were the problem, Wikipedia would be saying it can’t support AIs either.
You make a great point about Wikipedia. The claim that scraping is actually why Twitter is doing this is laughable to me. They’re just trying to find a convenient explanation for their failures that doesn’t point back to their own incompetence.
If you were feeling generous, you could grant that scraping Twitter is a problem.
Of course, I’m sure jacking up the API rates had absolutely no effect on that. So either way, the problem was caused by Elon.
The idea that “AI scraping” is any more expensive than search engine indexing is flatly nonsense, only credible to people who have never run any network service at scale.
Folks need to learn about Common Crawl. https://commoncrawl.org/