‘AI is reliant on mass surveillance’ and we should be cautious, warns head of messaging app

doodledup@lemmy.world · 8 months ago

Possibly linux@lemmy.zip · 8 months ago

This isn’t entirely true. AI is usually trained on public data such as Wikipedia.

AI is a tool. How you use it is what matters.

Wave@lemmy.ml · 8 months ago

deleted by creator

Possibly linux@lemmy.zip · 8 months ago

I self host so I don’t care

31337 · edit-2 8 months ago

It’s also trained on data people reasonably expected would be private (private GitHub repos, Adobe Creative Cloud, etc.). Even if it were just public data, it can still be dangerous. For example, it could be possible to give an LLM a prompt like, “give me a list of climate activists, their addresses, and their employers” if it was trained on this data or was good at “browsing” on its own. That’s currently not possible due to the guardrails on most models, and I’m guessing they try to avoid training on personal data that’s public, but a government agency could make an LLM without these guardrails. That data could be public, but it would take a person quite a bit of work to track down, compared to the ease and efficiency of just asking an LLM.

Possibly linux@lemmy.zip · 8 months ago

What you are describing is highly specific to a particular AI model.

Kilgore Trout@feddit.it · edit-2 8 months ago

Wikipedia requires attribution, which AI scrapers never give.

It is “public” work, but under a license.

Possibly linux@lemmy.zip · 8 months ago

Still public data

‘AI is reliant on mass surveillance’ and we should be cautious, warns head of messaging app | 7.30