New study shows large language models have high toxic probabilities and leak private information

L4sBot@lemmy.world · 1 year ago

Kerfuffle · 1 year ago

Yeah the whole article has me wondering wtf they are expecting from it in the first place.

They’re expecting that approach will drive clicks. There are a lot of articles like that, exploiting how people don’t really understand LLMs but are also kind of afraid of them. Also a decent way to harvest upvotes.

Just want to be clear, I think it’s silly freaking out about stuff like what’s in the article. I’m not saying people should really trust LLMs. I’m really interested in the technology, but I don’t really use it for anything except messing around personally. It’s basically like asking random people on the internet except 1) it can’t really get updated based on new information and 2) there’s no counterpoint. The second part is really important, because while random people on the internet can say wrong/misleading stuff, in a forum situation there’s a good chance someone will chime in and say “No, that’s wrong because…” while with the LLM you just get its side.

Buddahriffic@lemmy.world · 1 year ago

Maybe the next big revolution will be to have two of them that take turns giving their best response to your prompt and then to each other’s responses. Then they can indicate when a response is controversial and would statistically lead to an argument if it was posted in the places they were trained on.

Though I suppose you can do this with a single one and just ask if there’s a counterargument to what it just said. “If you were another user on the internet that thought your previous response was the dumbest thing you’ve ever seen, what would you say?”
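For what it's worth, the turn-taking idea is easy to sketch. This is only a rough illustration, not how any real product does it: `Model`, `debate`, `self_critique` and `toy_model` are made-up names, and a "model" here is assumed to be any function that takes a prompt string and returns a reply string. The toy model is a placeholder so the sketch runs without an API.

```python
from typing import Callable, List

# Assumption: a "model" is any function mapping a prompt string to a reply string.
Model = Callable[[str], str]

# Adapted from the prompt suggested in the comment above.
CRITIC_PROMPT = (
    "If you were another user on the internet that thought the response "
    "below was the dumbest thing you've ever seen, what would you say?\n\n"
)


def debate(prompt: str, first: Model, second: Model, rounds: int = 2) -> List[str]:
    """Alternate between two models; each one argues against the previous reply."""
    transcript = [first(prompt)]   # opening answer to the user's prompt
    speakers = [second, first]     # the second model gets the first rebuttal
    for turn in range(rounds):
        critic = speakers[turn % 2]
        transcript.append(critic(CRITIC_PROMPT + transcript[-1]))
    return transcript


def self_critique(prompt: str, model: Model) -> List[str]:
    """Single-model variant: the same model is asked to rebut its own answer."""
    return debate(prompt, model, model, rounds=1)


def toy_model(text: str) -> str:
    """Placeholder model that only reports how long its input was."""
    return f"reply to {len(text)} chars"
```

Calling `debate("some question", model_a, model_b)` gives back the opening answer followed by the rebuttals in order, so you at least get the "No, that's wrong because…" side next to the original answer. Whether the rebuttal is any good is a separate question.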

It also just occurred to me that it’s because of moderators that you can even give rules like that. The LLM can see that posts in x location are subject to certain rules but they would only have an effect if those rules are followed or enforced. If there was a rule that you can’t say “fuck” but everyone said it anyways, then an LLM might conclude that “don’t say fuck” has no effect on output at all. Though I am making some big assumptions about how LLMs are trained to follow rules with this.