Researchers at Apple have come out with a new paper showing that large language models can’t reason — they’re just pattern-matching machines. [arXiv, PDF] This shouldn’t be news to anyone here. We …
When you ask an LLM a reasoning question. You’re not expecting it to think for you, you’re expecting that it has crawled multiple people asking semantically the same question and getting semantically the same answer, from other people, that are now encoded in its vectors.
That’s why you can ask it. because it encodes semantics.
thank you for bravely rushing in and providing yet another counterexample to the “but nobody’s actually stupid enough to think they’re anything more than statistical language generators” talking point
It’s worth pointing out that it does happen to reconstruct information remarkably well considering it’s just likelihood. They’re pretty useful tools like any other, it’s funny ofc to watch silicon valley stumble all over each other chasing the next smartphone.
if it really did so, performance wouldn’t swing up or down when you change syntactic or symbolic elements of problems. the only information encoded is language-statistical
When you ask an LLM a reasoning question. You’re not expecting it to think for you, you’re expecting that it has crawled multiple people asking semantically the same question and getting semantically the same answer, from other people, that are now encoded in its vectors.
That’s why you can ask it. because it encodes semantics.
so… a stochastic parrot?
Rooting around for that Luke Skywalker “every single word in that sentence was wrong” GIF…
thank you for bravely rushing in and providing yet another counterexample to the “but nobody’s actually stupid enough to think they’re anything more than statistical language generators” talking point
Paraphrasing Neil Gaiman, LLMs don’t give you information; they give you information shaped sentences.
They don’t encode semantics. They encode the statistical likelihood that each token will follow a given sequence of tokens.
It’s worth pointing out that it does happen to reconstruct information remarkably well considering it’s just likelihood. They’re pretty useful tools like any other, it’s funny ofc to watch silicon valley stumble all over each other chasing the next smartphone.
“remarkably well” as long as the remark is “this is still garbage!”
did you ask a LLM for a post to make here? that might explain this mess of a comment
if it really did so, performance wouldn’t swing up or down when you change syntactic or symbolic elements of problems. the only information encoded is language-statistical