I have many conversations with people about Large Language Models like ChatGPT and Copilot. The idea that “it makes convincing sentences, but it doesn’t know what it’s talking about” is a difficult concept to convey or wrap your head around. Because the sentences are so convincing.
Any good examples on how to explain this in simple terms?
Edit: some good answers already! I find especially that the emotional barrier is difficult to break. If an AI says something malicious, our brain immediately jumps to “it has intent”. How can we explain this away?
It’s a really well-trained parrot. It responds to what you say, and then it responds to what it hears itself say.
But despite knowing which sounds go together based on which sounds it heard, it doesn’t actually speak English.
I am an LLM researcher at MIT, and hopefully this will help.
As others have answered, LLMs have only learned the ability to autocomplete given some input, known as the prompt. Functionally, the model is strictly predicting the probability of the next word+, called tokens, with some randomness injected so the output isn’t exactly the same for any given prompt.
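To make the "probability plus some randomness" part concrete, here is a minimal sketch. The candidate words and their scores are made up for illustration; a real model would compute scores over its whole vocabulary. The `temperature` knob is the usual name for how much randomness gets injected.

```python
import math
import random

def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(candidates, scores, temperature=1.0, rng=random):
    """Pick one candidate at random, weighted by its probability."""
    probs = softmax(scores, temperature)
    return rng.choices(candidates, weights=probs, k=1)[0]

# Hypothetical scores for continuing "The cat sat on the"
candidates = ["mat", "sofa", "moon"]
scores = [3.0, 2.0, 0.1]

probs = softmax(scores)
rng = random.Random(0)  # seeded so the run is repeatable
picks = [sample_next_token(candidates, scores, rng=rng) for _ in range(1000)]
print({word: picks.count(word) for word in candidates})
```

The likely word wins most of the time, but not every time, which is why the same prompt can give different answers.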
The probability of the next word comes from what was in the model’s training data, in combination with a very complex mathematical method to compute the impact of all previous words with every other previous word and with the new predicted word, called self-attention, but you can think of this like a computed relatedness factor.
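A rough sketch of that relatedness factor, assuming each word is already a small vector of numbers. The 2-d vectors are invented, and real models first pass the vectors through learned projections (queries, keys, values), which this toy skips.

```python
import math

def softmax(xs):
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def causal_self_attention(vectors):
    """For each position, weight itself and all earlier positions by similarity."""
    dim = len(vectors[0])
    all_weights, outputs = [], []
    for i, query in enumerate(vectors):
        visible = vectors[: i + 1]  # a word never looks at later words
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in visible
        ]
        weights = softmax(scores)  # the "relatedness" of this word to each visible word
        mixed = [sum(w * v[d] for w, v in zip(weights, visible)) for d in range(dim)]
        all_weights.append(weights)
        outputs.append(mixed)
    return all_weights, outputs

# Made-up vectors for three words; the third points almost the same way as the first
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
weights, outputs = causal_self_attention(vectors)
print(weights[2])
```

The third word ends up weighting the first word more heavily than the second, purely because their numbers line up.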
This relatedness factor is very computationally expensive and grows quadratically with the number of words, so models are limited by how many previous words can be used to compute relatedness. This limitation is called the Context Window. The recent breakthroughs in LLMs come from the use of very large context windows to learn the relationships of as many words as possible.
This process of predicting the next word is repeated iteratively until a special stop token is generated, which tells the model to stop generating more words. So literally, the model builds entire responses one word at a time from left to right.
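A quick count of how fast the work piles up, assuming plain self-attention where every token is compared with every token. The function name is just for illustration.

```python
def pairwise_comparisons(context_length):
    """Every token is scored against every token, so the count is n * n."""
    return context_length * context_length

for n in (1_000, 2_000, 4_000, 8_000):
    print(f"{n:>6} tokens -> {pairwise_comparisons(n):>12,} comparisons")
```

Doubling the context length quadruples the number of comparisons.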
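The loop itself is short. Below is a sketch where a tiny lookup table stands in for the trained model. `TOY_MODEL`, `predict_next` and the `<stop>` marker are all hypothetical names, and this toy only looks at the last word, where a real model would use the whole context.

```python
STOP = "<stop>"

# Hypothetical stand-in for a trained model: last token -> next token
TOY_MODEL = {
    "the": "cat",
    "cat": "sat",
    "sat": "down",
    "down": STOP,
}

def predict_next(tokens):
    """Receive everything written so far, return one more token."""
    return TOY_MODEL.get(tokens[-1], STOP)

def generate(prompt_tokens, max_tokens=20):
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):
        next_token = predict_next(tokens)
        if next_token == STOP:  # the stop token ends the response
            break
        tokens.append(next_token)  # the new token becomes part of the input
    return tokens

result = generate(["the"])
print(" ".join(result))
```

Note that nothing is ever revised: each token is appended and then treated as given for every later step.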
Because all future words are predicated on the previously stated words in either the prompt or subsequent generated words, it becomes impossible to apply even the most basic logical concepts, unless all the components required are present in the prompt or have somehow serendipitously been stated by the model in its generated response.
This is also why LLMs tend to work better when you ask them to work out all the steps of a problem instead of jumping to a conclusion, and why the best models tend to rely on extremely verbose answers to give you the simple piece of information you were looking for.
From this fundamental understanding, hopefully you can now reason about the LLM’s limitations in factual understanding as well. For instance, if a given fact was never mentioned in the training data, or an answer simply doesn’t exist, the model will make it up, inferring the next most likely word to create a plausible sounding statement. Essentially, the model has been faking language understanding so much that even when the model has no factual basis for an answer, it can easily trick an unwitting human into believing the answer to be correct.
—-
+ More specifically, these words are tokens, which usually contain some smaller part of a word. For instance, “understand” and “able” would be represented as two tokens that when put together would become the word “understandable”.
I think that a good starting place to explain the concept to people would be to describe a Travesty Generator. I remember playing with one of those back in the 1980s. If you fed it a snippet of Shakespeare, what it churned out sounded remarkably like Shakespeare, even if it created brand “new” words.
The results were goofy, but fun because it still almost made sense.
The most disappointing source text I ever put in was TS Eliot. The output was just about as much rubbish as the original text.
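For the curious, a travesty generator fits in a few lines. This is a sketch of the general idea (a character-level lookup of "what followed these letters in the source"), not a reconstruction of any particular 1980s program; the function names and the `order` setting are my own choices.

```python
import random
from collections import defaultdict

def build_table(text, order=3):
    """Record which character followed each run of `order` characters."""
    table = defaultdict(list)
    for i in range(len(text) - order):
        table[text[i:i + order]].append(text[i + order])
    return table

def travesty(text, order=3, length=80, seed=None):
    rng = random.Random(seed)
    table = build_table(text, order)
    out = text[:order]
    while len(out) < length:
        followers = table.get(out[-order:])
        if not followers:  # dead end: these letters only occur at the very end
            break
        out += rng.choice(followers)
    return out

source = "to be or not to be that is the question whether tis nobler in the mind to suffer"
sample = travesty(source, order=3, length=60, seed=1)
print(sample)
```

Every four-letter stretch of the output really did occur in the source, yet the whole can wander into words nobody wrote. There is no grammar or meaning anywhere in the program.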
As some nerd playing with various AI models at home with no formal training, any wisdom you think is worth sharing?
The only winning move is not to play.
But my therapist said she needs more VRam.
It’s your phone’s ‘predictive text’, but if it were trained on the internet.
It can guess what the next word should be a lot of the time, but it’s also easy for it to go off the rails.
Harry Frankfurt’s influential 2005 book (based on his influential 1986 essay), On Bullshit, offered a description of what bullshit is.
When we say a speaker tells the truth, that speaker says something true that they know is true.
When we say a speaker tells a lie, that speaker says something false that they know is false.
But bullshit is when the speaker says something to persuade, not caring whether the underlying statement is true or false. The goal is to persuade the listener of that underlying fact.
The current generation of AI chat bots are basically optimized for bullshit. The underlying algorithms reward the models for sounding convincing, not necessarily for being right.
idea that “it makes convincing sentences, but it doesn’t know what it’s talking about”
Like a teenager who has come into a new group and is now trying so hard to fit in :-)
I think we forget this, it’s just doing what people do
I commented something similar on another post, but this is exactly why I find this phenomenon so hard to describe.
A teenager in a new group still has some understanding and has a mind. It knows the meaning of many of the words that are said. Sure, some catchphrases might be new, but general topics shouldn’t be too hard to follow.
This is nothing like genAI. GenAI doesn’t know anything at all. It has (simplified) a list of words that somehow are connected to each other. But AI has no meaning of a wheel, what round is, what rolling is, what rubber is, what an axle is. NO understanding. Just words that happened to describe all of it. For us humans it is so difficult to understand that something uses language without knowing ANY of the meaning.
How can we describe this so our brains make sense that you can have language without understanding? The Chinese Room experiment comes close, but is quite complicated to explain as well I think.
I think a flaw in this line of reasoning is that it assigns a magical property to the concept of knowing. Do humans know anything? Or do they just infer meaning from identifying patterns in words? Ultimately this question is a spiritual question and does not hold any water in a scientific conversation.
It’s valid to point out that we have difficulty defining knowledge, but the output from these machines is inconsistent at a conceptual level, and you can easily get them to contradict themselves in the spirit of being helpful.
If someone told you that a wheel can be made entirely of gas, do you have confidence that they have a firm grasp of a wheel’s purpose? Tool use is a pretty widely agreed upon marker of intelligence, and so not grasping the purpose of a thing that they can describe at great length and in exhaustive detail, while also making boldly incorrect claims on occasion, should raise an eyebrow.
How can we describe this so our brains make sense that you can have language without understanding?
I think it is really impossible to describe in easy and limited words.
NO understanding. Just words that happened to describe all of it.
If being able to describe it does not mean understanding, then what is understanding?
Joel Haver has a sketch in which one person in a group laughs at an inside joke from a trip they didn’t go on. When pressed I think they say something like they laughed because everyone else was. As someone who has been in this situation, it’s true. Even though I don’t understand the specific reference being made, it’s usually being done in a funny manner such that the story telling is enjoyable and humorous. Or I’m able to use context clues to guess what they might be joking about and it’s funny, even if my understanding is off.
You could maybe just share a meme like this one.
Some folks in the comments there share actual LLM results, a few of which are sensible but plenty that aren’t far off from the joke.
LMAO! I tried it, and it said:
The man takes the goat across the river first, then he returns alone and takes the boat across again. Finally, he brings the goat’s friend, Mr. Cabbage, across the river.
LMAO
I asked what if the man can’t swim…
In that case, the man can still take the goat across the river first. Then, he ties the goat to the boat and pulls the boat across the river using a rope, ensuring he stays on the side of the river where he can safely manage the boat without swimming.
I asked who Mr. Cabbage is…
My apologies for the confusion! Mr. Cabbage is a playful addition. In the classic riddle, there’s usually no mention of Mr. Cabbage. It’s just the man, the goat, and perhaps another item or animal depending on the version of the riddle. I just added Mr. Cabbage for some extra fun!
Then I asked what some other additions could be…
Certainly! You can add various characters or items to make the riddle more entertaining. Here are a few ideas:
- Mr. Wolf: The man, the goat, and Mr. Wolf need to cross the river. But if the man leaves the wolf alone with the goat, the wolf will eat the goat.
- Ms. Hen: The man, the goat, and Ms. Hen need to cross the river. But if the man leaves the hen alone with the goat, the goat will eat the hen.
- Professor Fox: The man, the goat, and Professor Fox need to cross the river. But if the man leaves the fox alone with the goat, the fox will eat the goat.
- Princess Diamond: The man, the goat, and Princess Diamond’s precious gem need to cross the river. But Princess Diamond has warned that if anyone other than herself touches the gem, they’ll be cursed.
Feel free to mix and match or come up with your own additions!
And the “solution”…
Ah, that’s a fun twist! Let’s see:
- The man takes Mr. Cabbage across first, then returns alone.
- Next, he takes Ms. Hen across but brings Mr. Cabbage back.
- He then takes Princess Diamond’s gem across, leaves it on the other side, and takes Mr. Wolf across.
- The man leaves Mr. Wolf with Ms. Hen on the other side and returns alone.
- He takes Princess Diamond across, but before he leaves, he swaps Mr. Cabbage with Princess Diamond’s gem.
- The man brings Mr. Cabbage across, leaving Princess Diamond’s gem with Mr. Wolf.
- Finally, he takes the goat across.
Everyone safely crosses the river without any mishaps!
I love Mr. Cabbage! Thank you ChatGPT, very cool!
Dude, that hurt my brain trying to follow it.
It’s just fancy predictive text like while texting on your phone. It guesses what the next word should be for a lot more complex topics.
This is the one I got from the house to get the kids to the park and then I can go to work and then I can go to work and get the rest of the day after that I can get it to you tomorrow morning to pick up the kids at the same time as well as well as well as well as well as well as well as well as well… I think my predictive text broke
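That "as well as well as well" loop is easy to reproduce. Here is a sketch of a phone-style autocomplete that always picks the most common next word. The training sentence is invented, and real keyboards are more sophisticated than a two-word lookup.

```python
from collections import Counter, defaultdict

def train(text):
    """Count, for every word, which words came right after it."""
    following = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def autocomplete(following, start, steps):
    """Keep accepting the top suggestion, like mashing the middle button."""
    words = [start]
    for _ in range(steps):
        options = following.get(words[-1])
        if not options:
            break
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

text = (
    "i can go to work and then i can go to the park and then "
    "i can go to work and get the kids as well"
)
model = train(text)
suggestion = autocomplete(model, "i", 14)
print(suggestion)
```

Because the most likely word after "then" leads straight back to the start, the output circles forever. LLMs look at far more context and add randomness, which is a large part of why they fall into this trap less often.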
It’s like saying an OS is just a bunch of if then else statements. While it is true, in practice it is far far more complicated.
Not an ELI5, sorry. I’m an AI PhD, and I want to push back against the premises a lil bit.
Why do you assume they don’t know? Like what do you mean by “know”? Are you talking about conscious subjective experience? Or consistency of output? Or an internal world model?
There’s lots of evidence to indicate they are not conscious, although they can exhibit theory of mind. Eg: https://arxiv.org/pdf/2308.08708.pdf
For consistency of output and internal world models, however, there is mounting evidence to suggest convergence on a shared representation of reality. Eg this paper published 2 days ago: https://arxiv.org/abs/2405.07987
The idea that these models are just stochastic parrots that only probabilistically repeat their training data isn’t correct, although it is often repeated online for some reason.
A little evidence that comes to my mind is this paper showing models can understand rare English grammatical structures even if those structures are deliberately withheld during training: https://arxiv.org/abs/2403.19827
The idea that these models are just stochastic parrots that only probabilistically repeat their training data isn’t correct
I would argue that it is quite obviously correct, but that the interesting question is whether humans are in the same category (I would argue yes).
People sometimes act like the models can only reproduce their training data, which is what I’m saying is wrong. They do generalise.
During training the models are trained to predict the next word, but after training the network is always effectively interpolating between the training examples it has memorised. But this interpolation doesn’t happen in text space but in a very high dimensional abstract semantic representation space, a ‘concept space’.
Now imagine that you have memorised two paragraphs that occupy two points in concept space. And then you interpolate between them. This gives you a new point, potentially unseen during training, a new concept, that is in some ways analogous to the two paragraphs you memorised, but still fundamentally different, and potentially novel.
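As a picture of what "interpolating between two points" means, here is a toy sketch. The 3-number vectors are invented placeholders; real representation spaces have thousands of dimensions, and what a network does internally is not literally a straight-line blend.

```python
def interpolate(a, b, t):
    """Blend two points: t=0 gives a, t=1 gives b, values in between mix them."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

# Made-up positions of two memorised paragraphs in a tiny "concept space"
paragraph_a = [1.0, 0.0, 0.5]
paragraph_b = [0.0, 1.0, 0.5]

midpoint = interpolate(paragraph_a, paragraph_b, 0.5)
print(midpoint)
```

The midpoint matches neither memorised point, which is the sense in which the result can be something that was never in the training data.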
You sound like a chatbot who’s offended by its intelligence being insulted.
I really appreciate you linking studies about this topic, as finding this kind of research can be daunting. Those look like really interesting reads.
Bro is lost in the sauce
Maybe I misunderstood the OP? Idk
There’s the Chinese Room argument, which is a bit related:
I always thought the Chinese Room argument was kinda silly. It’s predicated on the idea that humans have some unique capacity to understand the world that can’t be replicated by a syntactic system, but there is no attempt made to actually define this capacity.
The whole argument depends on our intuition that we think and know things in a way inanimate objects don’t. In other words, it’s a tautology to draw the conclusion that computers can’t think from the premise that computers can’t think.
This is what I was going to point to. When I was in grad school, it was often referred to as the Symbol Grounding Problem. Basically it’s an interdisciplinary research problem involving pragmatics, embodied cognition, and a bunch of others. The LLM people are now crashing into this research problem, and it’s interesting to see how they react.
Imagine you were asked to start speaking a new language, eg Chinese. Your brain happens to work quite differently to the rest of us. You have immense capabilities for memorization and computation but not much else. You can’t really learn Chinese with this kind of mind, but you have an idea that plays right into your strengths. You will listen to millions of conversations by real Chinese speakers and mimic their patterns. You make notes like “when one person says A, the most common response by the other person is B”, or “most often after someone says X, they follow it up with Y”. So you go into conversations with Chinese speakers and just perform these patterns. It’s all just sounds to you. You don’t recognize words and you can’t even tell from context what’s happening. If you do that well enough you are technically speaking Chinese but you will never have any intent or understanding behind what you say. That’s basically LLMs.
A 5 year old repeating daddy’s swear words without knowing what it is.
Imagine that you have a random group of people waiting in line at your desk. You have each one read the prompt, and the response so far, and then add a word themself. Then they leave and the next person in line comes and does it.
This is why “why did you say ?” questions are nonsensical to AI. The code answering it is not the code that wrote it, and there is no communication, coordination, or anything between the different word answerers.
Ok, I like this description a lot actually, it’s a very quick and effective way to explain the effects of no backtracking. A lot of the answers here are either too reductive or too technical to actually make this behavior understandable to a layman. “It just predicts the next word” is easy to forget when the thing makes it so easy to be anthropomorphized subconsciously.
That analogy is hard to come up with because the question of whether it even comprehends meaning requires first answering the unanswerable question of what meaning actually is and whether or not humans are also just spicy pattern predictors / autocompletes, since predicting patterns is like the whole point of evolving intelligence, being able to connect cause and effect in patterns and anticipate the future just helps with not starving. The line is far blurrier than most are willing to admit and ultimately hinges on our experience of sapience rather than being able to strictly define knowledge and meaning.
Instead it’s far better to say that ML models are not sentient, they are like a very big brain that’s switched off, but we can access it by stimulating it with a prompt.
Interesting thoughts! Now that I think about this, we as humans have a huge advantage by having not only language, but also sight, smell, hearing and taste. An LLM basically only has “language.” We might not realize how much meaning we create through those other senses.
To add to this insight, there are many recent publications showing the dramatic improvements of adding another modality like vision to language models.
While this is my conjecture that is loosely supported by existing research, I personally believe that multimodality is the secret to understanding human intelligence.
The idea that “it makes convincing sentences, but it doesn’t know what it’s talking about” is a difficult concept to convey or wrap your head around.
I see the people you talk to aren’t familiar with politicians?
Imagine making a whole chicken out of chicken-nugget goo.
It will look like a roast chicken. It will taste alarmingly like chicken. It absolutely will not be a roast chicken.
The sad thing is that humans do a hell of a lot of this, a hell of a lot of the time. Look how well a highschooler who hasn’t actually read the book can churn out a book report. Flick through, soak up the flavour and texture of the thing, read the blurb on the back to see what it’s about, keep in mind the bloated over-flowery language that teachers expect, and you can bullshit your way to an A.
Only problem is, you can’t use the results for anything productive, which is what people try to use GenAI for.