I have many conversations with people about Large Language Models like ChatGPT and Copilot. The idea that “it makes convincing sentences, but it doesn’t know what it’s talking about” is a difficult concept to convey or wrap your head around, because the sentences are so convincing.
Any good examples of how to explain this in simple terms?
Edit: some good answers already! I find especially that the emotional barrier is difficult to break. If an AI says something malicious, our brain immediately jumps to “it has intent”. How can we explain this away?
idea that “it makes convincing sentences, but it doesn’t know what it’s talking about”
Like a teenager who has come into a new group and is now trying so hard to fit in :-)
I think we forget this, it’s just doing what people do
I commented something similar on another post, but this is exactly why I find this phenomenon so hard to describe.
A teenager in a new group still has some understanding and has a mind. They know the meaning of many of the words that are said. Sure, some catchphrases might be new, but general topics shouldn’t be too hard to follow.
This is nothing like genAI. GenAI doesn’t know anything at all. It has (simplified) a list of words that are somehow connected to each other. But AI has no concept of a wheel, of what round is, what rolling is, what rubber is, what an axle is. NO understanding. Just words that happen to describe all of it. For us humans it is so difficult to understand that something uses language without knowing ANY of the meaning.
How can we describe this so our brains make sense that you can have language without understanding? The Chinese Room experiment comes close, but is quite complicated to explain as well I think.
How can we describe this so our brains make sense that you can have language without understanding?
I think it is really impossible to describe in easy and limited words.
NO understanding. Just words that happened to describe all of it.
If being able to describe it does not mean understanding, then what is understanding?
Not an ELI5, sorry. I’m an AI PhD, and I want to push back against the premises a lil bit.
Why do you assume they don’t know? Like what do you mean by “know”? Are you talking about conscious subjective experience? Or consistency of output? Or an internal world model?
There’s lots of evidence to indicate they are not conscious, although they can exhibit theory of mind. Eg: https://arxiv.org/pdf/2308.08708.pdf
For consistency of output and internal world models, however, there is mounting evidence to suggest convergence on a shared representation of reality. Eg this paper published 2 days ago: https://arxiv.org/abs/2405.07987
The idea that these models are just stochastic parrots that only probabilistically repeat their training data isn’t correct, although it is often repeated online for some reason.
A little evidence that comes to my mind is this paper showing models can understand rare English grammatical structures even if those structures are deliberately withheld during training: https://arxiv.org/abs/2403.19827
You sound like a chatbot who’s offended by its intelligence being insulted.
The idea that these models are just stochastic parrots that only probabilisticly repeat their training data isn’t correct
I would argue that it is quite obviously correct, but that the interesting question is whether humans are in the same category (I would argue yes).
People sometimes act like the models can only reproduce their training data, which is what I’m saying is wrong. They do generalise.
The models are trained to predict the next word, but after training the network is effectively always interpolating between the training examples it has memorised. This interpolation doesn’t happen in text space, though, but in a very high dimensional abstract semantic representation space, a ‘concept space’.
Now imagine that you have memorised two paragraphs that occupy two points in concept space. And then you interpolate between them. This gives you a new point, potentially unseen during training, a new concept, that is in some ways analogous to the two paragraphs you memorised, but still fundamentally different, and potentially novel.
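To make that concrete, here is a toy sketch of “interpolating in concept space”: two memorised items treated as points (embedding vectors) that you can blend between. The vectors are made up purely for illustration; real models use thousands of learned dimensions.

```python
# Toy sketch: treat two memorised paragraphs as points in a tiny
# made-up "concept space" and blend between them.
def interpolate(a, b, t):
    # Linear interpolation: t=0 gives a, t=1 gives b, 0 < t < 1 a blend.
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

paragraph_a = [1.0, 0.0, 0.5]   # hypothetical embedding of paragraph A
paragraph_b = [0.0, 1.0, 0.5]   # hypothetical embedding of paragraph B

midpoint = interpolate(paragraph_a, paragraph_b, 0.5)
print(midpoint)  # [0.5, 0.5, 0.5] — a "new" point that is neither A nor B
```

The midpoint is a point the model never memorised, which is the sense in which interpolation can produce something novel.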
Bro is lost in the sauce
Maybe I misunderstood the OP? Idk
Harry Frankfurt’s influential 2005 book (based on his influential 1986 essay), On Bullshit, offered a description of what bullshit is.
When we say a speaker tells the truth, that speaker says something true that they know is true.
When we say a speaker tells a lie, that speaker says something false that they know is false.
But bullshit is when the speaker says something to persuade, not caring whether the underlying statement is true or false. The goal is persuasion, regardless of whether the underlying claim is true.
The current generation of AI chat bots are basically optimized for bullshit. The underlying algorithms reward the models for sounding convincing, not necessarily for being right.
It’s your phone’s ‘predictive text’, but if it were trained on the internet.
It can guess what the next word should be a lot of the time, but it’s also easy for it to go off the rails.
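The “predictive text” idea can be shown with a deliberately tiny toy: a bigram counter that picks the most frequent next word from a made-up corpus. This is nothing like a real transformer, just the same “next word from statistics” idea at its smallest.

```python
from collections import Counter, defaultdict

# A toy bigram "language model": it learns nothing except which word
# most often follows which, counted from a tiny made-up corpus.
corpus = (
    "the cat sat on the mat . "
    "the cat sat on the rug . "
    "the cat chased the dog ."
).split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    # Return the statistically most common next word. No meaning involved:
    # the model has no idea what a cat or a mat is.
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # cat
print(predict_next("sat"))  # on
```

Scale the corpus up to the whole internet and the guesses get eerily good, but the mechanism stays “most plausible continuation”, which is exactly how it can go off the rails.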
You could maybe just share a meme like this one.
Some folks in the comments there share actual LLM results, a few of which are sensible but plenty that aren’t far off from the joke.
Dude, that hurt my brain trying to follow it.
It’s like talking to a republican!
A 5 year old repeating daddy’s swear words without knowing what it is.
Imagine that you have a random group of people waiting in line at your desk. You have each one read the prompt, and the response so far, and then add a word themself. Then they leave and the next person in line comes and does it.
This is why “why did you say that?” questions are nonsensical to an AI. The code answering them is not the code that wrote the original text, and there is no communication, coordination, or anything else between the different word-answerers.
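The “people in line” picture can be sketched as a loop where each step is a stateless call that only sees the text so far. `pick_next_word` below is a made-up stand-in for the model, not a real API:

```python
def pick_next_word(text_so_far: str) -> str:
    # Stand-in for the model: a fixed lookup, no memory, no intent.
    canned = {"roses are": "red", "roses are red": "violets"}
    return canned.get(text_so_far, "...")

def generate(prompt: str, steps: int) -> str:
    text = prompt
    for _ in range(steps):
        # Each iteration is a fresh "person in line": no state survives
        # between steps, so there is nobody left afterwards to ask
        # "why did you say that?"
        text = text + " " + pick_next_word(text)
    return text

print(generate("roses are", 2))  # roses are red violets
```

Each call gets only the text so far and contributes one word, which is why there is no single “speaker” to hold accountable.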
Ok, I like this description a lot actually, it’s a very quick and effective way to explain the effects of no backtracking. A lot of the answers here are either too reductive or too technical to actually make this behavior understandable to a layman. “It just predicts the next word” is easy to forget when the thing makes it so easy to be anthropomorphized subconsciously.
Imagine you were asked to start speaking a new language, eg Chinese. Your brain happens to work quite differently to the rest of us. You have immense capabilities for memorization and computation but not much else. You can’t really learn Chinese with this kind of mind, but you have an idea that plays right into your strengths. You will listen to millions of conversations by real Chinese speakers and mimic their patterns. You make notes like “when one person says A, the most common response by the other person is B”, or “most often after someone says X, they follow it up with Y”. So you go into conversations with Chinese speakers and just perform these patterns. It’s all just sounds to you. You don’t recognize words and you can’t even tell from context what’s happening. If you do that well enough you are technically speaking Chinese but you will never have any intent or understanding behind what you say. That’s basically LLMs.
All of this also touches on an interesting topic: what does it really mean to understand something? Just because you know stuff and may even be able to apply it in flexible ways, does that count as understanding? I’m not a philosopher, so I don’t even know how to approach something like this.
Anyway, I think the main difference is the lack of personal experience about the real world. With LLMs, it’s all second hand knowledge. A human could memorize facts like how water circulates between rivers, lakes and clouds, and all of that information would be linked to personal experiences, which would shape the answer in many ways. An LLM doesn’t have such experiences.
Another thing would be reflecting on your experiences and knowledge. LLMs do none of that. They just speak whatever “pops in their mind”, whereas humans usually think before speaking… Well at least we are capable of doing that even though we may not always take advantage of this super power. Although, the output of an LLM can be monitored and abruptly deleted as soon as it crosses some line. It’s sort of like mimicking the thought processes you have inside your head before opening your mouth.
Example: explain what it feels like to have an MRI taken of your head. If you haven’t actually experienced that yourself, you’ll have to rely on second hand information. In that case, the explanation will probably be a bit flimsy. But imagine you’ve also read all the books, blog posts and reddit comments about it, and you’re able to reconstruct a fancy explanation regardless.
This lack of experience may hurt the explanation a bit, but an LLM doesn’t have any experiences of anything in the real world. It has only second hand descriptions of all those experiences, and that will severely hurt all explanations and reasoning.
I feel like you’re already not getting it and therefore giving too much credit to the LLM.
With LLMs it’s not even about second hand knowledge; the concept of knowledge does not apply to LLMs at all. It’s literally just about statistics, e.g. what is the most likely next output after this token.
You could argue that embeddings constitute some kind of stored knowledge. But I do agree with your larger point: LLMs are getting too much credit because of the language we use to describe them.
The idea that “it makes convincing sentences, but it doesn’t know what it’s talking about” is a difficult concept to convey or wrap your head around.
I see the people you talk to aren’t familiar with politicians?
It’s like your 5 year old daughter, relaying to you what she made of something she heard earlier.
That’s my analogy. ChatGPT has roughly a 5 year old’s intellect and ability to differentiate between fact and fiction. But it combines that with the writing style of a 40 year old with an uncanny love of mixing adjectives and sounding condescending.
Imagine making a whole chicken out of chicken-nugget goo.
It will look like a roast chicken. It will taste alarmingly like chicken. It absolutely will not be a roast chicken.
The sad thing is that humans do a hell of a lot of this, a hell of a lot of the time. Look how well a highschooler who hasn’t actually read the book can churn out a book report. Flick through, soak up the flavour and texture of the thing, read the blurb on the back to see what it’s about, keep in mind the bloated over-flowery language that teachers expect, and you can bullshit your way to an A.
Only problem is, you can’t use the results for anything productive, which is what people try to use GenAI for.
Like a kid trying very hard to sound like everyone else. “Eloquent bullshit generator”
So there are two different things to what you are asking.
(1) They don’t know what (i.e. semantically) they are talking about.
This is probably not the case, and there’s very good evidence over the past year in research papers and replicated projects that transformer models do pick up world models from the training data such that they are aware and integrating things at a more conceptual level.
For example, even a small toy GPT model trained only on chess moves builds an internal structure of the whole board and tracks “my pieces” and “opponent pieces.”
(2) Why do they say dumb shit that’s clearly wrong and not know it?
They aren’t knowledge memorizers. They are very advanced pattern extenders.
Where the answer to a question is part of the pattern they can successfully extend, they get the answer correct. But if it isn’t, they confabulate an answer in a similar way to stroke patients who don’t know that they don’t know the answer to something and make it up as they go along. Similar to stroke patients, you can even detect when this is happening with a similar approach (ask 10x and see how consistent the answer is or if it changes each time).
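The “ask 10x and see how consistent the answer is” check sketches easily. `ask_model` below is a hypothetical stand-in, not a real LLM call: it answers a well-covered question consistently and confabulates a fresh answer to an obscure one.

```python
import random
from collections import Counter

def ask_model(question: str) -> str:
    # Hypothetical stand-in for an LLM call: a well-covered question gets
    # a stable answer; an obscure one gets a different confabulation each time.
    if question == "What is the capital of France?":
        return "Paris"
    return random.choice(["1842", "1865", "1871", "1903"])

def consistency(question: str, n: int = 10) -> float:
    # Ask the same question n times and return the fraction of samples
    # that agree with the most common answer.
    answers = [ask_model(question) for _ in range(n)]
    return Counter(answers).most_common(1)[0][1] / n

print(consistency("What is the capital of France?"))        # 1.0
print(consistency("What year was the treaty signed?"))      # usually well below 1.0
```

High agreement across repeated samples suggests the pattern really covers the answer; scattered answers suggest confabulation.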
They aren’t memorizing the information like a database. They are building ways to extend input into output in ways that match as much information as they can be fed. In this, they are beyond exceptional. But they’ve been kind of shoehorned into the initial tech demo usecase of “knowledgeable chatbot” which is a less than ideal use. The fact they were even good at information recall was a surprise to most researchers.
Thanks for your thorough answer.
I’ll see if I can find that article/paper about the chess moves. That sounds interesting!
Could it be that we ascribe conceptual knowledge to an LLM while in fact it is by chance? We as humans are masters at seeing patterns that aren’t there. But then again, like another commenter said, maybe the question is more about consciousness itself, and what that actually means. What it means to “understand” something.
So the paper that found that particular bit in Othello was this one: https://arxiv.org/abs/2310.07582
Which was building off this earlier paper: https://arxiv.org/abs/2210.13382
And then this was the work replicating it in Chess: https://www.lesswrong.com/posts/yzGDwpRBx6TEcdeA5/a-chess-gpt-linear-emergent-world-representation
It’s not by chance - there are literally interventions where flipping a weight or vector results in the opposite behavior (like acting as if a piece is in a different place, or playing well or badly no matter the previous moves).
But it’s more that it seems unlikely there’s any actual ‘feeling’ or conscious sentience there to do the understanding, beyond the model knowing what the abstracted pattern means in relation to the inputs and outputs. It probably is simulating some form of ego and self, but not actively experiencing it, if that makes sense.