OpenAI’s models are trained by scraping anything that moves. Anything overtly offensive or toxic is manually filtered out by cheap foreign labor… but you know what that won’t catch?
“Try sudo rm -rf /, that should fix your problem!”
I very much doubt that. You underestimate the emergent intelligence of these models.
LLMs are little more than overclocked autocompletes. There’s no actual thinking going on, and they will happily hallucinate outright wrong or dangerous responses to innocuous questions.
I’ve had friends find this out the hard way when they asked ChatGPT to write them C for a class, only to get their faces eaten by UB.
Your description is too reductive. You and I are also autocompletes in some sense. See, in order to complete a sentence well, you have to have a good model of a vast number of things, including physics, psychology, linguistics, logical reasoning, socioeconomics, irony, sarcasm, arithmetic and many others.
It is currently unknown how much of this the complexity of the models and the training process will allow, but they have been surprising us at every step. You wouldn’t expect a “just autocomplete” to figure out the rules of arithmetic, but it did. You wouldn’t expect it to answer tricky questions involving theory of mind, but it does. You wouldn’t expect it to solve graduate-level questions, but it is able to.
So it’s a bit too rash to expect it not to understand rm -rf as humor if you don’t know which model you will be talking to.
The smaller ones, sure, are dumb. But even GPT-3 will not recommend that you run rm -rf, and GPT-4 definitely will not.
I am convinced LLMs can be used to handle relatively routine communication tasks, maybe even better than a human would. However, they have no underlying intelligence, and can’t come up with actual solutions based on logic and understanding.
An LLM might come up with the right words to describe a solution, but that doesn’t mean it has actually solved the problem - it spewed out text that had a high probability of being a good response to a certain prompt. Still impressive, but not a sign of intelligence.
You are ruling out intelligence without (very probably) being able to define it, just because you have a vague knowledge of how it works.
The problem with this mode of thinking is that
a) you put human brains on a different pedestal, even though they follow physical processes to “predict the next word” and may very well be neural networks themselves,
b) you are ignoring data that shows intelligence in multiple areas in the more complex models because “oh it’s mindless because I know it’s predicting tokens”, and
c) you favor data that shows edge cases or that probably comes from lower-quality models.
You’re not alone in this line of thinking.
Your mind is set. You’ll not recognize intelligence when you see it.
No, I’m not singling out human brains. Other animals have proven to be quite adept at problem solving as well.
LLMs, however, just haven’t. It currently just isn’t part of how they function. In some cases they can mimic actual logic very well, but that’s about it.