What so many people don’t understand: LLMs like ChatGPT are nothing but statistical engines. They break their incoming text into tokens and learn which tokens usually follow which others. When they generate output, they just roll the dice: after tokens A, B, and C, a D usually comes next.
The point is: they have no understanding. If their training data included a good code example, they might regurgitate it. If their training data included broken code, they may regurgitate that. Or they could mix it all together and produce something weird. It’s a lottery, based on what they sucked out of Stack Overflow and other places.
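To make the "which tokens usually follow which others" idea concrete, here is a toy sketch in Python. It is only an illustration of the counting-and-dice-rolling picture described above: the function names `train` and `next_token`, the three-token context, and the tiny corpus are all made up for this example, and a real LLM uses a trained neural network rather than a lookup table like this.

```python
import random
from collections import Counter, defaultdict

def train(tokens, context_size=3):
    """Count which token follows each run of `context_size` tokens."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - context_size):
        context = tuple(tokens[i:i + context_size])
        counts[context][tokens[i + context_size]] += 1
    return counts

def next_token(counts, context, rng):
    """Roll the dice: pick a follower in proportion to how often it was seen."""
    followers = counts.get(tuple(context))
    if not followers:
        return None  # this context never appeared in the training data
    choices, weights = zip(*followers.items())
    return rng.choices(choices, weights=weights, k=1)[0]

# Hypothetical "training data": after A, B, C comes D twice and E once.
corpus = "A B C D A B C D A B C E".split()
counts = train(corpus)

print(counts[("A", "B", "C")])  # Counter({'D': 2, 'E': 1})

rng = random.Random(0)
print(next_token(counts, ["A", "B", "C"], rng))  # usually D, sometimes E
```

Whatever followed A, B, C in the corpus is all this toy can ever produce, good or bad, and a context it has never seen gives it nothing to say at all.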