- cross-posted to:
- privacy
- cross-posted to:
- privacy
Vechev and his team found that the large language models that power advanced chatbots can accurately infer an alarming amount of personal information about users—including their race, location, occupation, and more—from conversations that appear innocuous.
It doesn’t feel like it actually inferred anything from the comment.
“You spoke about computers, so you probably know about computers”
“You express concerns about privacy, so you are likely privacy conscious”
“You said you were 30ish, so you’re maybe 30…ish”
It essentially paraphrased each part of the comment, and gave it back to you like an analysis. Of course, this is ChatGPT, so it’s likely not trained for this sort of thing.
It identified those elements as things that might be relevant about the person who wrote the comment. Obviously you can’t tell much from just a single comment like this - ChatGPT says as much here - but these elements accumulate as you process more and more comments.
That ballpark estimate of OP’s age, for example, can be correlated to other comments where OP might reference particular pop culture things or old news events. The fact that he’s aware that mouse movements are a thing that you can do biometrics on might become relevant if the AI in question is trying to come up with products to sell - it now knows that this guy may have a desktop computer, since he thinks about computer mice. These things are things that are worth noting in a profile like that.
The paraphrasing is a form of analysis, since it picks out certain relevant things to paraphrase while discarding things that aren’t relevant.