In Leak, Facebook Partner Brags About Listening to Your Phone’s Microphone to Serve Ads for Stuff You Mention

pooh [she/her, love/loves]@hexbear.net · 5 months ago


Camdat [none/use name]@hexbear.net · edit-2 5 months ago

This is maybe my biggest pet peeve. These companies are not listening to you in any meaningful way.

You can trivially confirm this by hooking up your home network to Wireshark and filtering packets.

Other reasons:

They can get all of this information elsewhere: searches, ad pixels, location capturing etc.
Processing audio data is basically impossible on-device in a useful way, and the network infrastructure to support mass transcription in the cloud would cost on the order of billions.
It would be a massive endeavor to cover up the millions of hours of audio data that would need to be analyzed by the lowest paid and most unhappy workers in the industry (content labelers and moderators).

Now I’m sure this is some marketer’s wet dream, but the logistical and PR nightmare this would create dissuades all but the dumbest ad agencies. This is mostly just terrible tech journalism.

blame [they/them]@hexbear.net · 5 months ago

Not that I disagree with your conclusion, because there’s an even simpler way to check if an app is listening: iOS and Android will tell you the mic is being used… Anyway, we do have always-on NNs listening for keywords (“Siri”, “Hey Google”, “Alexa”), so while I agree that full-ass voice transcription like Whisper will run like dogshit on your phone, they can certainly run a much, much lighter model to pick up a handful of keywords.

bortsampson [he/him, any]@hexbear.net · 5 months ago

deleted by creator

blame [they/them]@hexbear.net · 5 months ago

To Camdat’s point, a general transcription is definitely not low power even if you have some kind of gating on when it transcribes. Obviously Apple and Google and Samsung and whoever makes the phone can turn on the mic without you knowing, otherwise how would their voice assistant work, but Apple probably isn’t letting Facebook have access to the mic without throwing something up on the status bar.

bortsampson [he/him, any]@hexbear.net · 5 months ago

deleted by creator

Camdat [none/use name]@hexbear.net · 5 months ago

WhatsApp is sending your audio to the cloud to handle transcription. This is not an accurate test because it is not an on-device process.

bortsampson [he/him, any]@hexbear.net · 5 months ago

deleted by creator

Camdat [none/use name]@hexbear.net · 5 months ago

Sure, this is definitely true. I should clarify that single-word NNs do run on-device all the time, but those require specialized models that are trained only on those keywords. Once those models trigger, they need to send everything else to the cloud.

blame [they/them]@hexbear.net · edit-2 5 months ago

I agree. If I was going to do something like this for advertising, though, I wouldn’t really care too much about what people were saying. Instead I’d just listen for some limited set of keywords (maybe for some of my top paying advertisers) and serve ads for keywords that hit recently. Keep it all on device until an ad actually needs to be served.
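
A rough sketch of that idea, assuming some lightweight keyword spotter already exists and hands us one detected word at a time. `KeywordHitLog`, its methods, and the one-hour window are made-up names and values for illustration, not anything a real ad SDK is known to ship:

```python
import time
from collections import deque


class KeywordHitLog:
    """Hypothetical on-device log of recent advertiser keyword hits."""

    def __init__(self, keywords, window_seconds=3600):
        # Only this limited set of keywords is ever stored.
        self.keywords = {k.lower() for k in keywords}
        self.window_seconds = window_seconds
        self.hits = deque()  # (timestamp, keyword), oldest first

    def record(self, detected_word, now=None):
        """Store a hit only if the word is on the keyword list."""
        now = time.time() if now is None else now
        word = detected_word.lower()
        if word in self.keywords:
            self.hits.append((now, word))

    def recent_keywords(self, now=None):
        """Drop expired hits, then return the keywords still in the window."""
        now = time.time() if now is None else now
        while self.hits and now - self.hits[0][0] > self.window_seconds:
            self.hits.popleft()
        return sorted({word for _, word in self.hits})
```

Nothing here touches the network: `recent_keywords()` would only be consulted at the moment an ad needs to be served.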

RyanGosling [none/use name]@hexbear.net · 5 months ago

Not to mention cross site trackers owned by Google and Facebook.

Camdat [none/use name]@hexbear.net · edit-2 5 months ago

I think people greatly underestimate (or misunderstand) the pervasiveness of ad tracking pixels.

Basically any website that has ads or tries to sell you something has a tracking pixel. These pixels create profiles of devices and track almost everything you do while interacting with those sites.

These pixels don’t require any actual “information” about you, they’re only interested in what you (via the device you’re browsing on) will buy. They also don’t use cookies anymore, it’s usually a combination of user agent, IP address, and coarse location. As you said, companies will generally share these profiles.
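
As a minimal sketch of how a cookieless profile key could be derived from just those three signals: the function name, the field separator, and the choice of SHA-256 are assumptions for illustration, and real trackers use their own (usually richer) signal sets.

```python
import hashlib


def device_profile_id(user_agent: str, ip_address: str, coarse_location: str) -> str:
    """Hypothetical profile key built from user agent, IP, and coarse location."""
    # Normalize lightly so trivial whitespace/case differences do not split a profile.
    raw = "|".join([
        user_agent.strip(),
        ip_address.strip(),
        coarse_location.strip().lower(),
    ])
    # A stable hash lets different sites arrive at the same key without a cookie.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]
```

The same device hitting two unrelated sites with the same user agent, IP, and city would produce the same key, which is what makes the profiles shareable.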

hotcouchguy [he/him]@hexbear.net · 5 months ago

Eh, I dunno. I remember making exactly those points 20 years ago, but I think it’s pretty feasible now. There are open source NNs that look like they can do this locally on mediocre phones. And if the output is garbage quality, that’s ok, it just has to be good enough to sell some ads. I think it’s largely feasible, although I’m sure it’s inflated by startups looking to impress clients and investors.

Camdat [none/use name]@hexbear.net · 5 months ago

Feel free to Wireshark your smart devices and confirm what I’ve said yourself. The most efficient way to do this is the pixels that already exist on almost every site.

On-device NNs use insane amounts of processing, even on “high-end” phones. You would notice if there was an always-on NN running on your device. This is also something you can try for yourself.

hotcouchguy [he/him]@hexbear.net · edit-2 5 months ago

And what exactly am I looking for in Wireshark? A few KB of encrypted text data occasionally sent to who-knows-where? Mixed in among a flood of other tracking bullshit and general wasteful bloat? Yeah lemme go check real quick.

Computationally, we’ve had low-quality speech to text on home PCs for like 30 years, and we’ve had OK-quality NN implementations for like 15 years. Yes it would be a bit wasteful, but a trimmed-down NN could easily hide among the general bloat of modern software.

Yes it would be kind of a clunky and impractical way to collect data compared to other methods, but it’s definitely plausible that an adtech startup could hack together a semi-functional version of this and then slap it in a slide deck. It would let them say “AI” more times during their pitch.

Camdat [none/use name]@hexbear.net · 5 months ago

You can filter by device. Leave your suspect device connected to your network for a few days, filter by destination and review. Also keep an eye on CPU usage.

If your devices have a ton of random outgoing network requests you’re already being tracked in a myriad of other ways and need to lock your shit down.

I’ve done this before, there’s not as much network bloat as you might think.
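
If you export the capture to plain records, the "filter by destination and review" step can be sketched like this. The `(source, destination, size)` tuple layout and the function name are assumptions for illustration, not a Wireshark export format; in Wireshark itself a display filter such as `ip.src == 192.168.1.50` narrows the view to one device.

```python
from collections import Counter


def bytes_by_destination(packets, device_ip):
    """Total outgoing bytes per destination for one device, largest first.

    `packets` is an iterable of (source_ip, destination, size_in_bytes) tuples.
    """
    totals = Counter()
    for src, dst, size in packets:
        if src == device_ip:  # keep only traffic leaving the suspect device
            totals[dst] += size
    return totals.most_common()
```

Anything near the top of that list that you can’t account for is what you’d dig into.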

ganymede · edit-2 5 months ago

it sounds like you have enough knowledge to know it’s almost impossible for an individual to assert it absolutely 100% isn’t happening.

imo if you make an honest effort to break the technical problem down you will arrive at a different conclusion - or at the very least not be nearly so bold as to allow this to be an emotional peeve.

consider forgetting the propaganda the media has subjected you to, and most importantly forget whether you do or don’t want it to be true. approach the problem from a purely technical perspective while considering these companies can hire hundreds of very smart people from a variety of subdisciplines. recall these companies have virtually bottomless greed and almost exactly 0 morals.

Camdat [none/use name]@hexbear.net · edit-2 5 months ago

The Internet and smartphones are not mystical devices. This is something you can independently confirm yourself very easily.

I have the knowledge necessary to say this 100% does not occur on devices that I own.

ganymede · edit-2 5 months ago

The Internet and smartphones are not mystical devices.

Whether they’re mystical or not is an entirely different conversation ;p

This is something you can independently confirm yourself very easily…

you are vastly understating how non-trivial this task is. or you are allowing your emotional desires to cloud your technical analysis.

teams of experts put in months at a time to assess only a fraction of the required scope. these experts are putting in so much time while admitting they couldn’t achieve full coverage despite having financial backing & well trained teams. it’s reasonably unlikely so many experts would dedicate so much time & resources if it’s such an easy thing to independently confirm.

suppose Camdat and ganymede were sitting with one of their nontechnical friends, and the friend says “hey my stock smart device which i only use with facebook and a few things seemed like it eavesdropped on my voice about <common product/brand>”, and swears they didn’t reveal it via some other channel etc. blah blah, we’ve all heard it many times.

if you, Camdat, listed all the reasons why the same phenomenon can likely be attributed to a variety of other surveillance and correlation methods, some of which are arguably at least as scary, i would likely agree with every single thing you said.

imo it’s wiser to leave it at that, rather than making the assertion it’s absolutely not happening, or getting frustrated with them for even wondering.

sunshine [none/use name]@hexbear.net · 5 months ago

Your posts in this thread have been very helpful! thank you!

whogivesashit@lemmygrad.ml · 5 months ago

I don’t know if I would say it’s impossible, but in my experience I feel like it’s unlikely.

Also there is similarly a very large pool of impossibly smart people who don’t work for these companies, and who spend a lot of time looking for all kinds of nefarious stuff like this. It would be very unlikely that they could hide something of this nature from the entire world of people who own these devices.

ganymede · edit-2 5 months ago

in each of the studies i’ve read, if you dig past the popsci headlines reported in the media and into the actual academic claims being made, every one i’ve read has been quite upfront about the limits of the study and how they’ve been unable to achieve full scope to absolutely rule it out. if you know of any absolutely conclusive full binary analysis please link.

tbh i don’t mind people saying they think it’s not happening, or that it’s unlikely etc

saying it’s absolutely not happening is a very different thing. and a very difficult assertion to justify.

it’s always something like “it’s impossible cos it’s too much data to record everyone 24/7/365” when even a tiny bit of common sense (even if one knows nothing about computers, networks or even audio) could quickly conceive of the idea that some simple mechanism might detect noise thresholds and not need to record 24/7. you don’t even need to be technically minded to work that part out.
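
as a minimal sketch of what such a threshold mechanism could look like (plain RMS energy per frame; the function names, frame size and threshold are assumptions for illustration, not a claim about what any app actually does):

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of integer samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def frames_worth_recording(samples, frame_size, threshold):
    """Return the indices of full frames whose RMS level exceeds the threshold."""
    loud = []
    # Step through the signal one non-overlapping frame at a time.
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        if rms(frame) > threshold:
            loud.append(start // frame_size)
    return loud
```

everything below the threshold is simply discarded, so the "record everyone all the time" cost never arises.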

i could go on and dig into the actual technical aspects, but the main point is it’s always some unbelievably contrived scenario. basically fabricating low hanging fruit which is so low it’s underground. and then declaring that not only is everything 100.000% safe, but it’s actually a peeve that you even wondered.

whogivesashit@lemmygrad.ml · 5 months ago

Yeah I’m not familiar with any particular research that completely rules it out.

I don’t think it’s so much that it would be impossible to conceive of them being able to record you in short bursts. It’s more so, the amount of computing power to process even small amounts of audio data on a large scale.

And beyond that, not that I think it’s not possible for that to be done either, but understanding that these are capitalist systems that will engage with whatever is most profitable.

It has already been shown that it is quite easy to track people through all of the other methods already in place and serve those advertisements very well, which is probably much more cost effective than the audio stuff.

Although with the progression of some of these machine learning models, the equation may look a little different before too long.

ganymede · edit-2 5 months ago

I don’t think it’s so much that it would be impossible to conceive of them being able to record you in short bursts.

that’s exactly my point. if there’s an argument to be made over a technical aspect, why undermine it with some nonsensical requirements? imo it really suggests an emotional desire for it not to be true, which just compromises the integrity of any subsequent technical analysis.

as for the actual technical analysis, i’m always up to discuss each aspect of it :)

regarding the computing requirements for audio, this is something well worth looking into.

human vocal frequencies are quite narrowband compared with the audio most people think of with their music, gaming and movies/episodes.

CD quality audio is 16-bit 44.1 kHz sample rate, modern ‘high fidelity’ audio is in the realm of 24-bit 96 kHz or 192 kHz sample rate.

compare with even ancient voice codecs where bandlimited sampling requirements are only 6.6 kHz and 8-bit samples can produce an effective 12-bit response! that’s almost half a century ago btw!

the telecommunications industry has put considerable effort into understanding the human voice and the kinds of margins they can use to be profitable. they can even estimate the differential energy footprint based on different choice of words and tones in a conversation, this stuff has been studied quite a bit, for decades.

therefore the audio computational requirements are quite a bit less than i think a lot of people realise. but we can ofc go deeper with the technical analysis into a variety of subdisciplines for the computational requirements to be substantially reduced even further.
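
to put rough numbers on the figures above, here is the raw (uncompressed, single channel) data rate for each format. the helper name is made up for illustration, and treating the 6.6 kHz figure as a sample rate is an assumption:

```python
def bits_per_second(sample_rate_hz, bit_depth, channels=1):
    """Raw PCM data rate: samples per second times bits per sample times channels."""
    return sample_rate_hz * bit_depth * channels


cd_mono = bits_per_second(44_100, 16)     # CD quality, one channel
hifi_mono = bits_per_second(192_000, 24)  # 'high fidelity', one channel
voice = bits_per_second(6_600, 8)         # narrowband voice figure from above

print(cd_mono, hifi_mono, voice)
print(round(cd_mono / voice, 2), round(hifi_mono / voice, 2))
```

so before any codec compression at all, the voice stream is already more than an order of magnitude smaller than one channel of CD audio.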

understanding that these are capitalist systems

regardless of the reduced costs alluded to above, i think the capitalist system is another insight for us to examine. they are boundlessly greedy, nothing is ever enough.

there’s always been the argument they ‘have enough data already’, (and that is a good argument, because they do have enough).

but when has ‘enough’ ever been sufficient for these systems? they already had cookies, but they wanted tracking pixels. and when they had tracking pixels, they devised browser fingerprinting. but that still wasn’t enough, so they started devising audio beacons, but that wasn’t enough, then they started spying on shopping center wireless traffic. etc etc

it’s never ever enough. when we demand infinite growth on a finite planet, it will never be enough.

and imo it doesn’t actually need to be directly profitable in effect, only to be marketed as such to feed their bottomless appetite. especially when correlated surveillance is highly prized, and an additional channel or medium adds value to the existing gathered surveillance.

Although with the progression of some of these machine learning models, the equation may look a little different before too long.

exactly, imo it’s not a matter of if but when.

and imo if it’s finally revealed, some people will say “no shit”, some powerless people will be upset, but most people will say “i’m not doing anything wrong so i don’t care”.

and i’m willing to bet a bunch of the people currently telling us “it’s impossible” will unironically switch overnight to saying “i always knew they were doing it and it never bothered me”

whogivesashit@lemmygrad.ml · 5 months ago

You make an awful lot of compelling points. I very much appreciate your analysis 😊 thank you

/home/pineapplelover@lemm.ee · 5 months ago

It’s actually really successful. I’ve had some conversations with people and right after, something on their Instagram feed would show something we just talked about. Most recently I made a joke to my friend about my name, next thing on his feed was a meme using my name.

SkingradGuard [he/him, comrade/them]@hexbear.net · 5 months ago

Bad headline then? Huh