The Spy Pixel problem

Boozilla@lemmy.one · 1 year ago

TheCraiggers@lemmy.world · 1 year ago

I’m confused. How is this any different from simply hosting a picture yourself and tracking all the IP addresses via HTTP fetch logs? Why is Lemmy itself being singled out here? Why do you need some CGI script?
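
To make the question concrete, here is a minimal sketch of the log-based approach: whoever hosts an image can read requester IPs straight out of the web server's access log, with no CGI script involved. The log lines, the `/pixel.png` path, and the `ips_that_fetched` helper are all made up for illustration, and the parser assumes the Common Log Format.

```python
import re
from collections import Counter

# Matches the start of a Common Log Format line:
# client IP, identity, user, [timestamp], "METHOD path protocol"
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*"')


def ips_that_fetched(log_lines, path):
    """Count GET requests per client IP for one hosted file."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        ip, method, requested = match.groups()
        # Ignore any query string when comparing paths.
        if method == "GET" and requested.split("?", 1)[0] == path:
            hits[ip] += 1
    return hits


# Hypothetical log excerpt (documentation-range IP addresses).
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Jul/2023:12:00:01 +0000] "GET /pixel.png HTTP/1.1" 200 68',
    '198.51.100.23 - - [10/Jul/2023:12:00:05 +0000] "GET /pixel.png?post=42 HTTP/1.1" 200 68',
    '203.0.113.7 - - [10/Jul/2023:12:03:40 +0000] "GET /pixel.png HTTP/1.1" 200 68',
    '192.0.2.9 - - [10/Jul/2023:12:04:00 +0000] "GET /index.html HTTP/1.1" 200 512',
]

if __name__ == "__main__":
    for ip, count in ips_that_fetched(SAMPLE_LOG, "/pixel.png").most_common():
        print(ip, count)
```

The only Lemmy-specific part is that the image URL sits inside a post or comment, so every reader's browser fetches it automatically.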

Boozilla@lemmy.one · 1 year ago

I am not a cybersecurity expert. And these are good questions. The problem is certainly not unique to Lemmy.

However, my (limited) understanding of the opposing opinion is: 1. This is bad for privacy (marketers and other bad actors use these to track down your IP and other metadata), and 2. it should have been thought of before now, with some protections already put in place.

Teppic@kbin.social · 1 year ago

It is being discussed - here is a thread from yesterday:
https://kbin.social/m/support@lemmy.world/t/204434/Tracking-Lemmy-users-by-spy-tracker-pixels

And here is an ongoing discussion about a possible remedy:
https://github.com/LemmyNet/lemmy/pull/3550

But it is worth noting that, unlike with email, the ‘view’ isn’t linked to an individual and an email address. Also, broadcasting your IP address (and yes, some metadata) as you browse isn’t unusual. Every page you visit could be doing this, not just Lemmy.
Yes, ideally this should be fixed, but in my view it is also a bit of a storm in a teacup.

Boozilla@lemmy.one · 1 year ago

Thank you, this is exactly the kind of info I was looking for. I figured someone was on top of this and the reddit dipstick was just being overly dramatic as usual.

MangoPenguin@lemmy.blahaj.zone · 1 year ago

There just isn’t any way to prevent a web server from logging IPs if the admin chooses to do so.

Boozilla@lemmy.one · 1 year ago

Right, but I think the difference here is lemmy allows users to embed these in their markdown text.

𝒍𝒆𝒎𝒂𝒏𝒏@lemmy.one · 1 year ago

Raddle user learns how the Internet works 🤯

In all seriousness though, although this is a concern, with email in particular the solution most people choose is to just disable images, so it isn’t really a sincere comparison IMO.

We could maybe mitigate this with…

Proxying & caching - Instance would cache a copy of the commented image and serve it from there, blocking the IP of the user from being exposed. This could introduce some additional latency and fill up server storage faster
CSP Header & Local caching - Client could block the name of the instance from being transferred, and also cache a copy of the image locally. This doesn’t protect the user’s IP address in any way, but would hinder the ability to count how many times a particular IP has viewed/accessed a post
Shared Lemmy image proxies - Image requests are proxied through a randomly selected Lemmy image proxy. This would ‘hide’ the origin IP to all but the volunteer proxy provider. I’d personally be willing to host a few of these if this ever became a thing
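
The first option, proxying and caching, could look roughly like the sketch below. It is not Lemmy's implementation: `CachingImageProxy`, its `fetch` callback, and the example URL are all hypothetical. The instance fetches each remote image once and serves later views from its own copy, so the remote host only ever sees the instance's IP.

```python
import hashlib


class CachingImageProxy:
    """Fetch each remote image once and serve later views from a local copy."""

    def __init__(self, fetch):
        # `fetch` is any callable taking a URL and returning bytes.
        # Injecting it keeps the sketch runnable without network access.
        self._fetch = fetch
        self._cache = {}

    @staticmethod
    def cache_key(url):
        # Hash the URL so the key would also be safe to use as a filename.
        return hashlib.sha256(url.encode("utf-8")).hexdigest()

    def get(self, url):
        key = self.cache_key(url)
        if key not in self._cache:
            # Only this request reaches the remote host, and it comes
            # from the instance, not from the reader's browser.
            self._cache[key] = self._fetch(url)
        return self._cache[key]


if __name__ == "__main__":
    remote_requests = []

    def fake_fetch(url):
        # Stand-in for a real HTTP client; records what the remote host would see.
        remote_requests.append(url)
        return b"image-bytes"

    proxy = CachingImageProxy(fake_fetch)
    for _ in range(3):  # three readers open the same comment
        proxy.get("https://tracker.example/pixel.png")
    print(len(remote_requests))  # the remote host saw a single request
```

The in-memory dict grows without bound, which is the storage cost mentioned above; a real deployment would need eviction and size limits.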

trent@kbin.social · 1 year ago

Why are people pretending this isn’t an issue??? Of course it is lol.
Luckily the fix is also easy: an image proxy server. Mail clients do this already.
It exposes the bigger problem with Lemmy: lack of auditing.

SkyNTP · 1 year ago

Nah, we’re auditing, just live.

For better or worse, security is in the community’s hands. But that’s why we are here in the first place.

phillipp@discuss.tchncs.de · edited 1 year ago

Can someone with more knowledge of the lemmy protocol/API shed some light on this? The way the linked post is written, it seems like some random angry guy just hates lemmy for whatever reason.

To me it seems like a complete BS argument. As far as I can tell, this tactic is possible with every service where users can provide content. Of course I can link to a site that reads users’ data. There’s basically no preventing this unless the (lemmy) clients provide their own modified browser that masks the user’s IP and other metadata.

raphael@kbin.mararead.com · edited 1 year ago

You actually can prevent this easily with CSP (Content Security Policy). That header tells your browser which addresses it is allowed to load additional data from when visiting your site. It is an important tool for preventing cross-site scripting attacks: your browser should not load data from random sources when it is on your site.
Of course you would have to funnel all inline images through a site-local proxy that the browser is allowed to load data from.
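
As a rough illustration of that idea, the sketch below builds a Content-Security-Policy value whose `img-src` directive only allows the site itself, so the browser refuses to load images from third-party hosts. The `build_csp` helper and the `proxy.example` origin are hypothetical; `default-src`, `img-src`, and `'self'` are standard CSP keywords.

```python
def build_csp(extra_image_origins=()):
    """Return a CSP header value that limits images to trusted origins."""
    image_sources = ["'self'", *extra_image_origins]
    directives = {
        "default-src": ["'self'"],
        # Images may only come from the site itself (where the local
        # proxy lives) plus any explicitly trusted origins.
        "img-src": image_sources,
    }
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )


if __name__ == "__main__":
    # Header an instance could send with every HTML response.
    print("Content-Security-Policy:", build_csp())
    # Variant that also trusts a separate, hypothetical proxy host.
    print("Content-Security-Policy:", build_csp(["https://proxy.example"]))
```

Note that `default-src 'self'` also restricts scripts, styles, and other resources, so a real site would likely need more directives than this sketch shows.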

This has implications not only for security but also for the GDPR. Some jurisdictions consider IP addresses personal data. Sending them to, e.g., the US without user consent would be a violation. I know it is stupid to consider IP addresses personal data, and it is stupid to consider a browser loading data as sending that personal data somewhere on the site’s behalf. But there is a reason why a lot of websites, for example, only embed tweets after you explicitly allow it.

Boozilla@lemmy.one · 1 year ago

When it comes to posting on lemmy I’d also consider bringing up that old bromide: don’t post anything you wouldn’t want your mother to see.

At least for now, anyway.

Boozilla@lemmy.one · 1 year ago

random angry guy just hates lemmy for whatever reason

There is definitely some of that at play here. I am hoping some smarter cybersec folks without the anti-lemmy-rage-bias can weigh in on it.

Dojan@lemmy.world · 1 year ago

I think when you link images off-site on Reddit, Reddit still caches a preview and serves that to the user; the user actually has to click a link to leave the platform for the unknown. If we do embeds and such here, they’re loaded directly from off-site without user interaction.

Ergo your browser makes a request to a random potentially dangerous server, and there isn’t much the average user can do to prevent that.

CanOpener@sh.itjust.works · 1 year ago

This is a valid privacy issue, and other fediverse projects like Mastodon already solve this. The problem is that by embedding an image, you can tell the client to make a network request to your server, revealing information such as your IP address and browser. The solution is to proxy media through your instance, which is presumably trusted. This hides your IP address and browser information. And as someone else mentioned here, a Content-Security-Policy can be used to ensure this attack isn’t possible in a browser.
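
A minimal sketch of that proxying step, assuming the instance exposes an image proxy at a made-up `/image_proxy?url=` path (this is not a confirmed Lemmy or Mastodon endpoint, and `proxy_images` is a hypothetical helper): before a comment is rendered, every external Markdown image URL is rewritten to point at the instance, so the browser never contacts the third-party host.

```python
import re
from urllib.parse import quote, urlsplit

# Plain Markdown image syntax: ![alt text](url)
# Simplified on purpose: images with a title or with ")" in the URL
# are not handled by this sketch.
MD_IMAGE = re.compile(r"!\[([^\]]*)\]\((\S+?)\)")


def proxy_images(markdown, instance_host, proxy_path="/image_proxy"):
    """Point external Markdown images at the instance's own proxy."""

    def rewrite(match):
        alt, url = match.group(1), match.group(2)
        if urlsplit(url).hostname in (None, instance_host):
            return match.group(0)  # relative or already local: leave as-is
        # Percent-encode the whole original URL so it survives as one
        # query parameter.
        proxied = f"https://{instance_host}{proxy_path}?url={quote(url, safe='')}"
        return f"![{alt}]({proxied})"

    return MD_IMAGE.sub(rewrite, markdown)


if __name__ == "__main__":
    comment = "Nice post ![cat](https://tracker.example/pixel.png?u=1)"
    print(proxy_images(comment, "lemmy.example"))
```

The instance then fetches the image on the reader's behalf, so the third-party host sees the instance's IP and user agent rather than the reader's.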

CanOpener@sh.itjust.works · 1 year ago

Any thoughts on how fixable this is?

This shouldn’t be hard to fix. Lemmy needs to proxy images, there’s an open issue for this. Right now, I don’t use Lemmy outside of Tor Browser specifically because of issues like this, and the recent XSS vulnerability is making me even more concerned. Lemmy is a great project, but it needs work and probably a security audit.

Boozilla@lemmy.one · 1 year ago

Appreciate the links. Thanks!

Maven (famous)@lemmy.world · edited 1 year ago

This is unrelated to the post itself but I really hope Lemmy and the fediverse as a whole don’t start using terms like FUD that originated with Crypto. Crypto went exceptionally badly and was fraught with scams, and we should be doing as much as possible to distance ourselves from that.

Edit: I’ll take my losses here. The term is much older than its recent use in popular culture, despite my own lack of experience hearing it prior. I do however stand by wanting to distance ourselves from the cryptocurrency crowd as much as possible.

FutileRecipe@lemmy.world · 1 year ago

terms like FUD that originated with Crypto

Just because you first saw it in regards to crypto doesn’t mean that’s where it originated from.

https://wikipedia.org/wiki/Fear,_uncertainty,_and_doubt

Boozilla@lemmy.one · 1 year ago

Thanks. Not a fan of guilt by association of this type. The idea of FUD has been around for decades. It’s not inherently crypto or inherently anything. It’s just a useful acronym for a tactic some people use.

AfricanExpansionist · 1 year ago

FUD has been around way longer than Twitter

Maven (famous)@lemmy.world · 1 year ago

I meant specifically the shortened version FUD, which became popular because of Bitcoin and Twitter’s limited character count. Talking about fear, uncertainty, and doubt all together is definitely not a new idea.

Pat@kbin.run · 1 year ago

FUD has been a term since the 1970s. It has nothing to do with crypto

FutileRecipe@lemmy.world · 1 year ago

specifically the shortened version FUD

Here’s from the '70s:

“The search for self”. Clothes. New York, NY, USA: PRADS, Inc. 10 (14–24): 19. 1975-10-01. Retrieved 2011-06-10. […] One of the messages dealt with is FUD—the fear, uncertainty and doubt on the part of customer and sales person alike that stifles the approach and greeting. […]

sabreW4K3@lemmy.tf · 1 year ago

FUD was around from the Slashdot era.

abrasiveteapot@sh.itjust.works · edited 1 year ago

Lol. FUD predates the internet.

Someone else linked the Wikipedia entry. It was first coined as an acronym when IBM was protecting its mainframe from competition by Amdahl, but the phrase goes back to the 20s.

Edit

https://en.m.wikipedia.org/wiki/Fear,_uncertainty,_and_doubt

Gutless2615@ttrpg.network · 1 year ago

How young are you? FUD did not originate with Crypto. And crypto-means-cryptography-fight-me.

Doorbell0008@lemmy.world · 1 year ago

To be fair to FUD, it has its own much longer annoying history way before crypto.

scytale@lemmy.world · 1 year ago

Maybe Privacy Badger can get on this. I believe they block trackers like facebook by replacing widgets and other stuff that are embedded on pages. Not sure how they can do that for individual unknown trackers though.