Unsurprisingly, some folks on raddle and reddit seem to have a big problem with lemmy. A lot of it is pure FUD.
However, this appears to be a valid security concern:
https://raddle.me/f/fediverse/166674/lemmy-is-so-much-like-email-it-even-brought-back-spy-tracker
Any thoughts on how fixable this is?
Of course the general consensus on reddit is “lemmy devs are clueless and dangerous”. I’m pretty sure a lot of it is one guy with multiple alt accounts, tho. He has a Joe McCarthy attitude about lemmy because of one of the primary devs.
I’m confused. How is this any different from simply hosting a picture yourself and tracking all the IP addresses via HTTP fetch logs? Why is Lemmy itself being singled out here? Why do you need some CGI script?
I am not a cybersecurity expert. And these are good questions. The problem is certainly not unique to Lemmy.
However, my (limited) understanding of the opposing opinion is: 1. This is bad for privacy (marketers and other bad actors use these to track down your IP and other metadata), and 2. it should have been thought of before now, with some protections already put into place.
It is being discussed - here is a thread from yesterday:
https://kbin.social/m/support@lemmy.world/t/204434/Tracking-Lemmy-users-by-spy-tracker-pixels
And here is an ongoing discussion about a possible remedy:
https://github.com/LemmyNet/lemmy/pull/3550
But worth noting, unlike email the ‘view’ isn’t linked to an individual and an email address, and also broadcasting your IP address (yes, and some metadata) as you browse isn’t unusual. Every page you visit could be doing this, not just Lemmy.
Yes, ideally this should be fixed, but in my view it is also a bit of a storm in a teacup.
Thank you, this is exactly the kind of info I was looking for. I figured someone was on top of this and the reddit dipstick was just being overly dramatic as usual.
There just isn’t any way to prevent a web server from logging IPs if the admin chooses to do so.
Right, but I think the difference here is lemmy allows users to embed these in their markdown text.
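To make the point above concrete: whoever hosts an embedded image sees every viewer in their ordinary access log, no special script required. Here is a rough sketch of how little it takes to count distinct viewers per image URL. The log lines are made up (documentation-reserved IP ranges), the `viewers_per_path` name is mine, and I’m assuming the server writes Common Log Format.

```python
import re
from collections import defaultdict

# Common Log Format prefix: client IP, identity, user, [timestamp], "METHOD path ..."
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (\S+)')


def viewers_per_path(log_lines):
    """Map each requested path to the set of client IPs that fetched it."""
    viewers = defaultdict(set)
    for line in log_lines:
        match = LOG_RE.match(line)
        if match:
            ip, path = match.groups()
            viewers[path].add(ip)
    return dict(viewers)


# Made-up sample lines; the IPs come from documentation-reserved ranges.
sample = [
    '203.0.113.7 - - [10/Jul/2023:12:00:01 +0000] "GET /pixel.png?post=42 HTTP/1.1" 200 43',
    '198.51.100.23 - - [10/Jul/2023:12:00:05 +0000] "GET /pixel.png?post=42 HTTP/1.1" 200 43',
    '203.0.113.7 - - [10/Jul/2023:12:03:10 +0000] "GET /pixel.png?post=42 HTTP/1.1" 200 43',
]
print(viewers_per_path(sample))
```

The only Lemmy-specific part is that any commenter can put such a URL in front of every reader of a thread.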
Raddle user learns how the Internet works 🤯
In all seriousness though, although this is a concern, with email in particular the solution most people choose is to just disable images, so it isn’t really a sincere comparison IMO.
We could maybe mitigate this with…
- Proxying & caching - Instance would cache a copy of the commented image and serve it from there, blocking the IP of the user from being exposed. This could introduce some additional latency and fill up server storage faster
- CSP Header & Local caching - Client could block the name of the instance from being transferred, and also cache a copy of the image locally. This doesn’t protect the user’s IP address in any way, but would hinder the ability to count how many times a particular IP has viewed/accessed a post
- Shared Lemmy image proxies - Image requests are proxied through a randomly selected Lemmy image proxy. This would ‘hide’ the origin IP to all but the volunteer proxy provider. I’d personally be willing to host a few of these if this ever became a thing
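For the first option, here is a minimal sketch of what the rewriting step could look like: remote image URLs in a comment’s markdown get pointed at a proxy endpoint on the instance, so the browser only ever talks to the instance. The `/image_proxy?url=` route and the function name are hypothetical, not actual Lemmy routes, and the regex only handles the simple `![alt](url)` form.

```python
import re
from urllib.parse import quote, urlsplit

# Hypothetical proxy endpoint on the instance; not an actual Lemmy route.
PROXY_PATH = "/image_proxy?url="

# Matches markdown images of the form ![alt](url) or ![alt](url "title").
IMAGE_RE = re.compile(r'!\[([^\]]*)\]\((\S+?)(\s+"[^"]*")?\)')


def proxy_markdown_images(markdown: str, instance_origin: str) -> str:
    """Rewrite remote image URLs so the browser only contacts the instance."""
    instance_host = urlsplit(instance_origin).netloc

    def rewrite(match: re.Match) -> str:
        alt, url, title = match.group(1), match.group(2), match.group(3) or ""
        parts = urlsplit(url)
        # Leave relative URLs and images already hosted on the instance alone.
        if parts.scheme not in ("http", "https") or parts.netloc == instance_host:
            return match.group(0)
        proxied = instance_origin + PROXY_PATH + quote(url, safe="")
        return f"![{alt}]({proxied}{title})"

    return IMAGE_RE.sub(rewrite, markdown)


print(proxy_markdown_images("![x](https://evil.example/p.png)", "https://lemmy.example"))
```

The instance would still need to actually fetch and cache the image behind that endpoint, which is where the storage and latency costs mentioned above come in.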
Why are people pretending this isn’t an issue??? Of course it is lol.
Luckily the fix is also easy: an image proxy server. Mail clients do this already.
It exposes the bigger problem with Lemmy: lack of auditing.
Nah, we’re auditing, just live.
For better or worse, security is in the community’s hands. But that’s why we are here in the first place.
Can someone with more knowledge of the Lemmy protocol/API shed some light on this? The way the linked post is written, it seems like some random angry guy just hates lemmy for whatever reason.
To me it seems like a complete bs argument. As far as I can tell this tactic is possible with every service where users can provide content. Of course I can link to a site that reads users’ data. There’s basically no preventing this unless the (lemmy) clients provide their own modified browser that masks the user’s IP and other metadata.
You actually can prevent this easily with CSP (Content Security Policy). That header tells your browser which addresses it is allowed to load additional data from when visiting your site. It is an important tool to prevent cross-site scripting attacks; your browser should not load data from random sources when it is on your site.
Of course you would have to funnel all inline images through a site-local proxy that the browser is allowed to load data from.
This has not only security implications, but also GDPR implications. Some jurisdictions consider IP addresses personal data. Sending them to e.g. the US without user consent would be a violation. I know it is stupid to consider IP addresses as personal data and it is stupid to consider a browser loading data as sending that personal data somewhere on the site’s behalf. But there is a reason why a lot of websites, for example, only embed tweets after you explicitly allow it.
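To make the CSP idea concrete, here is a minimal sketch of the header an instance might send once images go through a site-local proxy. `Content-Security-Policy`, `default-src`, `img-src`, `media-src` and `'self'` are standard CSP; the function name and the exact choice of directives are my own assumptions, not what Lemmy actually ships.

```python
from typing import Iterable, Tuple


def build_csp_header(extra_img_sources: Iterable[str] = ()) -> Tuple[str, str]:
    """Return a (name, value) header pair restricting images to the instance itself."""
    # 'self' means the browser may only load from the page's own origin.
    img_sources = ["'self'", *extra_img_sources]
    directives = [
        "default-src 'self'",
        "img-src " + " ".join(img_sources),
        "media-src 'self'",
    ]
    return ("Content-Security-Policy", "; ".join(directives))


name, value = build_csp_header()
print(f"{name}: {value}")
```

With a policy like this, the browser refuses to fetch an image embedded from a third-party host, so a tracking pixel in a comment never gets requested from the viewer’s machine.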
When it comes to posting on lemmy I’d also consider bringing up that old bromide: don’t post anything you wouldn’t want your mother to see.
At least for now, anyway.
random angry guy just hates lemmy for whatever reason
There is definitely some of that at play here. I am hoping some smarter cybersec folks without the anti-lemmy-rage-bias can weigh in on it.
I think when you link images off-site on Reddit, Reddit still caches a preview for it and serves that to the user; the user will actually have to click a link to go off the platform into the unknown. If we do embeds and such here, they’re loaded from off site directly without user interaction.
Ergo your browser makes a request to a random potentially dangerous server, and there isn’t much the average user can do to prevent that.
This is a valid privacy issue, and other fediverse projects like Mastodon already solve this. The problem is that by embedding an image, you can tell the client to make a network request to your server, revealing information such as your IP address and browser. The solution is to proxy media through your instance, which is presumably trusted. This hides your IP address and browser information. And as someone else mentioned here, a Content-Security-Policy can be used to ensure this attack isn’t possible in a browser.
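A rough sketch of the "hides your browser information" half, assuming the instance proxy builds its own outgoing request rather than forwarding the viewer’s headers verbatim. The header list and the generic User-Agent string are illustrative assumptions, not what Mastodon or Lemmy actually do.

```python
from typing import Dict

# Headers that identify the viewer and should not reach the remote image host.
# This list is an illustrative assumption, not an exhaustive one.
IDENTIFYING_HEADERS = {
    "cookie",
    "authorization",
    "referer",
    "user-agent",
    "accept-language",
    "x-forwarded-for",
    "forwarded",
}


def upstream_headers(client_headers: Dict[str, str]) -> Dict[str, str]:
    """Build the headers the proxy sends upstream, dropping anything identifying."""
    forwarded = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in IDENTIFYING_HEADERS
    }
    # Hypothetical generic value so every viewer looks the same to the remote host.
    forwarded["User-Agent"] = "instance-image-proxy"
    return forwarded


print(upstream_headers({"Accept": "image/*", "Cookie": "jwt=abc", "User-Agent": "Firefox"}))
```

The IP address is hidden simply because the request originates from the instance; the remote host only ever sees the proxy.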
Any thoughts on how fixable this is?
This shouldn’t be hard to fix. Lemmy needs to proxy images, there’s an open issue for this. Right now, I don’t use Lemmy outside of Tor Browser specifically because of issues like this, and the recent XSS vulnerability is making me even more concerned. Lemmy is a great project, but it needs work and probably a security audit.
Appreciate the links. Thanks!
This is unrelated to the post itself but I really hope Lemmy and the fediverse as a whole don’t start using terms like FUD that originated with Crypto. Crypto went exceptionally badly and was fraught with scams and we should be doing as much as possible to distance ourselves from that.
Edit: I’ll take my losses here. The term is much older than its recent use in popular culture despite my own lack of experience hearing it prior. I do however stand by wanting to distance from the cryptocurrency crowd as much as possible.
terms like FUD that originated with Crypto
Just because you first saw it in regards to crypto doesn’t mean that’s where it originated from.
Thanks. Not a fan of guilt by association of this type. The idea of FUD has been around for decades. It’s not inherently crypto or inherently anything. It’s just a useful acronym for a tactic some people use.
FUD has been around way longer than Twitter
I meant specifically the shortened version FUD, which became popular because of Bitcoin and Twitter’s limited character count. Talking about fear, uncertainty and doubt all together is definitely not a new idea.
FUD has been a term since the 1970s. It has nothing to do with crypto
specifically the shortened version FUD
Here’s from the '70s:
“The search for self”. Clothes. New York, NY, USA: PRADS, Inc. 10 (14–24): 19. 1975-10-01. Retrieved 2011-06-10. […] One of the messages dealt with is FUD—the fear, uncertainty and doubt on the part of customer and sales person alike that stifles the approach and greeting. […]
FUD was around from the Slashdot era.
Lol. FUD predates the internet.
Someone else linked the Wikipedia entry. First coined as an acronym when IBM was protecting its mainframe from competition by Amdahl, but the phrase goes back to the 20s.
Edit
https://en.m.wikipedia.org/wiki/Fear,_uncertainty,_and_doubt
How young are you? FUD did not originate with Crypto. And crypto-means-cryptography-fight-me.
To be fair to FUD, it has its own much longer annoying history way before crypto.
Maybe Privacy Badger can get on this. I believe they block trackers like facebook by replacing widgets and other stuff that are embedded on pages. Not sure how they can do that for individual unknown trackers though.