Hi,
I’m interested in seeing a list of blocked words. I tried to find info about it, but my search wasn’t successful. I think it would be good to know what kind of words are not allowed, so that I can write in a way that means I don’t have to edit and rephrase my posts after writing them.
Thanks in advance!
The code is open source. If you have a look at it, there’s a nasty regular expression in it that matches in unexpected ways. E.g. the word t-w-a-t matches if you use the word sal-t-w-a-t-er. They may have fixed that word match by now but you get the idea. Someone who uses the word salt-w-a-ter would have no idea that that’s the offending word.
It’s a poor design for several reasons:
- It’s a regex with false positives
- It’s not coded in a way that makes it feasible to present in a readable way to normies in a policy statement
- The user is not told what word the site thinks is offensive
- It’s hard-coded, which means if someone else runs a Lemmy instance, the code is pre-packaged with policy. It’s a bad idea. Should be soft-coded in the config. It shouldn’t be bundled with the code.
- And look how easy it is to circumvent… I just had to add some dashes.
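To make the false-positive and circumvention points concrete, here is a minimal Python sketch. It is not Lemmy’s actual code (which is written in Rust); the harmless placeholder word `cat` and the names `naive`, `bounded` and `is_flagged` are made up for illustration.

```python
import re

# Placeholder blocklist; "cat" stands in for a blocked word (hypothetical).
BLOCKED = ["cat"]
ALTERNATIVES = "|".join(re.escape(word) for word in BLOCKED)

# Naive pattern: matches the blocked word anywhere, even inside other words.
naive = re.compile(ALTERNATIVES, re.IGNORECASE)

# Stricter pattern: \b requires a word boundary on both sides of the match.
bounded = re.compile(r"\b(?:" + ALTERNATIVES + r")\b", re.IGNORECASE)


def is_flagged(pattern, text):
    """Return True if the pattern finds a blocked word in text."""
    return pattern.search(text) is not None


# "concatenate" contains "cat", so the naive pattern flags an innocent word.
print(is_flagged(naive, "please concatenate the files"))
# The word-boundary version does not flag it.
print(is_flagged(bounded, "please concatenate the files"))
# Adding dashes slips past both patterns.
print(is_flagged(naive, "my c-a-t"), is_flagged(bounded, "my c-a-t"))
```

Word boundaries reduce the false positives, but neither pattern does anything about the dash trick.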
Thanks for your answer! I tried searching GitHub, but that did not turn up anything. I have now found it in this file: https://github.com/LemmyNet/lemmy/blob/master/server/lemmy_utils/src/lib.rs at line 259.
I absolutely agree on the hardcoded vs. config part. It is relatively young software, and I’m sure they’ll make it configurable. I’ve read somewhere that the devs won’t do that, but I guess that’s just ill-spirited fake news.
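A rough sketch of what “soft-coded in the config” could look like, in Python for brevity. This is an assumption about one possible design, not how Lemmy does or will do it; the `blocked_words` key and the `load_filter` helper are hypothetical.

```python
import json
import re


def load_filter(config_text):
    """Build a filter regex from a JSON config; return None if the list is empty."""
    words = json.loads(config_text).get("blocked_words", [])
    if not words:
        return None  # the instance admin chose to run without a filter
    alternatives = "|".join(re.escape(word) for word in words)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


# Hypothetical config an instance admin might edit; the key name is made up.
config = '{"blocked_words": ["cat", "dog"]}'
pattern = load_filter(config)

print(pattern.search("A Dog barked") is not None)  # listed word, any casing
print(pattern.search("dogma") is not None)         # only part of a longer word
print(load_filter('{"blocked_words": []}'))        # empty list disables the filter
```

With this shape the policy lives in a file each admin controls, and the same list could be published as a readable policy statement.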
Also, yeah. It is easily circumventable. For example: Saltwater. I just made one of the letters cursive.
Also, yeah. It is easily circumventable.
Yes, easy to circumvent for bigots who actually use those words (b/c they likely know what word will be triggered and they probably have the energy to game it), yet it’s difficult for legit users who might unwittingly write salt-w-a-ter or k-i-k-epa.
At least now the Lemmy server simply does a substitution. They used to block the ability to submit the article and you’d have to throw away your work. Substitution is still a problem though b/c legit users don’t always notice part of a word getting replaced with “removed” and then readers have to guess what word the author intended.
yet it’s difficult to circumvent for legit users
It is easy for everyone who knows how to throw off the slur filter.
k-i-k-epa
Huh… what’s wrong with that word? I had no idea that this is a slur anywhere. It just seems to be a kind of clothing.
At least now they simply do a substitution. They used to block the ability to submit the article and you’d have to throw away your work.
That was way better. That way, users were made aware that there is a problem, and it would be even better if the server replied by stating exactly which words were the offending ones. That harms no one, and you can just start rephrasing what you were saying instead of not even realizing what happened.
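The two behaviours being compared here can be sketched side by side in Python. This is only an illustration under assumptions: `cat` and `dog` are placeholder words, `check_post` and `censor` are hypothetical helpers, and `*removed*` stands in for whatever replacement string the server uses.

```python
import re

# Placeholder pattern; "cat" and "dog" stand in for blocked words (hypothetical).
PATTERN = re.compile(r"\b(?:cat|dog)\b", re.IGNORECASE)


def check_post(text):
    """Return the blocked words found in text, lowercased, in order, no duplicates."""
    found = []
    for match in PATTERN.finditer(text):
        word = match.group(0).lower()
        if word not in found:
            found.append(word)
    return found


def censor(text):
    """Silent substitution, for comparison: the author gets no feedback."""
    return PATTERN.sub("*removed*", text)


post = "My cat chased the Dog and the other cat."

offending = check_post(post)
if offending:
    # Reject the post, but tell the author exactly what to rephrase.
    print("Post rejected, please rephrase:", ", ".join(offending))

# What readers would see under silent substitution instead.
print(censor(post))
```

Returning the list of matches lets the author fix the post right away, while the substituted version leaves readers guessing which words were meant.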
Huh… what’s wrong with that word? I had no idea that this is a slur anywhere. It just seems to be a kind of clothing.
Look at the first four letters. (I agree that this should be considered a bug in the filter)
Oh boy… I never heard about that one. This filter can drive you insane… you never know why it does what it does. It really should tell you what the offending part is.
Couldn’t someone just change the code in their Lemmy instance?
It is not fake news, it is an unofficial position.
I don’t really read this as “we’ll never make it configurable”. It would be an easy fork anyway.
How I see it: Slur filters will not change the underlying problem. They just make the current contemporary selection of words invisible. Hate doesn’t come from words. Everyone can easily say the same thing with other words or phrases. You’d have to constantly update the list, leading to the constant invention of new words and phrases. A rat race.
It would be better not to let simple words be one’s guide, but to actually listen/read and understand what is being said in its full context.