Unicode tricks in pull requests: Do review tools warn us? (semanticdiff.com)
Posted by DarkPlayer@lemmy.world to Programming@programming.dev · 1 year ago · 18 comments · cross-posted to: security
monk@lemmy.unboiled.info · 1 year ago: Homoglyphs? Invisible text? Bidirectional text? Just highlight every line that goes beyond ASCII with yellow warning colors and require someone to vet it. Maybe make localization data an exception.
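As a rough illustration of the suggestion above, this Python sketch flags every line that contains a character outside ASCII. The function name `non_ascii_lines` is hypothetical and not taken from any review tool, and the localization exception mentioned in the comment is not attempted here.

```python
def non_ascii_lines(text):
    """Return (line_number, line) pairs for lines with any non-ASCII character.

    Hypothetical helper for illustration only. Lines are split on "\n" rather
    than with str.splitlines(), because splitlines() also breaks on some
    non-ASCII separators (such as U+2028) and would hide them from the check.
    """
    flagged = []
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.isascii():
            flagged.append((number, line))
    return flagged


if __name__ == "__main__":
    # The second line uses a Cyrillic "а" (U+0430) in place of the Latin "a".
    sample = 'name = "admin"\nn\u0430me = "admin"\n'
    for number, line in non_ascii_lines(sample):
        print(f"line {number} needs vetting: {line!r}")
```

A check like this is deliberately blunt: it says nothing about why a line is suspicious, only that a human should look at it.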
cbarrick@lemmy.world · 1 year ago: This doesn’t work for code bases written in non-English languages. Especially East Asian languages. Any line containing an identifier that is also a word would be highlighted. More and more programming languages are supporting Unicode identifiers for this use case.
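One possible middle ground for code bases with non-English identifiers, sketched here under assumptions: instead of flagging all non-ASCII text, flag only characters in the Unicode general category Cf (format characters), which includes the bidirectional controls and the common zero-width characters. The function name `format_characters` is hypothetical. This would not catch homoglyphs, since those are ordinary letters, and not every invisible character is in Cf.

```python
import unicodedata


def format_characters(line):
    """Return (column, code_point, name) for each Cf character in the line.

    Hypothetical helper for illustration only. Letters from any script
    (for example CJK identifiers) are left alone.
    """
    hits = []
    for column, ch in enumerate(line, start=1):
        if unicodedata.category(ch) == "Cf":
            label = f"U+{ord(ch):04X}"
            hits.append((column, label, unicodedata.name(ch, "UNNAMED")))
    return hits


if __name__ == "__main__":
    # A Chinese identifier is not reported; a bidi override is.
    print(format_characters("变量 = 1"))
    print(format_characters("access = 'user\u202e'"))
```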
mrkite@programming.dev · 1 year ago: So it won’t work for 0.0001% of all GitHub projects.
sndrtj@feddit.nl · 1 year ago: I’d suggest having an occasional look at the “most popular repos” ranking. It’s about 50% Chinese. Super-interesting sometimes as it shows completely different tech trends.
cbarrick@lemmy.world · 1 year ago: I know right. It’s wild that an American company primarily doing business in the West would have a bias towards English.
monk@lemmy.unboiled.info · 1 year ago: Yeah, just don’t. Allowing people to code in anything other than English is a disservice, plain and simple. Inb4, I’m not being US-centric, Latin ain’t even my native alphabet.
Actual@programming.dev · 1 year ago: Very simple solution actually. Here I was thinking we’d need AI to solve it.
DudeDudenson@lemmings.world · 1 year ago: People would call that solution AI these days. If it has at least one if statement then they call it AI.
arthur@lemmy.zip · 1 year ago: Or the non-ASCII character itself.
Unattributed reply to DudeDudenson: We say we have AI to get VC funding.
Unattributed reply to arthur: Doesn’t work if it’s invisible.
Unattributed reply: What about a box around it?
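The box idea can be approximated in plain text. This is only a sketch: it replaces each character in Unicode general category Cf (format characters, which include zero-width and bidirectional control characters) with a bracketed code point label so that it becomes visible in a diff. The function name `reveal` and the `[U+XXXX]` label format are assumptions for illustration, not how any particular review tool renders such characters.

```python
import unicodedata


def reveal(line):
    """Replace each Cf character with a visible [U+XXXX] label.

    Hypothetical helper for illustration only. All other characters,
    including non-ASCII letters, pass through unchanged.
    """
    parts = []
    for ch in line:
        if unicodedata.category(ch) == "Cf":
            parts.append(f"[U+{ord(ch):04X}]")
        else:
            parts.append(ch)
    return "".join(parts)


if __name__ == "__main__":
    # A zero-width space hidden inside an identifier becomes visible.
    print(reveal("user\u200bname = 1"))
```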