⸻ Ban DHMO 🇦🇺 ⸻

  • 303 Posts
  • 1.05K Comments
Joined 1 year ago
Cake day: June 13th, 2023

  • > And the two sources you cite are overwhelmingly white male

    what sources??

    Edit: never mind, worked it out.

    Yeah AI companies are being stupid using Reddit, Twitter and StackExchange, but there’s no magic fix to get more diverse AI. Would they need to pay people of underrepresented groups to write training material? Is that racist? Why should they get paid when everyone else is doing it for free?

    I don’t think they were being intentionally racist when they made the dictionaries for auto-correct - the article says it only accounts for around 41% of English names - but most modern auto-correct systems, on phones at least, seem to add words to your dictionary if you use them a lot, and the built-in dictionaries have probably been limited by file size requirements in the past.


  • There are a few issues with this article, namely that auto-correct doesn’t use any AI as such (I believe; it’s a pretty old technology). I agree with the general argument that the name dictionaries of these auto-correct systems should be expanded to recognise more names, but I think labeling it “racist” is moronic. I also take issue with this weird dig:

    "The big problem is a lot of artificial intelligence scrapes the internet, whether it’s writing or music or blog sites or whatever, and so much of the stuff on the internet has been made by white men.

    "Effectively, AI is starting to mimic those white men, and you can see that AI on a number of platforms is increasingly becoming kind of racist and more sexist."

    I don’t think it’s “white men” that are the problem - it’s snarky redditors and Stack Exchange contributors. I might as well blame Italians for fascism or Muslims for 9/11.

    Like it or not, AI needs training data - and if you want a more diverse training set, get out there and make it for free like everyone else does. It’s a shame how sites like Twitter and Facebook have kind of killed personal blogs and promote shorter-form content, which is harder to find. It’s amazing how much of the useful stuff I find comes from random old blogs.