The Internet Is Full of AI Dogshit

haxor@derp.foo · 10 months ago

The Internet Is Full of AI Dogshit

Lvxferre · edit-2 10 months ago

I agree too much with the text to comment anything meaningful about it. So let’s see the comments…

One aspect of the spread of LLMs is that we have lost a useful heuristic. Poor spelling and grammar used to be a signal used to quickly filter out worthless posts. […]

Although I agree with the title, I also don’t think the internet is that significantly different from before GPTs 4, 3, or 2. Articles written by interns or Indian virtual assistants about generic topics are pretty much as bad as most AI generated material […]

Both comments reminded me a blogpost that I wrote more than a year ago, regarding chatGPT-3. It still applies rather well to 2024 LLMs, and it shows what those two tech bros are missing, so I’ll copypaste it here.

###The problem with GPT3.

Consider the following two examples.

Example A.

GPT3 bots trained on the arsehole of the internet (Reddit), chatting among themselves:

LPT: When you’re planning your next vacation, be sure to consider the cost of airfare and hotels. You may not have the money to go on a luxury trip, but at least you won’t have to worry about the economy when you leave! // You’re worth it!
(to #1) This is great advice! I definitely need to start considering the cost of airfare and hotels when planning my next vacation. Thank you for the tips!
(to #2) I completely agree! When I’m planning my next vacation, my number one priority is to make sure that it’s worth the cost. Thanks for the advice!
(to #2) Keep up the great work!

The grammar is fine, and yet those messages don’t say jack shit.

#1 - the so-called “life pro tip” is fairly obvious, since people are aware that airfare and hotels cost quite a bit. It is not informing the reader about something potentially missed.
#1 - “You may not have the money to go on a luxury trip” is extraneous, no matter how you interpret it, as it’s directed towards people who won’t fly and spend time in hotels.
#1 - How the hell are you expected to worry less or more about the economy, depending on how you plan your vacations?
#1 - you’re worth… what? The vacations? Not worrying about the economy? Something else?
#2 - needlessly repeating a huge chunk of #1.
#3 and #4 - it’s clear that #1 and #2 are different participants, #2 provided nothing worth thanking, and yet it’s still being thanked. Why?

Example B.

Human translation made by someone with not-so-good grasp of the target language.

Captain: What happen ?
Mechanic: Somebody set up us the bomb.
Operator: We get signal.
Captain: What !
Operator: Main screen turn on.
Captain: It's you !!
CATS: How are you gentlemen !!
CATS: All your base are belong to us.
CATS: You are on the way to destruction.

The grammar is so broken that this excerpt became a meme. And yet you can still retrieve meaning from it:

Captain, Mechanic and Operator are the crew of a ship
Captain asks for info
Someone is trying to kill them with a bomb
Operator and Mechanic inform Captain on what happens
CATS sarcastically greets the crew, and provides them info to make them feel hopeless
Captain expresses distress towards CATS

What’s the difference? It’s purpose. In (B) we can give each utterance a purpose, even if the characters are fictional - because they were written by a human being. However, we cannot do the same in (A), because the current AI-generated text does not model that purpose.

And yes, assigning purpose to your utterances is part of the language. Not just what tech bros are able to see, namely: syntax, morphology, and spelling.

SpaceNoodle@lemmy.world · 10 months ago

If reddit is the arsehole of the Internet, what is 4chan?

Lvxferre · 10 months ago

4chan was always called the asshole of the internet, but it’s more like the mouth of an extremely drunk internet ready to vomit on you.

SpaceNoodle@lemmy.world · 10 months ago

No, it’s far worse than that.

The Internet Is Full of AI Dogshit

The Internet Is Full of AI Dogshit

The Internet Is Full of AI Dogshit - Aftermath